Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Transforming the unit of a messy height variable to meters

    Hi All,

    I have two variables that provide me with information on height of a specific individual.
    The first variable Q69_1_1 tells me the unit of the measurement, while the 2nd variable Q69_1_1_TEXT gives me the measurement. I want to transform everything in the latter variable into meters (this is where it gets messy). Q69_1_1_TEXT contains words, and Q69_1_1 is not always correct with the unit of measurement (e.g. first row indicated as inches while it is a combination of feet and inches, similarly third row is indicated as feet but it is a combination of feet and inches). I have about 600 observations, I'd rather not to have to transform them one by one but if that is my only option I will do it. I think the example below contains a fair representation of everything in my file.

    What would you do?
    Thank you for your time and help

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double Q69_1_1 strL Q69_1_1_TEXT
    1 "5'2"             
    2 "5"               
    2 "5 feet, 6 inches"
    2 "5'3"             
    1 "73"              
    3 "157"             
    2 "5.2"             
    2 "5'4"             
    3 "164"             
    2 `"5'3""'          
    3 "165"             
    3 "167"             
    3 "167"             
    4 "1,74"            
    1 "65"              
    2 "5.5"             
    . `" 5'10""'        
    1 "5'4"             
    4 "1.57"            
    2 "5'2"             
    1 "56"              
    1 "61"              
    2 `"5'2""'          
    2 "4'11"            
    2 "5'4"             
    1 "58"              
    end
    label values Q69_1_1 Q69_1_1
    label def Q69_1_1 1 "Inches", modify
    label def Q69_1_1 2 "Feet", modify
    label def Q69_1_1 3 "Centimetres", modify
    label def Q69_1_1 4 "Metres", modify

  • #2
    I would use moss (from SSC) to extract the two sets of numbers and then convert by type. This should get you started:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double Q69_1_1 strL Q69_1_1_TEXT
    1 "5'2"             
    2 "5"               
    2 "5 feet, 6 inches"
    2 "5'3"             
    1 "73"              
    3 "157"             
    2 "5.2"             
    2 "5'4"             
    3 "164"             
    2 `"5'3""'          
    3 "165"             
    3 "167"             
    3 "167"             
    4 "1,74"            
    1 "65"              
    2 "5.5"             
    . `" 5'10""'        
    1 "5'4"             
    4 "1.57"            
    2 "5'2"             
    1 "56"              
    1 "61"              
    2 `"5'2""'          
    2 "4'11"            
    2 "5'4"             
    1 "58"              
    end
    label values Q69_1_1 Q69_1_1
    label def Q69_1_1 1 "Inches", modify
    label def Q69_1_1 2 "Feet", modify
    label def Q69_1_1 3 "Centimetres", modify
    label def Q69_1_1 4 "Metres", modify
    
    * extract numbers using moss (from SSC)
    moss Q69_1_1_TEXT, match("([0-9]+)") regex
    drop _pos*
    destring(_m*), replace
    replace _match2 = 0 if mi(_match2)
    
    * convert to meters
    gen meters = _match1 + _match2/100 if Q69_1_1 == 4
    replace meters = _match1 / 100 if Q69_1_1 == 3
    replace meters = (_match1 * 12 + _match2) * 2.54 / 100 if Q69_1_1 == 2
    replace meters = _match1 * 2.54 / 100 if Q69_1_1 == 1
    
    * raw results
    sort Q69_1_1 Q69_1_1_TEXT
    list, sepby(Q69_1_1)
    
    * problem cases
    list if !inrange(meters,1,2)

    Comment


    • #3
      thank you Robert, this is very helpful!
      Patrick

      Comment

      Working...
      X