Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Converting String height variable to numeric inches

    HI All, my first post using dataex, so here it goes. Just trying to convert a string height variable to byte, in inches. You can see below how it's formatted. Any clues on how to approach this problem? Thanks so much, and I apologize if I put the sample data in the wrong spot. I will do better next time!
    Ben

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 height
    "5ft. 10.7in."
    "5ft. 11.2in."
    "5ft. 11.8in."
    "5ft. 10.1in."
    ""            
    "5ft. 07.3in."
    "5ft. 07.8in."
    "6ft. 02.4in."
    "5ft. 09.7in."
    "6ft. 00.3in."
    "5ft. 08.7in."
    ""            
    ""            
    "5ft. 11.1in."
    "5ft. 11.7in."
    "6ft. 02.8in."
    "6ft. 01.2in."
    ""            
    "5ft. 09.8in."
    "5ft. 08.4in."
    end

  • #2
    try this:
    Code:
    gen htinch=(real(substr(height,1,1))*12) + (real(substr(height,6,4)))
    note, however, that this relies on all the data looking like your example data, including no typo's

    note also that this is float, not bye, as you have fractions of inches

    Comment


    • #3
      Wow, that worked like a charm! Thanks so much! Yes, good point about the float. Man, can't thank you enough. I don't completely understand how this works, but will dissect this command.

      Thanks again for the prompt and expert advice.

      Ben

      Comment


      • #4
        Also note that this will only work as long as your height does not exceed 9 feet and 11.9 inches. Otherwise the first substring will not return the correct number of feet.

        This may not be as pretty, as the approach suggested by Mr. Goldstein, but it would work if you have to deal with the problem that I pointed out (I added a 21st observation that exceeds the maximum height pointed out to illustrate):

        Code:
        clear
        input str12 height
        "5ft. 10.7in."
        "5ft. 11.2in."
        "5ft. 11.8in."
        "5ft. 10.1in."
        ""            
        "5ft. 07.3in."
        "5ft. 07.8in."
        "6ft. 02.4in."
        "5ft. 09.7in."
        "6ft. 00.3in."
        "5ft. 08.7in."
        ""            
        ""            
        "5ft. 11.1in."
        "5ft. 11.7in."
        "6ft. 02.8in."
        "6ft. 01.2in."
        ""            
        "5ft. 09.8in."
        "5ft. 08.4in."
        "11ft. 10.1in."
        end
        
        split height, parse(" ")
        rename height1 ft
        rename height2 inches
        
        gen htinch = .
        replace htinch = real(substr(ft,1,1)) * 12 + real(substr(inches, 1,4)) if strlen(ft) == 4
        replace htinch = real(substr(ft,1,2)) * 12 + real(substr(inches, 1,4)) if strlen(ft) == 5
        
        drop ft inches

        This logic can easily be extended to 3+ digits, too!

        Comment


        • #5
          This of course tempted me into creating a solution using Stata's Unicode regular expression functions.
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str15 height
          "5ft. 10.7in."
          "10ft. 11.2in."
          "0ft. 66in."
          "66in."
          "5.5ft."
          "5.5ft. 1in."
          "0ft. 0in."
          ""    
          "whatever"        
          end
          cls
          generate htinch = real(ustrregexs(1))*12 +real(ustrregexs(2)) if ustrregexm(height,"^([\d\.]+)ft. ([\d\.]+)in.$")
          replace htinch  = real(ustrregexs(1))                         if ustrregexm(height,"^([\d\.]+)in.$")
          replace htinch  = real(ustrregexs(1))*12                      if ustrregexm(height,"^([\d\.]+)ft.$")
          list, clean
          Code:
          . list, clean
          
                        height   htinch  
            1.    5ft. 10.7in.     70.7  
            2.   10ft. 11.2in.    131.2  
            3.      0ft. 66in.       66  
            4.           66in.       66  
            5.          5.5ft.       66  
            6.     5.5ft. 1in.       67  
            7.       0ft. 0in.        0  
            8.                        .  
            9.        whatever        .

          Comment


          • #6
            This is great, Gents! Thanks a bunch for this. Thankfully, there's no one in my sample 10 feet tall, but this code will be super helpful going forward. Can't thank you enough for your time/expertise.

            Comment

            Working...
            X