  • Is there an elegant way to assign midpoints to data?

    Hi Statalist,

    I need to assign midpoints to each band of the household's gross total income. I understand how midpoints work: (max - min)/2. However, I am not sure whether doing it manually is the most elegant solution. Is there a command in Stata that can do it for me?

    Here is an excerpt of the data:

                  DV: Gross |
           household income |      Freq.     Percent        Cum.
    ------------------------+-----------------------------------
             Less than £520 |         17        0.14        0.14
       £520 less than £1040 |         17        0.14        0.29
      £1040 less than £1560 |         26        0.22        0.51
      £1560 less than £2080 |         47        0.40        0.91
      £2080 less than £2600 |         61        0.52        1.43
      £2600 Less than £3120 |         40        0.34        1.77
      £3120 less than £3640 |         41        0.35        2.12
      £3640 less than £4160 |         61        0.52        2.64
      £4160 less than £4680 |         67        0.57        3.21
      £4680 less than £5200 |         98        0.84        4.05
      £5200 Less than £6240 |        326        2.78        6.83
      £6240 less than £7280 |        340        2.90        9.73
      £7280 less than £8320 |        379        3.23       12.96
      £8320 less than £9360 |        355        3.03       15.98
     £9360 less than £10400 |        455        3.88       19.86
    £10400 less than £11440 |        452        3.85       23.71
    £11440 less than £12480 |        400        3.41       27.12
    £12480 less than £13520 |        442        3.77       30.89
    £13520 less than £14560 |        342        2.92       33.81
    £14560 less than £15600 |        408        3.48       37.29
    £15600 less than £16640 |        328        2.80       40.08
    £16640 less than £17680 |        283        2.41       42.49
    £17680 less than £18720 |        271        2.31       44.80
    £18720 less than £19760 |        319        2.72       47.52
    £19760 less than £20800 |        365        3.11       50.64
    £20800 less than £23400 |        617        5.26       55.89
    £23400 less than £26000 |        616        5.25       61.15
    £26000 less than £28600 |        513        4.37       65.52
    £28600 less than £31200 |        523        4.46       69.98
    £31200 less than £33800 |        559        4.77       74.74
    £33800 less than £36400 |        466        3.97       78.71
            £36,400 or more |      2,497       21.29      100.00
    ------------------------+-----------------------------------
                      Total |     11,731      100.00
    Last edited by Sofiya Volvakova; 03 Jun 2023, 09:22.

  • #2
    If possible, can you provide us with some data via -dataex-?



    • #3
      Originally posted by Tiago Pereira
      If possible, can you provide us with some data via -dataex-?
      Unfortunately, I cannot share the dataset, and copying and pasting the variable into a new dataset converted it from string to numeric. I don't need the exact Stata code. I just want to know what the appropriate Stata syntax would be, based on the data excerpt I provided above. Thanks, Tiago.



      • #4
        Tiago Pereira is right that we need a data example. It appears that your start-points and end-points are in the value labels, so you can use decode combined with other string functions and generate the wanted variable.



        • #5
          Originally posted by Andrew Musau
          Tiago Pereira is right that we need a data example. It appears that your start-points and end-points are in the value labels, so you can use decode combined with other string functions and generate the wanted variable.
          ----------------------- copy starting from the next line -----------------------
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input byte W1inc1est
           26
           32
           26
          -99
           20
           21
           26
           26
           32
          -92
          -92
           27
           15
           30
           15
           12
           32
           -1
           12
           17
           27
          -92
           31
           17
           16
           -1
           -1
           28
           14
           32
           31
           11
          -92
           17
           -1
           -1
           -1
           21
           -1
           28
           21
           12
           -1
           16
           -1
          -92
           32
           32
           14
           27
           11
           14
           23
           32
           14
          -92
          -92
          -92
           29
           26
           21
           -1
           26
           29
          -92
           16
            9
          -99
           16
           28
           32
           23
           17
           31
           28
           32
           -1
           32
           24
           17
           18
          -92
          -92
           31
           32
           32
           32
           31
          -92
           14
           18
           18
           20
           14
           19
           25
           16
           32
           -1
           26
          end
          label values W1inc1est W1inc1est
          label def W1inc1est -99 "MP not interviewed", modify
          label def W1inc1est -92 "Refused", modify
          label def W1inc1est -1 "Don't know", modify
          label def W1inc1est 9 "£4160 less than £4680", modify
          label def W1inc1est 11 "£5200 Less than £6240", modify
          label def W1inc1est 12 "£6240 less than £7280", modify
          label def W1inc1est 14 "£8320 less than £9360", modify
          label def W1inc1est 15 "£9360 less than £10400", modify
          label def W1inc1est 16 "£10400 less than £11440", modify
          label def W1inc1est 17 "£11440 less than £12480", modify
          label def W1inc1est 18 "£12480 less than £13520", modify
          label def W1inc1est 19 "£13520 less than £14560", modify
          label def W1inc1est 20 "£14560 less than £15600", modify
          label def W1inc1est 21 "£15600 less than £16640", modify
          label def W1inc1est 23 "£17680 less than £18720", modify
          label def W1inc1est 24 "£18720 less than £19760", modify
          label def W1inc1est 25 "£19760 less than £20800", modify
          label def W1inc1est 26 "£20800 less than £23400", modify
          label def W1inc1est 27 "£23400 less than £26000", modify
          label def W1inc1est 28 "£26000 less than £28600", modify
          label def W1inc1est 29 "£28600 less than £31200", modify
          label def W1inc1est 30 "£31200 less than £33800", modify
          label def W1inc1est 31 "£33800 less than £36400", modify
          label def W1inc1est 32 "£36,400 or more", modify
          ------------------ copy up to and including the previous line ------------------

          Listed 100 out of 15770 observations
          Use the count() option to list more



          • #6
            The best recipe for a midpoint is (max + min) / 2 as (max - min) / 2 is half the range, a different beast.
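
            A quick check on the second band in the table in #1, £520 to less than £1040, shows the difference: \((1040 + 520)/2 = 780\) lies inside the band, whereas \((1040 - 520)/2 = 260\) is half the band's width and falls below its lower limit.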



            • #7
              Originally posted by Nick Cox
              The best recipe for a midpoint is (max + min) / 2 as (max - min) / 2 is half the range, a different beast.
              Thank you for the correction, Nick! Is there a Stata command that derives a midpoint for those income bands? Or will I have to do it manually with the recode command, e.g. recode W1inc1est (1=250) (2=500), etc.?



              • #8
                At a quick glance your bins vary in width, so you will waste time trying to code them any way but directly.
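
                For the direct route, a minimal sketch with recode might look like this. It assumes the data example from #5 is in memory and covers only the three lowest codes whose labels appear there; the new variable name midpoint is made up for illustration, and every other band would still need its own rule.

                Code:
                * sketch only: midpoints worked out by hand from the value labels in #5
                * codes without a rule yet, including the negative nonresponse codes, become missing
                recode W1inc1est (9 = 4420) (11 = 5720) (12 = 6760) (else = .), generate(midpoint)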



                • #9
                  There are no mid-points for open-ended intervals such as \(amount \ge X\), so for those I just return the lower limit \(X\).

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input byte W1inc1est
                   26
                   32
                   26
                  -99
                   20
                   21
                   26
                   26
                   32
                  -92
                  -92
                   27
                   15
                   30
                   15
                   12
                   32
                   -1
                   12
                   17
                   27
                  -92
                   31
                   17
                   16
                   -1
                   -1
                   28
                   14
                   32
                   31
                   11
                  -92
                   17
                   -1
                   -1
                   -1
                   21
                   -1
                   28
                   21
                   12
                   -1
                   16
                   -1
                  -92
                   32
                   32
                   14
                   27
                   11
                   14
                   23
                   32
                   14
                  -92
                  -92
                  -92
                   29
                   26
                   21
                   -1
                   26
                   29
                  -92
                   16
                    9
                  -99
                   16
                   28
                   32
                   23
                   17
                   31
                   28
                   32
                   -1
                   32
                   24
                   17
                   18
                  -92
                  -92
                   31
                   32
                   32
                   32
                   31
                  -92
                   14
                   18
                   18
                   20
                   14
                   19
                   25
                   16
                   32
                   -1
                   26
                  end
                  label values W1inc1est W1inc1est
                  label def W1inc1est -99 "MP not interviewed", modify
                  label def W1inc1est -92 "Refused", modify
                  label def W1inc1est -1 "Don't know", modify
                  label def W1inc1est 9 "£4160 less than £4680", modify
                  label def W1inc1est 11 "£5200 Less than £6240", modify
                  label def W1inc1est 12 "£6240 less than £7280", modify
                  label def W1inc1est 14 "£8320 less than £9360", modify
                  label def W1inc1est 15 "£9360 less than £10400", modify
                  label def W1inc1est 16 "£10400 less than £11440", modify
                  label def W1inc1est 17 "£11440 less than £12480", modify
                  label def W1inc1est 18 "£12480 less than £13520", modify
                  label def W1inc1est 19 "£13520 less than £14560", modify
                  label def W1inc1est 20 "£14560 less than £15600", modify
                  label def W1inc1est 21 "£15600 less than £16640", modify
                  label def W1inc1est 23 "£17680 less than £18720", modify
                  label def W1inc1est 24 "£18720 less than £19760", modify
                  label def W1inc1est 25 "£19760 less than £20800", modify
                  label def W1inc1est 26 "£20800 less than £23400", modify
                  label def W1inc1est 27 "£23400 less than £26000", modify
                  label def W1inc1est 28 "£26000 less than £28600", modify
                  label def W1inc1est 29 "£28600 less than £31200", modify
                  label def W1inc1est 30 "£31200 less than £33800", modify
                  label def W1inc1est 31 "£33800 less than £36400", modify
                  label def W1inc1est 32 "£36,400 or more", modify
                  
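                  * copy the value labels into a string variable
                  decode W1inc1est, gen(newvar)
                  * blank out everything except digits and commas, then collapse repeated spaces
                  gen tosplit= trim(itrim(ustrregexra(newvar, "[^\d,]", " ")))
                  * split on spaces into limit1 and limit2, ignoring commas and converting to numeric
                  split tosplit, g(limit) ignore(",") destring
                  * midpoint where both limits exist, otherwise the single limit found
                  gen wanted= cond(missing(limit2), limit1, (limit1+limit2)/2)
                  drop newvar tosplit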
                  Res.:

                  Code:
                  . l, sep(0)
                  
                       +----------------------------------------------------+
                       |               W1inc1est   limit1   limit2   wanted |
                       |----------------------------------------------------|
                    1. | £20800 less than £23400    20800    23400    22100 |
                    2. |         £36,400 or more    36400        .    36400 |
                    3. | £20800 less than £23400    20800    23400    22100 |
                    4. |      MP not interviewed        .        .        . |
                    5. | £14560 less than £15600    14560    15600    15080 |
                    6. | £15600 less than £16640    15600    16640    16120 |
                    7. | £20800 less than £23400    20800    23400    22100 |
                    8. | £20800 less than £23400    20800    23400    22100 |
                    9. |         £36,400 or more    36400        .    36400 |
                   10. |                 Refused        .        .        . |
                   11. |                 Refused        .        .        . |
                   12. | £23400 less than £26000    23400    26000    24700 |
                   13. |  £9360 less than £10400     9360    10400     9880 |
                   14. | £31200 less than £33800    31200    33800    32500 |
                   15. |  £9360 less than £10400     9360    10400     9880 |
                   16. |   £6240 less than £7280     6240     7280     6760 |
                   17. |         £36,400 or more    36400        .    36400 |
                   18. |              Don't know        .        .        . |
                   19. |   £6240 less than £7280     6240     7280     6760 |
                   20. | £11440 less than £12480    11440    12480    11960 |
                   21. | £23400 less than £26000    23400    26000    24700 |
                   22. |                 Refused        .        .        . |
                   23. | £33800 less than £36400    33800    36400    35100 |
                   24. | £11440 less than £12480    11440    12480    11960 |
                   25. | £10400 less than £11440    10400    11440    10920 |
                   26. |              Don't know        .        .        . |
                   27. |              Don't know        .        .        . |
                   28. | £26000 less than £28600    26000    28600    27300 |
                   29. |   £8320 less than £9360     8320     9360     8840 |
                   30. |         £36,400 or more    36400        .    36400 |
                   31. | £33800 less than £36400    33800    36400    35100 |
                   32. |   £5200 Less than £6240     5200     6240     5720 |
                   33. |                 Refused        .        .        . |
                   34. | £11440 less than £12480    11440    12480    11960 |
                   35. |              Don't know        .        .        . |
                   36. |              Don't know        .        .        . |
                   37. |              Don't know        .        .        . |
                   38. | £15600 less than £16640    15600    16640    16120 |
                   39. |              Don't know        .        .        . |
                   40. | £26000 less than £28600    26000    28600    27300 |
                   41. | £15600 less than £16640    15600    16640    16120 |
                   42. |   £6240 less than £7280     6240     7280     6760 |
                   43. |              Don't know        .        .        . |
                   44. | £10400 less than £11440    10400    11440    10920 |
                   45. |              Don't know        .        .        . |
                   46. |                 Refused        .        .        . |
                   47. |         £36,400 or more    36400        .    36400 |
                   48. |         £36,400 or more    36400        .    36400 |
                   49. |   £8320 less than £9360     8320     9360     8840 |
                   50. | £23400 less than £26000    23400    26000    24700 |
                   51. |   £5200 Less than £6240     5200     6240     5720 |
                   52. |   £8320 less than £9360     8320     9360     8840 |
                   53. | £17680 less than £18720    17680    18720    18200 |
                   54. |         £36,400 or more    36400        .    36400 |
                   55. |   £8320 less than £9360     8320     9360     8840 |
                   56. |                 Refused        .        .        . |
                   57. |                 Refused        .        .        . |
                   58. |                 Refused        .        .        . |
                   59. | £28600 less than £31200    28600    31200    29900 |
                   60. | £20800 less than £23400    20800    23400    22100 |
                   61. | £15600 less than £16640    15600    16640    16120 |
                   62. |              Don't know        .        .        . |
                   63. | £20800 less than £23400    20800    23400    22100 |
                   64. | £28600 less than £31200    28600    31200    29900 |
                   65. |                 Refused        .        .        . |
                   66. | £10400 less than £11440    10400    11440    10920 |
                   67. |   £4160 less than £4680     4160     4680     4420 |
                   68. |      MP not interviewed        .        .        . |
                   69. | £10400 less than £11440    10400    11440    10920 |
                   70. | £26000 less than £28600    26000    28600    27300 |
                   71. |         £36,400 or more    36400        .    36400 |
                   72. | £17680 less than £18720    17680    18720    18200 |
                   73. | £11440 less than £12480    11440    12480    11960 |
                   74. | £33800 less than £36400    33800    36400    35100 |
                   75. | £26000 less than £28600    26000    28600    27300 |
                   76. |         £36,400 or more    36400        .    36400 |
                   77. |              Don't know        .        .        . |
                   78. |         £36,400 or more    36400        .    36400 |
                   79. | £18720 less than £19760    18720    19760    19240 |
                   80. | £11440 less than £12480    11440    12480    11960 |
                   81. | £12480 less than £13520    12480    13520    13000 |
                   82. |                 Refused        .        .        . |
                   83. |                 Refused        .        .        . |
                   84. | £33800 less than £36400    33800    36400    35100 |
                   85. |         £36,400 or more    36400        .    36400 |
                   86. |         £36,400 or more    36400        .    36400 |
                   87. |         £36,400 or more    36400        .    36400 |
                   88. | £33800 less than £36400    33800    36400    35100 |
                   89. |                 Refused        .        .        . |
                   90. |   £8320 less than £9360     8320     9360     8840 |
                   91. | £12480 less than £13520    12480    13520    13000 |
                   92. | £12480 less than £13520    12480    13520    13000 |
                   93. | £14560 less than £15600    14560    15600    15080 |
                   94. |   £8320 less than £9360     8320     9360     8840 |
                   95. | £13520 less than £14560    13520    14560    14040 |
                   96. | £19760 less than £20800    19760    20800    20280 |
                   97. | £10400 less than £11440    10400    11440    10920 |
                   98. |         £36,400 or more    36400        .    36400 |
                   99. |              Don't know        .        .        . |
                  100. | £20800 less than £23400    20800    23400    22100 |
                       +----------------------------------------------------+
                  
                  .
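
                  One case not present in this sample is the bottom band from the table in #1, labelled "Less than £520". Its label also contains a single number, so the code above would return 520 for it rather than a midpoint. A possible adjustment, assuming that band starts at zero (the labels do not say so), is sketched below; the variable name bandlabel is invented for illustration.

                  Code:
                  * assumption: income in the bottom band runs from 0 up to the stated limit
                  decode W1inc1est, gen(bandlabel)
                  * only the bottom band's label begins with "Less than"
                  replace wanted = limit1/2 if strpos(lower(bandlabel), "less than") == 1
                  drop bandlabel

                  On the 100 observations shown here this changes nothing, because none of them falls in the bottom band.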



                  • #10
                    Originally posted by Andrew Musau
                    There are no mid-points for intervals defined as \(amount>X\). So I just return the amount below.

                    Hi again! Thanks so much for your help. I did not know these commands, so I will make sure to go through them thoroughly. Thanks a mil, you really helped!



                    • #11
                      I came across this thread while trying to do something similar: assigning numeric values to an ordinal income variable whose existing values are the first k integers but whose value labels are income bands of varying width. Many American social research datasets code income this way, so this seems like a useful general tool to me. Below are the data (from the General Social Survey 1994), followed by my code.

                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input float rincom91
                      22
                      11
                      12
                       9
                      14
                       9
                      12
                      12
                      22
                      18
                      16
                      21
                      18
                      18
                       9
                       9
                      20
                      15
                      19
                      16
                      10
                      21
                      18
                      22
                      21
                      14
                      18
                       2
                      14
                      12
                      18
                      10
                      12
                      15
                      16
                      22
                      22
                      15
                      10
                      16
                      22
                       9
                       8
                       5
                      21
                      21
                      21
                      21
                       8
                       7
                      22
                      20
                      22
                       2
                      22
                      19
                      20
                      21
                      19
                       2
                      11
                      18
                      12
                       3
                      16
                      13
                      22
                       3
                       9
                       4
                      22
                      15
                      12
                      14
                      10
                      13
                      18
                      13
                      19
                       4
                      21
                      15
                      18
                      17
                      17
                      19
                       8
                      15
                      18
                      16
                       8
                      16
                      12
                      12
                      20
                      18
                      19
                      15
                      21
                      14
                      end
                      label values rincom91 rincom91
                      label def rincom91 2 "$1000-2999", modify
                      label def rincom91 3 "$3000-3999", modify
                      label def rincom91 4 "$4000-4999", modify
                      label def rincom91 5 "$5000-5999", modify
                      label def rincom91 7 "$7000-7999", modify
                      label def rincom91 8 "$8000-9999", modify
                      label def rincom91 9 "$10000-12499", modify
                      label def rincom91 10 "$12500-14999", modify
                      label def rincom91 11 "$15000-17499", modify
                      label def rincom91 12 "$17500-19999", modify
                      label def rincom91 13 "$20000-22499", modify
                      label def rincom91 14 "$22500-24999", modify
                      label def rincom91 15 "$25000-29999", modify
                      label def rincom91 16 "$30000-34999", modify
                      label def rincom91 17 "$35000-39999", modify
                      label def rincom91 18 "$40000-49999", modify
                      label def rincom91 19 "$50000-59999", modify
                      label def rincom91 20 "$60000-74999", modify
                      label def rincom91 21 "$75000+", modify
                      label def rincom91 22 "refused", modify
                      The following code will need some modification depending on the dataset, of course. Sorry for posting an answer that works specifically for my data, but hopefully it's still useful (you might want to use the word() function).

                      Code:
                      gen realinc = .
                              
                              lab def rincom91 1 "$0-1000", modify
                              
                              * Here's a fast way to take value labels and do midpoint imputation 
                              * using them, inspired to some extent by this answer on Stackexchange: 
                                  * https://archive.ph/HxE2G
                              
                              levelsof rincom91, local(levs)
                                  * This stores all possible values of the variable in a local 
                                  * called "levs"
                              local inclab : value label rincom91
                                  * We also put the name of the value label into a local, using
                                  * Stata's extended macro functions, which allow the syntax above
                                  * (for more on this, see p. 4 onward of:
                                      * https://www.stata.com/manuals13/pmacro.pdf)
                      
                              foreach k of local levs {
                                  local strlab : label `inclab' `k'
                                      * What's happening here? We access the string label associated
                                          * with numeric value k, for k in the set of all possible
                                          * values of income
                                  local n2 = strrpos("`strlab'", "-") - 2
                                      * Now, I want to turn that string into two different numbers,
                                      * but the problem is that the numbers are of variable length,
                                      * so I just say "tell me at what position in the string the
                                      * hyphen occurs" (strrpos() returns the position of the last
                                      * occurrence, and each label here has only one hyphen) and
                                      * store that position minus two as n2. I subtract two
                                      * for reasons mentioned below
                                  local lb = substr("`strlab'", 2, `n2')
                                      * Now, I get the lower bound of the interval by taking the sub-
                                      * string from the second position of the string label for a 
                                      * length of n2. Now we see why I took away two above in defining
                                      * n2: I didn't want to include the hyphen itself, and I was 
                                      * starting the string after the first character, a dollar sign
                                  local ub = substr("`strlab'", `n2'+3, strlen("`strlab'"))
                                      * Now I get the ub. I add three to n2 to start the second string
                                      * after the hyphen and then go until the end of the string label
                                      * using the length of the string
                                  replace realinc = round((real("`lb'") +  real("`ub'"))/2, 1)  ///
                                      if rincom91 == `k'
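                                      * Finally, I replace realinc with the rounded average of my
                                      * two values. Labels without a hyphen (the open-ended top band
                                      * and "refused") do not convert to numbers with real(), so
                                      * realinc stays missing for those categories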
                              }
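
                      A possible variation on the loop above, as a minimal sketch rather than a tested replacement: decode the labels into a string variable and pull out both bounds with a regular expression, which avoids counting character positions. The toy data and the names inc, inclab, inc_str, lb, ub and midpt are made up for illustration. It assumes every closed band is labelled as digits, a hyphen, then digits.

```stata
* Toy data: a few of the GSS-style codes (hypothetical variable name inc)
clear
input byte inc
2
9
21
22
end
* The backslash stops Stata from reading the dollar sign as a global macro
label define inclab 2 "\$1000-2999" 9 "\$10000-12499" 21 "\$75000+" 22 "refused"
label values inc inclab

* Copy each observation's label text into a string variable
decode inc, generate(inc_str)

* regexm() flags labels of the form digits-hyphen-digits;
* regexs(1) and regexs(2) return the two captured numbers as strings
generate double lb = real(regexs(1)) if regexm(inc_str, "([0-9]+)-([0-9]+)")
generate double ub = real(regexs(2)) if regexm(inc_str, "([0-9]+)-([0-9]+)")

* Midpoint of each closed band; open-ended and non-numeric labels stay missing
generate double midpt = round((lb + ub)/2, 1)

list inc inc_str lb ub midpt
```

                      Categories such as the open-ended top band and "refused" do not match the pattern, so they are left missing here. How to treat the top band (for example, using its lower bound, as in the listing earlier in the thread) is a separate choice.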
