  • Combine coded variables on age

    Dear all,

    I have panel data with "age" as one of my variables. The original data includes individual age in increments of 5 years (e.g. 25-30yrs; 30-35yrs). But I want to recode the data in increments of 10, such that I can have e.g. 20-30yrs. If the data is coded (i.e. 25-30 is coded as 5 and 30-35 coded as 6), how can I achieve this goal?

    Thank you in advance!

    Kind Regards,
    Guest
    Last edited by sladmin; 25 Apr 2022, 08:28. Reason: anonymize original poster

  • #2
    Do you have individual-level ages, or are they merely in categories?



    • #3
      Code:
      help recode
      but this will only work if my assumptions about your data are correct; please read the FAQ and follow its advice on showing data examples using -dataex-



      • #4
        Thank you, Jared Greathouse and Rich Goldstein, for your prompt responses. Here is the data example:

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte age long _freq
         1     9
         2 10710
         3 11698
         4 26503
         5 24738
         6 27459
         7 30617
         8 33956
         9 35862
        10 34477
        11 31745
        12 29500
        13 86965
         .    19
        end
        label values age b_agegr13_dv
        label def b_agegr13_dv 1 "0-15 years old", modify
        label def b_agegr13_dv 2 "16-17 years old", modify
        label def b_agegr13_dv 3 "18-19 years old", modify
        label def b_agegr13_dv 4 "20-24 years old", modify
        label def b_agegr13_dv 5 "25-29 years old", modify
        label def b_agegr13_dv 6 "30-34 years old", modify
        label def b_agegr13_dv 7 "35-39 years old", modify
        label def b_agegr13_dv 8 "40-44 years old", modify
        label def b_agegr13_dv 9 "45-49 years old", modify
        label def b_agegr13_dv 10 "50-54 years old", modify
        label def b_agegr13_dv 11 "55-59 years old", modify
        label def b_agegr13_dv 12 "60-64 years old", modify
        label def b_agegr13_dv 13 "65 years or older", modify



        • #5
          It should be easier to create a new variable rather than decoding the current variable. Assuming that your continuous age variable is named "age", here is an example that uses the -ceil()- function:

          Code:
          set obs 2000
          set seed 03302022
          gen age = runiformint(0, 120)
          *START HERE
          gen agecat= ceil(age/10)
          replace agecat=1 if age==0
          replace agecat= 8 if age>70 & !missing(age)
          local agelab
          local i 0
          forval j=1/7{
              local agelab `"`agelab' `j' "`i' - `=`i'+10' years old""'
              local i= 10+`i'
          }
          lab def agelab `agelab' 8 "71 years or older"
          label values agecat agelab
          tab agecat
          list age agecat in 1/20, sep(0)
          Results:

          Code:
          . 
          . tab agecat
          
                     agecat |      Freq.     Percent        Cum.
          ------------------+-----------------------------------
           0 - 10 years old |        192        9.60        9.60
          10 - 20 years old |        159        7.95       17.55
          20 - 30 years old |        154        7.70       25.25
          30 - 40 years old |        154        7.70       32.95
          40 - 50 years old |        151        7.55       40.50
          50 - 60 years old |        161        8.05       48.55
          60 - 70 years old |        199        9.95       58.50
          71 years or older |        830       41.50      100.00
          ------------------+-----------------------------------
                      Total |      2,000      100.00
          
          . 
          . list age agecat in 1/20, sep(0)
          
               +-------------------------+
               | age              agecat |
               |-------------------------|
            1. |  24   20 - 30 years old |
            2. |  27   20 - 30 years old |
            3. |  61   60 - 70 years old |
            4. |  33   30 - 40 years old |
            5. | 104   71 years or older |
            6. |  16   10 - 20 years old |
            7. | 118   71 years or older |
            8. |  47   40 - 50 years old |
            9. |   5    0 - 10 years old |
           10. |  82   71 years or older |
           11. |   3    0 - 10 years old |
           12. |  75   71 years or older |
           13. |   6    0 - 10 years old |
           14. | 104   71 years or older |
           15. |  76   71 years or older |
           16. |  54   50 - 60 years old |
           17. |  65   60 - 70 years old |
           18. |  51   50 - 60 years old |
           19. |  18   10 - 20 years old |
           20. |  98   71 years or older |
               +-------------------------+
          
          .
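
          A note on the labels above: ceil(age/10) puts ages 1-10 in category 1, ages 11-20 in category 2, and so on, so an age of exactly 20 ends up under "10 - 20 years old" and not under "20 - 30 years old". A minimal sketch of non-overlapping labels, assuming agecat was created as above (agelab2 is a hypothetical label name, chosen so it does not clash with agelab):

          Code:
          * Category 1 holds ages 0-10 because of the replace for age==0
          local agelab2 `"1 "0 - 10 years old""'
          forvalues j = 2/7 {
              local lo = 10*(`j' - 1) + 1
              local hi = 10*`j'
              local agelab2 `"`agelab2' `j' "`lo' - `hi' years old""'
          }
          * Drop any earlier definition so the block can be rerun without r(110)
          capture label drop agelab2
          label define agelab2 `agelab2' 8 "71 years or older"
          label values agecat agelab2
          tab agecat

          Only the labels change; the category codes and frequencies stay the same.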



          • #6
            Thank you, Andrew Musau, for your help! I have entered the code you generated, but my output seems wrong because now all ages are between 0-10. This approach is new to me - have I gone wrong somewhere? Thank you for your time!

            Code:
            gen agecat= ceil(age/10)
            (19 missing values generated)
            
            . replace agecat=1 if age==0
            (0 real changes made)
            
            . replace agecat= 8 if age>70 & !missing(age)
            (0 real changes made)
            
            . local agelab
            
            . local i 0
            
            . forval j=1/7{
              2.     local agelab `"`agelab' `j' "`i' - `=`i'+10' years old""'
              3.     local i= 10+`i'
              4. }
            
            . lab def agelab `agelab' 8 "71 years or older"
            label agelab already defined
            r(110);
            
            . label values agecat agelab
            
            . tab agecat
            
                       agecat |      Freq.     Percent        Cum.
            ------------------+-----------------------------------
             0 - 10 years old |    236,029       61.43       61.43
                            2 |    148,210       38.57      100.00
            ------------------+-----------------------------------
                        Total |    384,239      100.00
            
            . list age agecat in 1/20, sep(0)
            
                 +-----------------------------+
                 |      age             agecat |
                 |-----------------------------|
              1. | 25-29 ye   0 - 10 years old |
              2. | 25-29 ye   0 - 10 years old |
              3. | 25-29 ye   0 - 10 years old |
              4. | 30-34 ye   0 - 10 years old |
              5. | 30-34 ye   0 - 10 years old |
              6. | 30-34 ye   0 - 10 years old |
              7. | 30-34 ye   0 - 10 years old |
              8. | 30-34 ye   0 - 10 years old |
              9. | 35-39 ye   0 - 10 years old |
             10. | 35-39 ye   0 - 10 years old |
             11. | 35-39 ye   0 - 10 years old |
             12. | 35-39 ye   0 - 10 years old |
             13. | 40-44 ye   0 - 10 years old |
             14. | 40-44 ye   0 - 10 years old |
             15. | 40-44 ye   0 - 10 years old |
             16. | 30-34 ye   0 - 10 years old |
             17. | 30-34 ye   0 - 10 years old |
             18. | 30-34 ye   0 - 10 years old |
             19. | 35-39 ye   0 - 10 years old |
             20. | 35-39 ye   0 - 10 years old |
                 +-----------------------------+
            
            . tab agecat
            
                       agecat |      Freq.     Percent        Cum.
            ------------------+-----------------------------------
             0 - 10 years old |    236,029       61.43       61.43
                            2 |    148,210       38.57      100.00
            ------------------+-----------------------------------
                        Total |    384,239      100.00



            • #7
              Apologies Guest, I was under the impression that you had a continuous age variable.
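
              The output in #6 is what ceil(age/10) returns when it is applied to the stored codes 1-13 rather than to ages in years: codes 1-10 give 1 and codes 11-13 give 2, which matches the two rows in your table. The r(110) error appeared because a label named agelab already existed in memory, presumably from an earlier attempt. A minimal sketch for checking what a labelled variable actually stores, assuming the variable and label names from your dataex example:

              Code:
              * Storage type and attached value label
              describe age
              * Mapping from stored codes to label text
              label list b_agegr13_dv
              * Frequencies by stored code instead of by label
              tabulate age, nolabel missing
              * Remove the leftover label before re-running a -label define-
              capture label drop agelab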

              The original data includes individual age in increments of 5 years (e.g. 25-30yrs; 30-35yrs). But I want to recode the data in increments of 10, such that I can have e.g. 20-30yrs. If the data is coded (i.e. 25-30 is coded as 5 and 30-35 coded as 6), how can I achieve this goal?
              Once a continuous variable is categorized, you are constrained to the cutoffs chosen by whoever categorized it. In other words, there is no way to extract the age group 10-20 years from a variable whose categories are 0-15 years and 16-24 years, because we do not know the distribution of ages within a category. That is why it is often said that categorizing continuous variables throws away information. That said, working with the available cut-points, something along the lines of the following, using recode as Rich suggested, would work:

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input byte age long _freq
               1     9
               2 10710
               3 11698
               4 26503
               5 24738
               6 27459
               7 30617
               8 33956
               9 35862
              10 34477
              11 31745
              12 29500
              13 86965
               .    19
              end
              label values age b_agegr13_dv
              label def b_agegr13_dv 1 "0-15 years old", modify
              label def b_agegr13_dv 2 "16-17 years old", modify
              label def b_agegr13_dv 3 "18-19 years old", modify
              label def b_agegr13_dv 4 "20-24 years old", modify
              label def b_agegr13_dv 5 "25-29 years old", modify
              label def b_agegr13_dv 6 "30-34 years old", modify
              label def b_agegr13_dv 7 "35-39 years old", modify
              label def b_agegr13_dv 8 "40-44 years old", modify
              label def b_agegr13_dv 9 "45-49 years old", modify
              label def b_agegr13_dv 10 "50-54 years old", modify
              label def b_agegr13_dv 11 "55-59 years old", modify
              label def b_agegr13_dv 12 "60-64 years old", modify
              label def b_agegr13_dv 13 "65 years or older", modify
              
              * Collapse the 13 original codes into 7 groups; codes 1 and 2 are left as they are,
              * so 2 (16-17) joins 3 and 4 under the new code 2
              recode age (3 4 = 2) (5 6 = 3) (7 8 = 4) (9 10 = 5) (11 12 = 6) (13 = 7), gen(agecat)
              lab def agecat 1 "0-15 years old" 2 "16-24 years old" 3 "25-34 years old" 4 "35-44 years old" 5 "45-54 years old" 6 "55-64 years old" 7 "65 years or older"
              lab values agecat agecat
              tab agecat
              l age agecat, sepby(agecat)
              Results:

              Code:
              . tab agecat
              
                  RECODE of age |      Freq.     Percent        Cum.
              ------------------+-----------------------------------
                 0-15 years old |          1        7.69        7.69
                16-24 years old |          3       23.08       30.77
                25-34 years old |          2       15.38       46.15
                35-44 years old |          2       15.38       61.54
                45-54 years old |          2       15.38       76.92
                55-64 years old |          2       15.38       92.31
              65 years or older |          1        7.69      100.00
              ------------------+-----------------------------------
                          Total |         13      100.00
              
              .
              . l age agecat, sepby(agecat)
              
                   +---------------------------------------+
                   |               age              agecat |
                   |---------------------------------------|
                1. |    0-15 years old      0-15 years old |
                   |---------------------------------------|
                2. |   16-17 years old     16-24 years old |
                3. |   18-19 years old     16-24 years old |
                4. |   20-24 years old     16-24 years old |
                   |---------------------------------------|
                5. |   25-29 years old     25-34 years old |
                6. |   30-34 years old     25-34 years old |
                   |---------------------------------------|
                7. |   35-39 years old     35-44 years old |
                8. |   40-44 years old     35-44 years old |
                   |---------------------------------------|
                9. |   45-49 years old     45-54 years old |
               10. |   50-54 years old     45-54 years old |
                   |---------------------------------------|
               11. |   55-59 years old     55-64 years old |
               12. |   60-64 years old     55-64 years old |
                   |---------------------------------------|
               13. | 65 years or older   65 years or older |
                   |---------------------------------------|
               14. |                 .                   . |
                   +---------------------------------------+
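
              The frequencies in the table above count the rows of the dataex example, which holds one row per category with the observation counts in _freq. A sketch of a cross-check that weights by _freq, together with a variant that attaches the labels inside recode itself. It assumes the example data above are in memory and that the code is run from a do-file (because of the /// line continuations); agecat2 is a hypothetical name:

              Code:
              recode age (1 = 1 "0-15 years old") (2/4 = 2 "16-24 years old") ///
                  (5 6 = 3 "25-34 years old") (7 8 = 4 "35-44 years old") ///
                  (9 10 = 5 "45-54 years old") (11 12 = 6 "55-64 years old") ///
                  (13 = 7 "65 years or older"), generate(agecat2) label(agecat2)
              * Each original code should fall in exactly one column of the new variable
              tabulate age agecat2 [fweight = _freq], missing

              On a dataset where each row is one observation, the weight is not needed.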



              • #8
                Thank you, Andrew Musau, that code worked perfectly!!
