
  • Computing the quantiles of a variable based on a variable of percentiles

    Hello.
    I have a panel with regions and years. My main variable of interest, index, is missing for some regions in some years. I am trying to impute those missing values.


    To do so, I've calculated the percentile of index and I have calculated its mean for each region within each year. For example:
    Code:
    .  list region_id year index perc mean_perc, nol
    
       
         +---------------------------------------------------+
         | region~d   year       index       perc   mean_p~c |
         |---------------------------------------------------|
      1. |        1   1990   -.0879496   .6528497   .2710086 |
      2. |        1   1991   -.4667637   .0351759   .2710086 |
      3. |        1   1992   -.1709576       .125   .2710086 |
      4. |        1   1993           .          .   .2710086 |
      5. |        2   1990    -.462625   .1398964   .3006104 |
         |---------------------------------------------------|
      6. |        2   1991   -.0563047   .3869347   .3006104 |
      7. |        2   1992    .1408911       .375   .3006104 |
      8. |        2   1993           .          .   .3006104 |
      9. |        3   1990   -.3460146   .2746114   .3954145 |
     10. |        3   1991   -.0690994   .3718593   .3954145 |
         |---------------------------------------------------|
     11. |        3   1992    .3073938   .5397727   .3954145 |
     12. |        3   1993           .          .   .3954145 |
     13. |        4   1990   -.6537067   .0259067    .125898 |
     14. |        4   1991   -.1824378   .2211055    .125898 |
     15. |        4   1992   -.1649489   .1306818    .125898 |
         |---------------------------------------------------|
     16. |        4   1993           .          .    .125898 |
     17. |        5   1990   -.5772001   .0518135   .1571086 |
     18. |        5   1991   -.0987434   .3115578   .1571086 |
     19. |        5   1992   -.1815233   .1079545   .1571086 |
     20. |        5   1993           .          .   .1571086 |
         |---------------------------------------------------|
     21. |        6   1990   -.5690967   .0673575    .207459 |
     22. |        6   1991   -.1751851   .2311558    .207459 |
     23. |        6   1992    .0894242   .3238636    .207459 |
     24. |        6   1993           .          .    .207459 |
     25. |        7   1990   -.6956265    .015544    .139356 |
         |---------------------------------------------------|
     26. |        7   1991   -.4338617   .0502513    .139356 |
     27. |        7   1992    .1256393   .3522727    .139356 |
     28. |        7   1993           .          .    .139356 |
     29. |        8   1990   -.6792535   .0207254   .2363321 |
     30. |        8   1991    -.026226   .4723618   .2363321 |
         |---------------------------------------------------|
     31. |        8   1992    -.058143   .2159091   .2363321 |
     32. |        8   1993           .          .   .2363321 |
         +---------------------------------------------------+
    
    . sum index perc mean_perc
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
           index |         24   -.2288466    .2823488  -.6956265   .3073938
            perc |         24    .2291484    .1788557    .015544   .6528497
       mean_perc |         32    .2291484    .0872065    .125898   .3954145
    I want to create a variable (x) that contains the quantile value of index corresponding to the mean percentile (mean_perc) for each region within each year. That is, x should contain the value of index that corresponds to the percentile in mean_perc, so, for instance, if mean_perc = 0.5, x should indicate what value of index is at the median; if mean_perc = 0.25, x should indicate what value of index would represent the 25th percentile.

    I know that in R, this can be achieved using this command:
    Code:
      data <- within(data, imputed <- quantile(index, c(mean_perc), na.rm = TRUE))
    but I'm trying to find a way to achieve this in Stata.
    Thanks!

  • #2
    Consider

    Code:
    sysuse auto, clear
    _pctile mpg, nq(100)
    return list
    Res.:

    Code:
    scalars:
                     r(r1) =  12
                     r(r2) =  12
                     r(r3) =  14
                     r(r4) =  14
                     r(r5) =  14
                     r(r6) =  14
                     r(r7) =  14
                     r(r8) =  14
                     r(r9) =  14
                    r(r10) =  14
                    r(r11) =  15
                    r(r12) =  15
                    r(r13) =  15
                    r(r14) =  16
                    r(r15) =  16
                    r(r16) =  16
                    r(r17) =  16
                    r(r18) =  16
                    r(r19) =  17
                    r(r20) =  17
                    r(r99) =  41
                    r(r98) =  35
                    r(r97) =  35
                    r(r96) =  35
                    r(r95) =  34
                    r(r94) =  31
                    r(r93) =  30
                    r(r92) =  30
                    r(r91) =  30
                    r(r90) =  29
                    r(r89) =  28
                    r(r88) =  28
                    r(r87) =  28
                    r(r86) =  28
                    r(r85) =  26
                    r(r84) =  26
                    r(r83) =  26
                    r(r82) =  26
                    r(r81) =  25
                    r(r80) =  25
                    r(r79) =  25
                    r(r78) =  25
                    r(r77) =  25
                    r(r76) =  25
                    r(r75) =  25
                    r(r74) =  24
                    r(r73) =  24
                    r(r72) =  24
                    r(r71) =  24
                    r(r70) =  24
                    r(r69) =  24
                    r(r68) =  23
                    r(r67) =  23
                    r(r66) =  23
                    r(r65) =  23
                    r(r64) =  22
                    r(r63) =  22
                    r(r62) =  22
                    r(r61) =  22
                    r(r60) =  22
                    r(r59) =  22
                    r(r58) =  21
                    r(r57) =  21
                    r(r56) =  21
                    r(r55) =  21
                    r(r54) =  21
                    r(r53) =  21
                    r(r52) =  21
                    r(r51) =  20
                    r(r50) =  20
                    r(r49) =  20
                    r(r48) =  20
                    r(r47) =  19
                    r(r46) =  19
                    r(r45) =  19
                    r(r44) =  19
                    r(r43) =  19
                    r(r42) =  19
                    r(r41) =  19
                    r(r40) =  19
                    r(r39) =  19
                    r(r38) =  19
                    r(r37) =  19
                    r(r36) =  18
                    r(r35) =  18
                    r(r34) =  18
                    r(r33) =  18
                    r(r32) =  18
                    r(r31) =  18
                    r(r30) =  18
                    r(r29) =  18
                    r(r28) =  18
                    r(r27) =  18
                    r(r26) =  18
                    r(r25) =  18
                    r(r24) =  17
                    r(r23) =  17
                    r(r22) =  17
                    r(r21) =  17
    So one approach would be to pick up the value from r(). That would entail some rounding to the nearest percentile.
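
    A minimal sketch of that lookup, assuming mean_perc is a proportion between 0 and 1 and that the quantiles of index are taken over all nonmissing observations. The toy data are only the first two regions from the listing in #1, and the names k and x are illustrative, not part of the original data.

```stata
* Toy data: first two regions from the listing in #1 (illustration only)
clear
input region_id year index mean_perc
1 1990 -.0879496 .2710086
1 1991 -.4667637 .2710086
1 1992 -.1709576 .2710086
1 1993 .         .2710086
2 1990 -.462625  .3006104
2 1991 -.0563047 .3006104
2 1992  .1408911 .3006104
2 1993 .         .3006104
end

* Percentiles 1..99 of index, computed over the nonmissing observations
_pctile index, nq(100)

* Copy r(r1)..r(r99) into locals before another command can overwrite r()
forvalues i = 1/99 {
    local q`i' = r(r`i')
}

* Round mean_perc (assumed to be a proportion) to the nearest whole percentile
generate int k = round(100 * mean_perc)

* Pick up the matching percentile; x stays missing if k falls outside 1..99
generate double x = .
forvalues i = 1/99 {
    quietly replace x = `q`i'' if k == `i'
}

list region_id year index mean_perc k x, noobs sepby(region_id)
```

    If the rounding matters, _pctile also takes a percentiles() option that accepts non-integer values, so something like _pctile index, percentiles(27.10086) inside a loop over the distinct values of mean_perc might avoid it. That variant is untested here.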
