Computing the quantiles of a variable based on a variable of percentiles

natalia carralero

Join Date: Jan 2020
Posts: 5

07 Jun 2024, 04:04

Hello.
I have a panel with regions and years. My main variable of interest, index, is missing for some regions in some years. I am trying to impute those missing values.

To do so, I have calculated the percentile of index (perc) and then its mean (mean_perc) for each region within each year. For example:

Code:

.  list region_id year index perc mean_perc, nol

   
     +---------------------------------------------------+
     | region~d   year       index       perc   mean_p~c |
     |---------------------------------------------------|
  1. |        1   1990   -.0879496   .6528497   .2710086 |
  2. |        1   1991   -.4667637   .0351759   .2710086 |
  3. |        1   1992   -.1709576       .125   .2710086 |
  4. |        1   1993           .          .   .2710086 |
  5. |        2   1990    -.462625   .1398964   .3006104 |
     |---------------------------------------------------|
  6. |        2   1991   -.0563047   .3869347   .3006104 |
  7. |        2   1992    .1408911       .375   .3006104 |
  8. |        2   1993           .          .   .3006104 |
  9. |        3   1990   -.3460146   .2746114   .3954145 |
 10. |        3   1991   -.0690994   .3718593   .3954145 |
     |---------------------------------------------------|
 11. |        3   1992    .3073938   .5397727   .3954145 |
 12. |        3   1993           .          .   .3954145 |
 13. |        4   1990   -.6537067   .0259067    .125898 |
 14. |        4   1991   -.1824378   .2211055    .125898 |
 15. |        4   1992   -.1649489   .1306818    .125898 |
     |---------------------------------------------------|
 16. |        4   1993           .          .    .125898 |
 17. |        5   1990   -.5772001   .0518135   .1571086 |
 18. |        5   1991   -.0987434   .3115578   .1571086 |
 19. |        5   1992   -.1815233   .1079545   .1571086 |
 20. |        5   1993           .          .   .1571086 |
     |---------------------------------------------------|
 21. |        6   1990   -.5690967   .0673575    .207459 |
 22. |        6   1991   -.1751851   .2311558    .207459 |
 23. |        6   1992    .0894242   .3238636    .207459 |
 24. |        6   1993           .          .    .207459 |
 25. |        7   1990   -.6956265    .015544    .139356 |
     |---------------------------------------------------|
 26. |        7   1991   -.4338617   .0502513    .139356 |
 27. |        7   1992    .1256393   .3522727    .139356 |
 28. |        7   1993           .          .    .139356 |
 29. |        8   1990   -.6792535   .0207254   .2363321 |
 30. |        8   1991    -.026226   .4723618   .2363321 |
     |---------------------------------------------------|
 31. |        8   1992    -.058143   .2159091   .2363321 |
 32. |        8   1993           .          .   .2363321 |
     +---------------------------------------------------+

. sum index perc mean_perc

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       index |         24   -.2288466    .2823488  -.6956265   .3073938
        perc |         24    .2291484    .1788557    .015544   .6528497
   mean_perc |         32    .2291484    .0872065    .125898   .3954145

I want to create a variable (x) that contains the quantile value of index corresponding to the mean percentile (mean_perc) for each region within each year. That is, x should contain the value of index that corresponds to the percentile in mean_perc, so, for instance, if mean_perc = 0.5, x should indicate what value of index is at the median; if mean_perc = 0.25, x should indicate what value of index would represent the 25th percentile.

I know that in R, this can be achieved using this command:

Code:

  data <- within(data, imputed <- quantile(index, c(mean_perc), na.rm = TRUE))

but I'm trying to find a way to achieve this in Stata.
Thanks!

Andrew Musau

Join Date: Oct 2014
Posts: 10195

07 Jun 2024, 07:54

Consider

Code:

sysuse auto, clear
_pctile mpg, nq(100)
return list

Res.:

Code:

scalars:
                 r(r1) =  12
                 r(r2) =  12
                 r(r3) =  14
                 r(r4) =  14
                 r(r5) =  14
                 r(r6) =  14
                 r(r7) =  14
                 r(r8) =  14
                 r(r9) =  14
                r(r10) =  14
                r(r11) =  15
                r(r12) =  15
                r(r13) =  15
                r(r14) =  16
                r(r15) =  16
                r(r16) =  16
                r(r17) =  16
                r(r18) =  16
                r(r19) =  17
                r(r20) =  17
                r(r99) =  41
                r(r98) =  35
                r(r97) =  35
                r(r96) =  35
                r(r95) =  34
                r(r94) =  31
                r(r93) =  30
                r(r92) =  30
                r(r91) =  30
                r(r90) =  29
                r(r89) =  28
                r(r88) =  28
                r(r87) =  28
                r(r86) =  28
                r(r85) =  26
                r(r84) =  26
                r(r83) =  26
                r(r82) =  26
                r(r81) =  25
                r(r80) =  25
                r(r79) =  25
                r(r78) =  25
                r(r77) =  25
                r(r76) =  25
                r(r75) =  25
                r(r74) =  24
                r(r73) =  24
                r(r72) =  24
                r(r71) =  24
                r(r70) =  24
                r(r69) =  24
                r(r68) =  23
                r(r67) =  23
                r(r66) =  23
                r(r65) =  23
                r(r64) =  22
                r(r63) =  22
                r(r62) =  22
                r(r61) =  22
                r(r60) =  22
                r(r59) =  22
                r(r58) =  21
                r(r57) =  21
                r(r56) =  21
                r(r55) =  21
                r(r54) =  21
                r(r53) =  21
                r(r52) =  21
                r(r51) =  20
                r(r50) =  20
                r(r49) =  20
                r(r48) =  20
                r(r47) =  19
                r(r46) =  19
                r(r45) =  19
                r(r44) =  19
                r(r43) =  19
                r(r42) =  19
                r(r41) =  19
                r(r40) =  19
                r(r39) =  19
                r(r38) =  19
                r(r37) =  19
                r(r36) =  18
                r(r35) =  18
                r(r34) =  18
                r(r33) =  18
                r(r32) =  18
                r(r31) =  18
                r(r30) =  18
                r(r29) =  18
                r(r28) =  18
                r(r27) =  18
                r(r26) =  18
                r(r25) =  18
                r(r24) =  17
                r(r23) =  17
                r(r22) =  17
                r(r21) =  17

So one approach would be to pick up the value from the r() results that _pctile leaves behind. That would entail rounding mean_perc to the nearest whole percentile.
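
A minimal sketch of that lookup, not tested against your full data. It assumes mean_perc is a proportion between 0 and 1 and that the quantiles are taken over the whole sample, as in your R call. The toy data are the first two regions from your listing; p is a helper variable I made up for illustration, and x is the name you used for the result.

```stata
* Toy data copied from the first two regions in the listing above
clear
input byte region_id int year double index double mean_perc
1 1990 -.0879496 .2710086
1 1991 -.4667637 .2710086
1 1992 -.1709576 .2710086
1 1993 .         .2710086
2 1990 -.462625  .3006104
2 1991 -.0563047 .3006104
2 1992 .1408911  .3006104
2 1993 .         .3006104
end

* 99 percentile cut points of index (missing values of index are ignored)
_pctile index, nq(100)

* Copy r() into locals so that later commands cannot overwrite the results
forvalues i = 1/99 {
    local q`i' = r(r`i')
}

* Round mean_perc to the nearest whole percentile, clamped to 1..99
* (p is a hypothetical helper variable; the if qualifier keeps it missing
*  wherever mean_perc is missing)
generate byte p = min(99, max(1, round(100 * mean_perc))) if !missing(mean_perc)

* Look up the matching quantile of index for each observation
generate double x = .
forvalues i = 1/99 {
    quietly replace x = `q`i'' if p == `i'
}

list region_id year index mean_perc p x, noobs
```

If rounding to whole percentiles is too coarse, _pctile also has a percentiles() option, which as far as I know accepts noninteger values, so you could loop over the distinct values of mean_perc instead. For quantiles within each year, the same steps could be wrapped in a loop over years with an if qualifier on _pctile.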
