Hello.
I have a panel with regions and years. My main variable of interest, index, is missing for some regions in some years. I am trying to impute those missing values.
To do so, I've calculated the percentile of index (perc) and then its mean for each region (mean_perc); in the listing below, mean_perc is constant across years within each region.
I want to create a variable (x) that contains the quantile of index corresponding to the mean percentile (mean_perc) for each region within each year. That is, x should hold the value of index at the percentile given by mean_perc: if mean_perc = 0.5, x should be the median of index; if mean_perc = 0.25, x should be the 25th percentile of index.
I'm trying to find a way to achieve this in Stata.
Code:
. list region_id year index perc mean_perc, nol

     +---------------------------------------------------+
     | region~d   year       index       perc   mean_p~c |
     |---------------------------------------------------|
  1. |        1   1990   -.0879496   .6528497   .2710086 |
  2. |        1   1991   -.4667637   .0351759   .2710086 |
  3. |        1   1992   -.1709576       .125   .2710086 |
  4. |        1   1993           .          .   .2710086 |
  5. |        2   1990    -.462625   .1398964   .3006104 |
     |---------------------------------------------------|
  6. |        2   1991   -.0563047   .3869347   .3006104 |
  7. |        2   1992    .1408911       .375   .3006104 |
  8. |        2   1993           .          .   .3006104 |
  9. |        3   1990   -.3460146   .2746114   .3954145 |
 10. |        3   1991   -.0690994   .3718593   .3954145 |
     |---------------------------------------------------|
 11. |        3   1992    .3073938   .5397727   .3954145 |
 12. |        3   1993           .          .   .3954145 |
 13. |        4   1990   -.6537067   .0259067    .125898 |
 14. |        4   1991   -.1824378   .2211055    .125898 |
 15. |        4   1992   -.1649489   .1306818    .125898 |
     |---------------------------------------------------|
 16. |        4   1993           .          .    .125898 |
 17. |        5   1990   -.5772001   .0518135   .1571086 |
 18. |        5   1991   -.0987434   .3115578   .1571086 |
 19. |        5   1992   -.1815233   .1079545   .1571086 |
 20. |        5   1993           .          .   .1571086 |
     |---------------------------------------------------|
 21. |        6   1990   -.5690967   .0673575    .207459 |
 22. |        6   1991   -.1751851   .2311558    .207459 |
 23. |        6   1992    .0894242   .3238636    .207459 |
 24. |        6   1993           .          .    .207459 |
 25. |        7   1990   -.6956265    .015544    .139356 |
     |---------------------------------------------------|
 26. |        7   1991   -.4338617   .0502513    .139356 |
 27. |        7   1992    .1256393   .3522727    .139356 |
 28. |        7   1993           .          .    .139356 |
 29. |        8   1990   -.6792535   .0207254   .2363321 |
 30. |        8   1991    -.026226   .4723618   .2363321 |
     |---------------------------------------------------|
 31. |        8   1992    -.058143   .2159091   .2363321 |
 32. |        8   1993           .          .   .2363321 |
     +---------------------------------------------------+

. sum index perc mean_perc

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       index |         24   -.2288466    .2823488  -.6956265   .3073938
        perc |         24    .2291484    .1788557    .015544   .6528497
   mean_perc |         32    .2291484    .0872065    .125898   .3954145
I know that in R, this can be achieved using this command:
Code:
data <- within(data, imputed <- quantile(index, c(mean_perc), na.rm = TRUE))
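One possible way to get something similar in Stata is sketched below, on made-up data rather than the panel above. It assumes, like the R line, that the quantile is taken over the whole non-missing distribution of index, and that mean_perc is never missing and lies strictly between 0 and 1. The names x and imputed are hypothetical. Stata's default percentile definition is not the same as the default of R's quantile(), so the two need not agree exactly on real data.

```stata
* Minimal sketch on made-up data; x and imputed are hypothetical names.
clear
input region_id year index mean_perc
1 1990 10 .25
1 1991 .  .25
2 1990 20 .5
2 1991 30 .5
3 1990 40 .75
3 1991 50 .75
end

* For each observation, look up the value of index at the percentile
* stored in mean_perc (rescaled from the 0-1 scale to 0-100 for _pctile).
generate double x = .
forvalues i = 1/`=_N' {
    local p = 100 * mean_perc[`i']
    quietly _pctile index, percentiles(`p')
    quietly replace x = r(r1) in `i'
}

* Fill in index only where it is missing; observed values are kept.
generate double imputed = cond(missing(index), x, index)
list region_id year index mean_perc x imputed, noobs sepby(region_id)
```

The loop calls _pctile once per observation, which is simple but slow on a large panel; looping over the distinct values of mean_perc instead would cut the number of calls.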
Thanks!