Please consider the example toy data below. The data is contrived but for our purposes each row represents an individual recently released from jail. The month they are released is the variable "start_modate." All these individuals are re-incarcerated k number of months later (variable "length"). The month they are re-incarcerated is the variable "end_modate." I have monthly data about reported crime in the area where they reside after release. The variables that start with the prefix "v_" measure the crime rate. The suffix for all the "v_" variables denotes the month during which the crime rate was measured.
I wish to calculate a rowwise average for each individual over only the period they were living in the community. In other words, I wish to average the crime rate over only the months each individual is not incarcerated, which are the months between "start_modate" and "end_modate" (inclusive).
I have tried looping over each row and using the -ereplace- package to calculate a conditional average using -rowmean-, but this was terribly inefficient and slow. The real data has over 16 million rows so looping over each row in this fashion was not suitable.
Thank you for your time; I sincerely appreciate your attention and any suggestions you may have.
I wish to calculate a rowwise average for each individual over only the period they were living in the community. In other words, I wish to average the crime rate over only the months each individual is not incarcerated, which are the months between "start_modate" and "end_modate" (inclusive).
I have tried looping over each row and using the -ereplace- package to calculate a conditional average using -rowmean-, but this was terribly inefficient and slow. The real data has over 16 million rows so looping over each row in this fashion was not suitable.
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str7(start_modate end_modate) byte duration float(v_690 v_691 v_692 v_693 v_694 v_695 v_696 v_697 v_698 v_699 v_700) "2019m6" "2019m11" 5 72.499 82.165 80.554 73.304 53.971 53.971 69.784 74.71 57.469 59.111 64.037 "2017m10" "2018m4" 6 12.356 12.846 17.923 15.048 13.824 15.109 12.906 16.549 15.87 16.364 13.955 "2018m6" "2019m1" 7 26.306 32.305 35.997 31.151 23.537 20.998 19.711 22.56 23.273 23.035 23.985 "2018m11" "2019m7" 8 8.841 7.073 10.609 17.682 14.145 7.073 13.076 14.944 14.944 9.34 14.944 "2017m9" "2018m6" 9 12.356 12.846 17.923 15.048 13.824 15.109 12.906 16.549 15.87 16.364 13.955 "2019m2" "2019m12" 10 21.182 21.182 17.331 19.257 15.79 13.48 7.232 5.53 8.083 5.956 5.956 end
Comment