Hi there,
I have a time series dataset with an ID variable and a date-of-death variable, along with latitude and longitude. I would like to expand the dataset to include control days (no death) that fall on the same day of the week within the same month of the same year, so each individual serves as their own control. There should be three to four control days, depending on the length of the month.
Here is an example of the dataset, which has been altered to make sure it is anonymised. There are about 1,100 observations.
Code:
input long n_eid double death_date float(lat lon death)
425 22478 56.35 -3.2 1
387 22313 53.75 -0.85 1
218 22223 52.95 -6.6 1
471 22146 57.15 -3.25 1
583 22131 54.15 -.85 1
455 22361 54.25 -2.1 1
end
I calculated a stratum variable for each death date:
Code:
* CREATE YEAR X MONTH X DOW STRATUM VARIABLE
gen month=month(death_date)
gen year=year(death_date)
gen dow=dow(death_date)
egen stratum_YMD=group(year month dow)
This is roughly what I would like the expanded dataset to look like:
Code:
input long n_eid double death_date float(lat lon death)
425 22478 56.35 -3.2 1
425 22471 56.35 -3.2 0
425 22485 56.35 -3.2 0
425 22493 56.35 -3.2 0
387 22313 53.75 -0.85 1
387 22320 53.75 -0.85 0
387 22327 53.75 -0.85 0
387 22334 53.75 -0.85 0
218 22223 52.95 -6.6 1
218 22230 52.95 -6.6 0
218 22237 52.95 -6.6 0
218 22244 52.95 -6.6 0
218 22251 52.95 -6.6 0
471 22146 57.15 -3.25 1
471 22146 57.15 -3.25 0
471 22146 57.15 -3.25 0
471 22146 57.15 -3.25 0
. . .
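One possible way to build the control days is sketched here, not as a definitive solution. It assumes exactly one death record per n_eid. Each case is copied nine times, the copies are shifted in 7-day steps from four weeks before to four weeks after the death, and only the dates falling in the same calendar month are kept. Because every shift is a multiple of seven days, the weekday matches automatically. The variable name `date` is introduced here for illustration and is not part of the original data.

```stata
clear
input long n_eid double death_date float(lat lon death)
425 22478 56.35 -3.2 1
387 22313 53.75 -0.85 1
218 22223 52.95 -6.6 1
471 22146 57.15 -3.25 1
583 22131 54.15 -.85 1
455 22361 54.25 -2.1 1
end
format death_date %td

* Assumption: one row (one death) per person
isid n_eid

* Nine copies per case: offsets of -28, -21, ..., 0, ..., +21, +28 days
expand 9
bysort n_eid: gen double date = death_date + 7*(_n - 5)
format date %td

* Keep only the candidate days in the same month and year as the death
keep if mofd(date) == mofd(death_date)

* Outcome is 1 on the death date and 0 on the control days
replace death = (date == death_date)

sort n_eid date
list n_eid date death, sepby(n_eid)
```

A range of plus or minus 28 days is enough because two days in the same month are at most 30 days apart. The month, year and weekday strata can then be derived from `date` in the same way as from death_date.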
Then I would merge my environmental exposure data into the dataset by lat, lon and date, and run a conditional logistic regression for a time-stratified case-crossover analysis. But first I need to expand the dataset to create the control days for each case. Any help would be appreciated.
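For context, the merge and model steps could look roughly like the sketch below. It uses a few rows from the desired layout above, with the date variable renamed to `date`. The exposure file and the variable `temp` are hypothetical placeholders filled with made-up values, only so that the example runs; the real exposure data would take their place.

```stata
clear
input long n_eid double date float(lat lon death)
387 22313 53.75 -0.85 1
387 22320 53.75 -0.85 0
387 22327 53.75 -0.85 0
387 22334 53.75 -0.85 0
218 22223 52.95 -6.6 1
218 22230 52.95 -6.6 0
218 22237 52.95 -6.6 0
218 22244 52.95 -6.6 0
end
format date %td

* Hypothetical exposure file: one row per location and day.
* temp holds placeholder values, not real exposure data.
preserve
keep lat lon date
duplicates drop
gen double temp = 10 + mod(date, 5)
tempfile exposure
save `exposure'
restore

* Attach the exposure to every case and control day
merge m:1 lat lon date using `exposure', keep(master match) nogenerate

* Conditional logistic regression with one stratum per person
clogit death temp, group(n_eid) or
```

With a single death per person, grouping on n_eid gives the same strata as grouping on person by year, month and day of week.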