Hi everyone,
This is my first time posting on Statalist, so I hope my question is clear.
- Here is a short example of my dataset:
- date variable: I don't know why it doesn't appear in %td format here, but the first line below is 26sep2001. date takes a value for every day between 04jan2000 and 01jan2021, for each country.
- protest variable: the number of protests that happened on this date in this country, so it is either 0 or a positive count (which can be larger than 1). Here, on 26 September 2001, 6 protests happened in Afghanistan. A value of 0 means that no protest occurred on this date in this country.
Please note that my dataset is pretty heavy, with 1,510,596 rows.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str24 isoname float(date protest)
"Afghanistan" 15244 6
"Afghanistan" 15245 0
"Afghanistan" 15246 1
"Afghanistan" 15247 0
"Afghanistan" 15248 0
"Afghanistan" 15249 0
"Afghanistan" 15250 0
"Afghanistan" 15251 0
"Afghanistan" 15252 0
"Afghanistan" 15253 0
"Afghanistan" 15254 0
"Afghanistan" 15255 0
"Afghanistan" 15256 0
"Afghanistan" 15257 1
"Afghanistan" 15258 0
"Afghanistan" 15259 1
"Afghanistan" 15260 1
"Afghanistan" 15261 0
end
format %td date
- My question is the following:
Here is how I have coded it so far:
Code:
bys isoname : gen intensity_lag1 = protest[_n-1] + protest[_n-2] + protest[_n-3] + protest[_n-4] + protest[_n-5] + protest[_n-6] + protest[_n-7]
bys isoname : gen intensity_lag2 = protest[_n-8] + protest[_n-9] + protest[_n-10] + protest[_n-11] + protest[_n-12] + protest[_n-13] + protest[_n-14]
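But I would like to know whether there is a way to automate this with a loop or another command. At some point I am going to repeat the same process with month as the reference period (instead of week), and I don't want to hand-code "intensity_lag1" up to _n-28, _n-29, _n-30, _n-31, etc.
Here is what I tried, which didn't work (I also tried the sum() function, without success):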
Code:
// Attempt1:
bys isoname : gen id = _n
bys isoname : egen intensity_lag1v2 = total(protest) if inrange(id, 7-id, id)

//Attempt2:
foreach i of num 1/7 {
    bys isoname : egen intensity_lag1v3 = total(protest[_n-`i'])
}

//Attempt3:
foreach i in 1/7 {
    bys isoname : egen intensity_lag1v4 = total(protest[_n-`i'])
}
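What would make my life much easier is a sum function that works like Sigma notation, where for observation j I could write something like \sum_{i=j-7}^{j-1} protest_i.
If you have any suggestions, I would love to hear them. I have been struggling with this for a few days now!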
Thanks a lot,
Marie
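One possible way to automate the hand-coded sums, offered as a sketch and not a definitive answer: build each variable by adding one lagged term per pass of a forvalues loop. The local macros width and nlags are hypothetical names introduced here for illustration. The sketch assumes one row per country and day with no gaps in date, as described in the post, and it reloads the example data so that it runs on its own.

```stata
* Example data from the post (18 daily rows for one country)
clear
input str24 isoname float(date protest)
"Afghanistan" 15244 6
"Afghanistan" 15245 0
"Afghanistan" 15246 1
"Afghanistan" 15247 0
"Afghanistan" 15248 0
"Afghanistan" 15249 0
"Afghanistan" 15250 0
"Afghanistan" 15251 0
"Afghanistan" 15252 0
"Afghanistan" 15253 0
"Afghanistan" 15254 0
"Afghanistan" 15255 0
"Afghanistan" 15256 0
"Afghanistan" 15257 1
"Afghanistan" 15258 0
"Afghanistan" 15259 1
"Afghanistan" 15260 1
"Afghanistan" 15261 0
end
format %td date

* Hypothetical settings: window width in days, number of lagged windows
local width 7
local nlags 2

* Rows must be in date order within country for the subscripts to mean lags
sort isoname date

forvalues k = 1/`nlags' {
    * Window k covers rows first to last before the current one
    local first = (`k' - 1) * `width' + 1
    local last = `k' * `width'
    generate intensity_lag`k' = 0
    forvalues i = `first'/`last' {
        * A subscript before the first row of a country returns missing,
        * so incomplete windows end up missing, as in the hand-coded sums
        by isoname: replace intensity_lag`k' = intensity_lag`k' + protest[_n-`i']
    }
}

list date protest intensity_lag1 intensity_lag2, sepby(isoname)
```

With width set to 7 this should reproduce intensity_lag1 and intensity_lag2 from the hand-coded version. For a month-like window, width could be set to 30, on the assumption that a fixed-length window is acceptable, since calendar months differ in length. If the dates have gaps, row subscripts no longer correspond to days, and a date-based tool such as the community-contributed rangestat command from SSC may be a better fit.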