I want to study causally how the lagged number of implementations (x1) affects performance (y1) in a panel dataset. The effect should appear with some delay after implementation, so I plan to use a 5-year lag. Since almost all units in the panel adopt the implementation, I assume a standard DID analysis cannot be applied, because I do not have typical control units (never-takers). I am aware of a newer method, staggered DID, which is designed for implementation in different years and can be used when there are no never-takers (the control units are units that get treated later, observed in their pre-treatment period).
To define the treatment, I compute the growth rate of the number of implementations and consider a unit treated in a given year if its growth rate in that year is above the 75th percentile of its own growth rates over the whole period. (If you believe there is a better way to define the treatment, please tell me. This may be a strong way of defining it, and using one threshold for all units would give some never-taker units, which could be better.) As a result, all units eventually get treated, and a unit can be treated in several different years (see the "shock" variable in the code below). Because of this, I define the start of treatment as the first year in which a unit is treated.
My main question is how to perform this staggered DID in Stata 16, and which tests I should run to be confident in the estimation. Standard DID relies on the parallel trends assumption, but I assume that in my case, with no typical control units (never-takers), this assumption is useless, right? Is there any other test I should run for this staggered DID?
Many thanks for your help. Below I include the treatment definition (and some possible control variables, x2-x6) and a data example.
PS. I thought of using a two-way fixed effects estimation (unit and year) on the raw x1 variable (number of implementations), but I am afraid someone could say that this is a correlational analysis, not a causal one.
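For concreteness, the two-way fixed effects alternative mentioned in the PS could look roughly like the sketch below. It is only an illustration: entering x2-x6 contemporaneously and clustering the standard errors by id are assumptions, not settled choices.

```stata
* Sketch only: two-way fixed effects with the 5-year lag of the raw x1
xtset id year                          // declare the panel so L5. works
xtreg y1 L5.x1 x2 x3 x4 x5 x6 i.year, fe vce(cluster id)
* fe absorbs the unit effects, i.year adds the year effects
```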
Code:
xtset id year // declare the panel so the L. operator works
bys id (year): gen gr_x1 = (x1 - L.x1)/L.x1 // growth rate
egen p75 = pctile(gr_x1), by(id) p(75) // 75th percentile for the shock
gen shock = (gr_x1 > p75) & !missing(gr_x1, p75) // the shock (treatment)
bysort id (year): egen first_shock_year = min(year) if shock == 1
bys id (first_shock_year): replace first_shock_year = first_shock_year[1]
sort id year
gen pre_treatment = (year < first_shock_year) & !missing(first_shock_year) // pre-treatment indicator
gen event_time = year - first_shock_year // pre- and post-treatment periods
br id year gr_x1 p75 shock first_shock_year pre_treatment event_time
tab event_time, gen(event)
* Not totally sure about the following for capturing what I want
xtreg y1 L5.event1 L5.event2 L5.event3 L5.event4 L5.event5 L5.event6 ///
    L5.event7 L5.event8 L5.event9 L5.event10 L5.event11 L5.event12 ///
    L5.event13 L5.event14 L5.event15 L5.event16 L5.event17 L5.event18 ///
    L5.event19 L5.event20 L5.event21 L5.event22 L5.event23 L5.event24, fe robust
test (L5.event1 L5.event2 L5.event3 L5.event4 L5.event5 L5.event6 L5.event7 L5.event8 L5.event9)
test (L5.event10 L5.event11 L5.event12 L5.event13 L5.event14 L5.event15 L5.event16 L5.event17 L5.event18 L5.event19)
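The pooled-threshold alternative raised in the treatment definition above (one cutoff for all units instead of a unit-specific one) could be sketched as follows. The variable names gr_x1_alt, p75_all, shock_all, first_shock_all and never_taker are hypothetical, chosen so that nothing created by the code above is overwritten. Whether any never-takers actually emerge depends on the data.

```stata
* Sketch only: a single pooled 75th-percentile cutoff across all units and years
xtset id year
gen gr_x1_alt = (x1 - L.x1)/L.x1                      // growth rate of x1
egen p75_all = pctile(gr_x1_alt), p(75)               // one cutoff for the whole sample
gen shock_all = (gr_x1_alt > p75_all) & !missing(gr_x1_alt)
* first shock year per unit; stays missing if the unit never crosses the cutoff
bysort id: egen first_shock_all = min(cond(shock_all == 1, year, .))
gen never_taker = missing(first_shock_all)            // 1 = never crosses the pooled cutoff
tab never_taker
```

Units with never_taker equal to 1 would be the candidate never-treated controls under this definition.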
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float id int year float(y1 x1 x2 x3 x4 x5 x6)
1 2006 5 70 17.8 1.5 25451.15 279590 6
1 2007 5 97 15.9 . 26226.57 280521 4.9
1 2008 8 120 14.3 1.4 26298.87 281890 3.7
1 2009 9 115 12.1 . 25778.82 283210 2.8
1 2010 9 117 11.8 1.5 26232.1 284140 3.6
1 2011 7 152 12.8 2.4 26925.7 285182 4.3
1 2012 9 163 13.6 2.7 27304 286237 4.7
1 2013 11 125 13.9 3 27618.34 287055 4.2
1 2014 12 147 13.3 2.8 27650.34 287886 3.8
1 2015 7 153 13.8 2.9 27756.28 289684 4.1
1 2016 8 159 14.4 2.8 27999.32 291477 4.3
1 2017 4 158 12.8 2.6 28882.27 292310 4.2
1 2018 9 194 14.3 4.3 29055.78 293055 4.6
1 2019 10 176 13.6 4.4 29214.89 293936 4.9
1 2020 13 237 12.4 4.9 27289.89 295224 4.1
1 2021 9 237 12.6 5.3 27714.08 296798 4.4
1 2022 6 234 . . 28706.85 299418 .
9 2006 8 34 28 2 40744.37 363450 8.1
9 2007 20 70 25.7 2 42167.5 364924 7.5
9 2008 8 46 24.8 1.9 42611.79 366318 6.3
9 2009 21 126 25.4 2.2 40411.15 367712 6.8
9 2010 26 89 24.6 2.1 41212.26 368834 6.8
9 2011 33 101 26 2.6 42728.87 370100 7.4
9 2012 30 107 25.2 3.3 42766.69 371765 7
9 2013 26 56 25 3.6 43201.16 373943 7.5
9 2014 39 74 25.7 3.6 44120.01 376970 8
9 2015 39 99 26.5 3.7 44683.38 381370 8.8
9 2016 46 96 26.5 3.9 44448.77 386450 8.2
9 2017 32 94 26.2 4 45282.64 390247 8.3
9 2018 35 107 26.6 5.6 45482.29 393030 8
9 2019 35 135 26.4 5.6 45991.43 395719 7.6
9 2020 35 141 25.8 6.1 43010.1 398188 7.7
9 2021 36 102 28 6.7 44509.65 400470 9
9 2022 44 187 . . 45846.17 404035 .
18 2006 24 82 13.8 4.5 25617.93 1043900 3.8
18 2007 18 157 14.1 4.5 26282.68 1050569 4
18 2008 22 198 14.6 4.3 26345.05 1056880 3.6
18 2009 31 314 13.7 4.4 25682.22 1063850 4.1
18 2010 26 289 12.3 5 25930.05 1074912 3.2
18 2011 22 317 12.6 3.9 26408.26 1085070 3.8
18 2012 33 371 11.3 4.5 26380.38 1090610 3.5
18 2013 23 376 11.1 4.5 26235.4 1094428 3.4
18 2014 29 379 11.5 5.1 26114.16 1097290 3.5
18 2015 28 349 12 4.8 26311.34 1101180 3.4
18 2016 33 369 10.8 5.6 26435.88 1104760 3.2
18 2017 40 330 10.7 5.8 26875.31 1107180 2.4
18 2018 53 366 12.5 6.3 27218.43 1109190 4.3
18 2019 43 424 11.7 6.5 27540.48 1112000 4.6
18 2020 40 390 12.6 6.3 25756.9 1113635 4.5
18 2021 30 421 11.1 6 27733.52 1114707 3.3
18 2022 19 388 . . 27959.22 1119180 .
19 2006 8 12 11.1 4.9 23702.35 259900 2.5
19 2007 9 11 10.6 4.1 23825.54 262630 2.1
19 2008 6 26 10.9 4.5 23346.88 265520 2.3
19 2009 10 23 9.1 4.4 22721.22 267990 .
19 2010 6 27 11.2 4.9 23056.01 271115 1.9
19 2011 6 29 11 4.3 23267.58 274490 .
19 2012 4 47 9.8 4.6 22801.12 276700 2.3
19 2013 7 31 9.8 5.1 22672.03 277990 1.9
19 2014 10 24 9.6 4.6 22689.66 279200 2.1
19 2015 8 29 10 5.2 22962.53 281800 2
19 2016 5 32 10.7 5.1 22930.24 283890 2
19 2017 2 23 11 6.3 23312.2 284790 3.3
19 2018 9 38 9.4 5.5 23524.15 285800 2
19 2019 8 37 9.3 5.4 23763.1 288100 .
19 2020 5 29 9.5 5.7 22459.07 290710 2.2
19 2021 6 38 10.4 6.8 23755.92 293090 2.9
19 2022 11 21 . . 23610.62 295990 .
20 2006 1 45 11.5 4.9 24578.65 460300 3.3
20 2007 8 41 10.8 4.8 24826.82 463650 4
20 2008 5 81 10.6 4.1 25203.84 466994 3.6
20 2009 8 98 9.2 4.8 24625.63 470444 3.1
20 2010 5 71 9.4 4.6 25228.24 475834 3.2
20 2011 8 101 10 3.7 25311.65 481090 3.5
20 2012 7 105 8.5 4 25093.36 483800 2.9
20 2013 16 89 8.1 4 24823.52 485580 3.5
20 2014 17 96 8.7 3.9 25049.62 487600 3.3
20 2015 9 101 9.3 3.8 25153.56 490490 3.6
20 2016 13 80 9.4 4.7 25446.8 493080 3.7
20 2017 12 68 8.2 5.1 25665.7 494643 3.5
20 2018 12 102 8.9 6.3 25993.47 496000 3.1
20 2019 14 117 8.8 5.9 26175.86 498200 3.4
20 2020 8 107 7.6 6.3 24872.04 500010 2.8
20 2021 12 111 8.1 6.9 26201.57 501949 3.7
20 2022 8 101 . . 26679 504990 .
21 2006 0 0 25.5 1.4 3688.1 925533 4.3
21 2007 0 36 26.9 1.4 3806.31 907340 5.1
21 2008 0 72 27 1.3 3960.42 889973 4.7
21 2009 0 87 26 1.1 3773.6 873454 3.7
21 2010 0 76 23.4 1.1 3723.22 856236 2.5
21 2011 0 103 23.2 2 3771.95 841871 3
21 2012 0 115 24.3 2 3770.9 826741 3.8
21 2013 0 161 24.4 2 3807.34 809111 3.8
21 2014 0 132 24.9 2.5 4004.04 793267 3.3
21 2015 2 170 23.4 2.6 3965.7 777079 3.2
21 2016 1 214 23.9 2.3 4135.34 761015 4.4
21 2017 1 432 23.6 2.2 4405.68 744925 4.4
21 2018 3 342 22.7 2.6 4938.04 728773 4.2
21 2019 2 385 22.4 2.5 4864.97 712665 4.9
21 2020 0 369 21.7 2.5 4958.53 700453 4.8
end