Hi all,
I'm trying to estimate a difference-in-differences model on a dataset of 5,690 weekly observations across 109 units. The panel is unbalanced: the units are observed in different years.
The treatment variable (fortpts_eos) is a dummy, but it varies in its timing between different treated units (i.e. takes the value '1' at different times), and also doesn't persist into future observations (i.e. returns to '0').
In total, I have 61 treated units and 48 control units - controls are where fortpts_eos = 0 for all observations.
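To make the treated/control split concrete, here is a minimal sketch of how an ever-treated indicator could be built from fortpts_eos and the unit counts checked. The names treat and unit_tag are hypothetical and introduced only for illustration; they are not variables in the dataset.

```stata
* treat = 1 for every observation of a unit that is ever treated
* (hypothetical variable name)
bysort season_no: egen byte treat = max(fortpts_eos)

* Flag one observation per unit so that units, not weeks, are counted
* (hypothetical variable name)
egen byte unit_tag = tag(season_no)

* On the full data this should show 61 treated and 48 control units
tabulate treat if unit_tag
```

On the small dataex subset posted below, the same check would show two treated and two control units.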
I have tried to do difference-in-differences manually by interacting 'post' and 'treat' dummies, but because treatment timing varies across units, I cannot see how to define a comparison with the control group, as far as I understand it.
Instead, I have tried didregress and xtdidregress. Both produce results, but I am not convinced they are doing what I intended.
didregress seems to just run a regression with group and time fixed effects. Is that right, and is it the correct way to approach DiD with my data?
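A minimal sketch of the comparison behind that question, assuming season_no is the group and date is the time variable: the built-in command next to a two-way fixed-effects regression written out by hand. The choice of xtreg with clustered standard errors is an assumption made for illustration, not something taken from the output described above.

```stata
* Declare the panel identifier (season_no is the unit)
xtset season_no

* Built-in DiD estimator with group and time effects
xtdidregress (gt_web_pl) (fortpts_eos), group(season_no) time(date)

* Hand-written two-way fixed effects: unit effects via -fe-,
* week effects via i.date, errors clustered on the unit
xtreg gt_web_pl fortpts_eos i.date, fe vce(cluster season_no)
```

If the two commands fit the same model, the coefficient on fortpts_eos should match the reported ATET, although the standard errors may differ slightly because of degrees-of-freedom adjustments.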
I have used dataex to post a small subset below, which hopefully works. season_no identifies the units; gt_web_pl is the dependent variable. Observations are weekly. season_no = 1 and 2 are treated units, while season_no = 62 and 63 are examples of control units.
Any help would be very much appreciated. I am relatively inexperienced with DiD, especially with a setup like this one. Thanks!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input int(date season_no) byte(gt_web_pl fortpts_eos)
20505  1  18 0
20512  1  41 0
20519  1  24 0
20526  1  23 0
20533  1  29 0
20540  1  20 0
20547  1  19 0
20554  1  19 1
20561  1  62 1
20568  1  16 1
20575  1  18 1
20582  1  13 1
20589  1 100 1
20596  1  12 0
20603  1  10 0
20610  1   9 0
20617  1  12 0
20624  1  14 0
20631  1  12 0
20855  2  21 0
20862  2  44 0
20869  2  17 0
20876  2  46 0
20883  2  29 0
20890  2  28 0
20897  2  13 0
20904  2  22 0
20911  2  97 0
20918  2  36 0
20925  2  25 0
20932  2  18 0
20939  2  18 1
20946  2  21 1
20953  2  16 1
20960  2  23 1
20967  2  13 0
22073 62   5 0
22080 62  22 0
22087 62  23 0
22094 62  93 0
22101 62  48 0
22108 62  75 0
22115 62  40 0
22122 62  36 0
19070 63  45 0
19077 63  67 0
19084 63  51 0
19091 63  39 0
19098 63  72 0
19105 63  71 0
19112 63  48 0
end
format %tdnn/dd/CCYY date