Hello Statalist,
I am using the Callaway & Sant’Anna (2021) Difference-in-Differences (DiD) estimator in Stata (csdid) to analyze the impact of a policy change. However, I am running into an issue due to the structure of my data and treatment assignment.
My Data Structure:
Unit of Observation: Individual-level (households).
Treatment Assignment: Treatment is assigned at the province level.
Time Variable: year_month.
Treatment Timing Variable (gvar): The year_month when a province first received treatment (all individuals in a province share the same treatment timing).
When I run the following command:
csdid y, id(province) time(year_month) gvar(first_treat) method(dripw) notyet
I get the error (duplicate time and gvar values).
I believe this happens because many households in the same province share the same gvar (treatment timing) and time (survey wave) values, so province does not uniquely identify an observation within a time period. Since csdid estimates group-time average treatment effects (GATTs), I am wondering:
- Do I need to collapse my data to the province-time level? If so, will this affect comparisons with a standard TWFE DiD model, which I am also running?
- Can csdid handle individual-level data when treatment happens at the province level? If so, how should I define gvar to avoid issues?
egen time_treated = csgvar(treatment), tvar(year_month) ivar(province_id)
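For concreteness, the two options asked about above could be sketched roughly as follows. This is an untested sketch under several assumptions, not a confirmed solution: province is numeric, year_month is a numeric monthly date, first_treat is 0 for never-treated provinces, and the panel identifier option is written as ivar(), which is how I understand the csdid help file names it. My reading of the help file is also that omitting ivar() makes csdid treat the data as repeated cross-sections, and that cluster() sets the clustering level.

```stata
* Option A (assumption): collapse households to one row per province-month,
* so that province uniquely identifies a unit within each period.
* Means here are unweighted across households.
preserve
collapse (mean) y (first) first_treat, by(province year_month)
csdid y, ivar(province) time(year_month) gvar(first_treat) method(dripw) notyet
restore

* Option B (assumption): keep the household-level rows, omit ivar() so the
* data are treated as repeated cross-sections, and cluster standard errors
* at the level where treatment is assigned.
csdid y, time(year_month) gvar(first_treat) method(dripw) notyet cluster(province)
```

Option A changes the estimation sample from households to province-month cells, so its estimates would not be directly comparable to a household-level TWFE regression unless that model is also run on the collapsed data.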