
  • CSDID with Individual-Level Data - Treatment is at the Group Level

    Hello Statalist,

    I am using the Callaway & Sant’Anna (2021) Difference-in-Differences (DiD) estimator in Stata (csdid) to analyze the impact of a policy change. However, I am running into an issue due to the structure of my data and treatment assignment.

    My Data Structure:
    Unit of Observation: Individual-level (households).
    Treatment Assignment: Treatment is assigned at the province level.
    Time Variable: year_month.
    Treatment Timing Variable (gvar): The year_month when a province first received treatment (all individuals in a province share the same treatment timing).

    When I run the following command:
    csdid y, id(province) time(year_month) gvar(first_treat) method(dripw) notyet

    I get the error (duplicate time and gvar values).
    This is because of the duplicate values of gvar (treatment timing) and time (survey wave) across multiple individuals in the same province. Since csdid estimates group-time average treatment effects (GATTs), I am wondering:
    1. Do I need to collapse my data to the province-time level? If so, will this affect comparisons with a standard TWFE DiD model, which I am also running?
    2. Can csdid handle individual-level data when treatment happens at the province level? If so, how should I define gvar to avoid issues?
    Will something like this work?
    egen time_treated = csgvar(treatment), tvar(year_month) ivar(province_id)

  • #2
    I have a very similar challenge and was hoping FernandoRios could provide some insight. I have been browsing Statalist for a while and could not find a clear answer to this yet.



    • #3
      Well, the problem here is that the data are not a panel but repeated cross-sections.
      When you add ivar(), you are telling csdid that your data are a panel. The command tries to verify that and returns the reported error.
      First alternative: treat the data as repeated cross-sections (omit ivar()). In this case only group fixed effects are considered, and adding province fixed effects would make little to no sense.
      Second alternative: use jwdid with the fevar() option.
      HTH
      F
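
      A minimal sketch of the first alternative, using the variable names from the original post (y, province, year_month, first_treat). It has not been run on the poster's data. It assumes year_month is numeric, first_treat is 0 for never-treated provinces, and that clustering at the province level is wanted because treatment is assigned there:

      ```stata
      * Sketch only: variable names are taken from the question above.
      * Check whether province x year_month uniquely identifies rows.
      * If it does not, province cannot serve as a panel id in ivar().
      duplicates report province year_month

      * Repeated cross-sections: omit ivar() and cluster at the treatment level.
      csdid y, time(year_month) gvar(first_treat) method(dripw) notyet cluster(province)

      * Aggregate the group-time effects into an event study.
      estat event
      ```

      If the duplicates report shows more than one observation per province-month, the data are household-level cross-sections within provinces, which matches the diagnosis above.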



      • #4
        Hi FernandoRios, thank you so much for your response, and sorry for the late reply. I fixed the problem and could run the command. I have one more problem: I also have state-level controls like GDP and the Housing Price Index, which are the same for every individual in a state at a given point in time, so I run into multicollinearity (an error in DRDID). But I need these variables as controls.
        The VIFs are all under 5, and the correlation between the macro variables is 0.77.

        Since GDP is only available at a yearly frequency, I use a GDP*time_trend variable. The Housing price index is available at a monthly frequency. I tried detrending the variables, but it didn't work. What can I do to address this?

        Can you provide some insights?



        • #5
          You probably can't.
          If drdid did not work for a single case, you can try running each model by hand:
          basically, regress y on the x's separately for the pre-treatment treated, pre-treatment control, post-treatment treated, and post-treatment control samples.
          If variables drop out there, you can't add them to the main csdid.
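
          A rough sketch of that by-hand check for one group-time cell. The cohort month (2015m6) and the covariate names gdp_trend and hpi are hypothetical placeholders. It also assumes year_month is a %tm monthly date and that first_treat is 0 for never-treated units:

          ```stata
          * Hypothetical cohort and periods; replace with values from your data.
          local g    = tm(2015m6)   // month the cohort is first treated (assumed)
          local pre  = `g' - 1      // last pre-treatment month
          local post = `g'          // first post-treatment month

          * Treated cohort, and controls that are never or not yet treated at `post'.
          generate byte treat_g = (first_treat == `g')
          generate byte ctrl_g  = (first_treat == 0 | first_treat > `post')

          * One outcome regression per cell: treated/control x pre/post.
          foreach d in treat_g ctrl_g {
              foreach t in `pre' `post' {
                  display as text "cell: `d', period " %tm `t'
                  regress y gdp_trend hpi if `d' == 1 & year_month == `t'
              }
          }

          * dripw also relies on a propensity score, so a logit on the pooled cell is worth checking too.
          logit treat_g gdp_trend hpi if (treat_g | ctrl_g) & inlist(year_month, `pre', `post')
          ```

          Any covariate that Stata reports as omitted because of collinearity in one of these regressions is a candidate for the variable that drdid cannot handle in that cell.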



          • #6
            Hi Fernando,
            Thank you so much for your insight. I won't be able to include the variables because there is no within-group variation. Every individual in a state would have the same value of the state-level covariate. However, a TWFE regression allows for that. So, how will I be able to justify dropping out macro covariates in my main specification? I haven't found any leads.



            • #7
              Well, TWFE allowing for that is just an illusion or a misspecification.
              But it is hard to say, as it is a case-by-case issue.
