  • xthdidregress: dealing with person-year data and midyear treatments

    Good afternoon everyone,
    I'm looking to run a DiD model with staggered treatment timing on panel data, and have been planning on using xthdidregress. The dataset is individual level survey data, and I'll be clustering at the state level for causal inference on a state level policy. Example data structure is below:

    Code:
    personid  year  state  survey_month  ever_treated  date_treatment  treatment_exposed  outcome
    1         2009  AL     05/09         0             .               0                  0
    1         2010  AL     06/10         0             .               0                  0
    2         2009  OR     04/09         1             05/10           0                  0
    2         2010  OR     06/10         1             05/10           1                  1
    2         2010  OR     06/10         1             05/10           1                  1

    Where treatment_exposed=1 if ever_treated=1 & survey_month>date_treatment (which I think is where the problem is)
    I have xtset by PERSONID and year successfully

    My simplified code for the DD model is as follows:
    xthdidregress ra (outcome [covariates]) (treatment_exposed), group(state) vce(cluster PERSONID) controlgroup(never)

    I get an error message stating "treatment variable varies at the state year level" and that the model cannot be run. I assume this is because I am specifying treatment exposure precisely (by month) but have xtset by year, so within a single state-year I get some observations where an individual is exposed to treatment (ex. in Oregon 2010 after May 1) and some where they are not exposed (ex. Oregon 2010 Jan-April).

    For folks more experienced in DiD models, how would you recommend dealing with this problem? I suppose I could coarsen the treat variable (ex. code everyone in the state as exposed if they're surveyed in the same year the treatment happens), although it would be unfortunate to intentionally miscategorize some people as treated if their data comes from a few months before the treatment occurs.

    Many thanks!
    Andy
    Last edited by Andy Hyatt; 07 Jun 2024, 15:15.

  • #2
    Andy: Why aren't you using xtset PERSONID month? If you want to exploit the full power of the staggered intervention, you should set your data at the time series frequency, which is monthly. Now, you'll have to define a unique month variable, which you could do by concatenating to create yearmonth.
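
A minimal sketch of the monthly setup suggested here, under stated assumptions: survey_month and date_treatment are taken to be string variables in "MM/YY" form with years in the 2000s, as in the example data, and the names survey_ym, treat_ym, and treated_m are hypothetical.

```stata
* Assumption: survey_month and date_treatment are strings such as "05/09" (MM/YY, 2000s)
* Build Stata monthly dates (%tm) from the two-digit month and year
gen survey_ym = ym(2000 + real(substr(survey_month, 4, 2)), real(substr(survey_month, 1, 2)))
gen treat_ym  = ym(2000 + real(substr(date_treatment, 4, 2)), real(substr(date_treatment, 1, 2)))
format survey_ym treat_ym %tm

* Declare the panel at monthly frequency; gaps between survey waves are allowed
xtset PERSONID survey_ym

* Exposure defined as in the original post: surveyed after the treatment month
* (treat_ym is missing for never-treated states, so the indicator is 0 there)
gen byte treated_m = ever_treated == 1 & survey_ym > treat_ym
```

Because exposure is now defined on the same monthly clock as the panel, it should no longer vary within a state and time period, so the original xthdidregress call can be retried with treated_m in place of treatment_exposed.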


    • #3
      Hi Jeff, thanks for the quick reply! This was something I had considered to solve another issue with the dataset, but I had hesitated because I worried it might affect the interpretation of the ATET (ex. giving a treatment effect on a monthly level, as I expect effects to be dynamic and likely to increase over time). I suppose this would be fairly easy to correct on the back end, so I'll try to change the way I'm coding for time and report back. Thanks for the suggestion!


      • #4
        Let me also comment that Fernando Rios-Avila's jwdid command has the option to aggregate the effects nicely, so you wouldn't have so many estimates to study.
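
A rough sketch of what that could look like, not a tested specification. It assumes jwdid is installed from SSC along with its dependencies, that survey_ym and treat_ym are hypothetical %tm variables holding the survey month and the state's treatment month (missing for never-treated states), and that state_id is a hypothetical numeric version of state. The options shown are as I recall them from the jwdid help file, so check them there before relying on this.

```stata
* Assumption: community-contributed command, installed once with
*   ssc install jwdid
* Assumption: survey_ym, treat_ym (%tm) and numeric state_id already exist

* jwdid expects a cohort variable: first treated period, 0 if never treated
gen gvar = cond(missing(treat_ym), 0, treat_ym)

* Monthly cohort-by-period effects, clustered at the state level
jwdid outcome, ivar(PERSONID) tvar(survey_ym) gvar(gvar) never cluster(state_id)

* Aggregate the many cohort-by-period estimates
estat simple    // one overall ATT
estat event     // effects by time since treatment
```

The aggregated event-time estimates are one way to see whether the effect grows with exposure without reading every monthly coefficient.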
