  • Identifying assumption in fixed effects model

    Hello,

    I am running a model with county fixed effects and time fixed effects using repeated cross-sectional data. My treatment variable has three levels, so I think the model is not exactly a difference-in-differences (DID) model, but let me know if I am wrong.

    I wanted to ask what the identifying assumption is in this model. In DID, we need the common trends assumption, but since I don't have a binary treatment indicator, I was not sure what the common trends assumption would look like. My understanding is that, conditional on county and time fixed effects and the other controls I include, the error term (epsilon_ict, where i = individual, c = county, t = time) is uncorrelated with the treatment variable. Is this correct? If so, is this assumption different from the common trends assumption when I have a three-level treatment variable rather than a binary one?

    To give you a better understanding of my model, here is the code I'm using:

    Code:
    svy: reg outcome i.lag_treatment i.county i.time i.covariates
    The covariates include individual characteristics as well as county-specific linear time trends. The survey set up takes into account the clustering of standard errors.

    I would really appreciate it if you can provide me with some help on this. Thank you in advance!

  • #2
    "My understanding is that conditional on county and time fixed effects and other controls I include, the error term (epsilon_ict, where i = individual, c = county, t = time) is uncorrelated with the treatment variable."
    That is an assumption that is needed to support consistent estimation of the effects in your model. But it is not an "identifying" assumption in the sense of enabling you to consider your model's estimates as being a causal effect.

    The identifying assumption in DID is, as you note, common trends prior to intervention. With a three level treatment variable, the identifying assumption is just the same thing generalized to three groups: the trends prior to intervention must be the same in all levels of the treatment variable.
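
    As a rough sketch, the two conditions discussed above can be written out as follows. The notation is assumed for illustration and does not come from the original post: Y is the outcome, D_{c,t-1} the lagged treatment level of county c, alpha_c and gamma_t the county and time effects, X the controls, and G_c the treatment group a county belongs to. The exogeneity condition is shown in its conditional-mean form, which is one common way to formalize "uncorrelated with the treatment".

    ```latex
    % Regression corresponding to the posted command (treatment levels k = 1, 2)
    Y_{ict} = \sum_{k=1}^{2} \beta_k \, \mathbf{1}[D_{c,t-1} = k]
              + \alpha_c + \gamma_t + X_{ict}'\delta + \varepsilon_{ict}

    % Exogeneity: supports consistent estimation of the \beta_k
    E[\varepsilon_{ict} \mid D_{c,t-1}, \alpha_c, \gamma_t, X_{ict}] = 0

    % Common pre-intervention trends: the period-to-period change in the mean
    % outcome is the same in every treatment group g, for pre-intervention t
    E[Y_{ict} \mid G_c = g] - E[Y_{ic,t-1} \mid G_c = g] = \tau_t
    \quad \text{for all } g \in \{0, 1, 2\}
    ```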

    I cannot tell if the code you show represents a proper model for DID because you do not explain how the variable lag_treatment was calculated. If it only represents which treatment arm the entity was in, then it is an incorrect model. A DID model must have an interaction between treatment group and a time variable that distinguishes pre- and post-intervention periods. If the variable lag_treatment is, itself, such an interaction, then the regression looks OK.
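
    For illustration only, here is a minimal sketch of the interaction structure described above. The variables arm and post are hypothetical and do not appear in the original post, and the sketch assumes the data have already been svyset, as the original command implies.

    ```stata
    * Hypothetical variables, not taken from the original post:
    *   arm  = treatment group of the county (0, 1, or 2), fixed over time
    *   post = 1 in post-intervention periods, 0 in pre-intervention periods
    * i.arm##i.post expands to the arm and post main effects plus their
    * interaction; the 1.arm#1.post and 2.arm#1.post coefficients are the
    * DID estimates for treatments 1 and 2 relative to arm 0.
    svy: regress outcome i.arm##i.post
    ```

    Covariates and fixed effects are left out to keep the structure visible; with county fixed effects added, the arm main effect would be absorbed because arm does not vary within county.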



    • #3
      Hello Dr. Schechter,

      Thanks for your response. My treatment variable is coded as 0 (no treatment), 1 (treatment 1), and 2 (treatment 2) and lag_treatment is itself an interaction of time and treatment group.

      To give you more background on my model: all of my variables, including the outcome, come from one data set that covers three years, while my treatment variable comes from a different data set. For instance, if the outcome is measured in 2003, 2005, and 2007, the treatment is lagged by a year and is thus measured in 2002, 2004, and 2006. Also, counties move between treatments over time. A few possible treatment sequences are 0-1-2, 0-2-1, 0-2-2, 0-1-1, and so on (counties are never treated in 2002). All counties are eventually treated with at least one of treatment 1 or treatment 2.

      You mentioned that the identifying assumption is parallel trends before the intervention. Under this set-up, I'm unsure which groups should show parallel trends, given that counties move between treatments. Should counties that receive treatment 1 (at least once) and counties that receive treatment 2 have parallel trends? I'd appreciate any insights you can provide!



      • #4
        Well, this design makes it complicated. If the scenarios you show are the only possible ones, then you are on pretty thin ice here. You need some scenarios like 0-0-1 and 0-0-2. Then you can look at what happens during the 0-0 times of those who are ultimately 0-0-0, those who are ultimately 0-0-1, and those who are ultimately 0-0-2. Those should show parallel trends in order to support causal inference. If you don't have 0-0-x for all x = 0,1, 2, then you can't really establish parallel trends, and I think that you cannot make any causal claims based on this data.
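
        One way to see which sequences actually occur is to tabulate each county's treatment path. The following is only a sketch under assumed names: county, year (the treatment measurement years 2002, 2004, 2006), and treatment (0, 1, 2) are hypothetical variables, and treatment is assumed to take a single value per county and year.

        ```stata
        * Hypothetical variable names, not taken from the original post:
        *   county, year (2002/2004/2006), treatment (0, 1, 2)
        preserve
        keep county year treatment
        duplicates drop                       // one row per county-year
        reshape wide treatment, i(county) j(year)
        * Build a string such as "0-0-1" describing each county's path
        egen path = concat(treatment2002 treatment2004 treatment2006), punct("-")
        tabulate path                         // look for 0-0-0, 0-0-1, 0-0-2
        restore
        ```

        If a county has conflicting treatment values within a year, reshape will stop with an error, which is itself worth knowing before going further.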



        • #5
          Ah, I see. Thank you for your help. One follow-up question: if the model meets the assumption I mentioned above,

          "My understanding is that conditional on county and time fixed effects and other controls I include, the error term (epsilon_ict, where i = individual, c = county, t = time) is uncorrelated with the treatment variable."

          will the coefficient on the treatment variable at least be consistent?



          • #6
            Yes.
