  • Difference in difference analysis with more than two time periods.

    I have a panel dataset covering 15 years and 6721 villages. The years run from 2000 to 2014. I am trying to do a difference-in-differences analysis to evaluate the impact of my treatment variable (treatment) on the outcome variable (y).
    The treatment started in different years for different villages. Some started in 2009, some in 2010 and others in 2011. I have defined my treatment group as those villages that received the treatment in 2009 or 2010. The rest of the villages are control. (I have already tested the parallel trends assumption)

    Since there are multiple pre treatment and post treatment time periods, I don't know how to define my post treatment indicator variable.
    This is what I have used so far:

    Code:
    gen post = 1 if time>=2009
    replace post=0 if time<2009
    gen postXtreatment = post*treatment
    reg y post treatment postXtreatment, r
    Is this specification correct?
    Is there an alternative way to do this, that would incorporate the fact that there are multiple time periods? Perhaps by creating a continuous variable for time that takes values from 1 to 15 for the different years, and then taking the interaction?

    Also, is there a way to capture the fact that different villages received treatment for different numbers of years? That is, the ones that were treated in 2009 received treatment for 5 years, but the ones that were treated in 2010 received treatment for 4 years.

    I would be grateful for any help!

    Kuhu

  • #2
    Kuhu:
    Welcome to the list.
    As far as I understand your problem, I would suggest:
    Code:
    gen post = 1 if time >= 2009 & time != . // otherwise, if time has missing values, post would be set to 1 for them (Stata treats -.- as larger than any number)
    replace post = 0 if time < 2009
    xtreg y i.post##i.treatment, re // no need to create the interaction by hand: -fvvarlist- notation does it for you and also includes the main conditional effects of both predictors
    I also assume that you meant the -re- specification in your previous code (although it looks like an OLS rather than a panel data regression).
    Kind regards,
    Carlo
    (StataNow 18.5)
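
    A practical note on the code above: -xtreg- needs the panel structure to be declared first. A minimal sketch, assuming the village identifier is a numeric variable called village (a hypothetical name; the thread does not give it):

    Code:
    xtset village time                               // village is an assumed panel identifier; time is the year variable
    gen byte post = (time >= 2009) if !missing(time) // 1 from 2009 on, 0 before, missing where time is missing
    xtreg y i.post##i.treatment, re

    The one-line -gen- is just a compact alternative to the gen/replace pair; both leave post missing where time is missing.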


    • #3
      Thanks Carlo!
      xtreg is definitely what I should use instead of OLS.


      • #4
        In terms of accounting for the multiple years, if you have reason to believe that there can be time trends operating, then they should be taken into account. One way to do that is with a linear spline with a knot at the year that the treated group begins treatment. So something like this:

        Code:
        mkspline pre_treatment 2009 post_treatment = year
        xtreg y i.treatment##(c.pre_treatment c.post_treatment), fe // OR re IF YOU PREFER
        margins treatment, dydx(pre_treatment post_treatment)
        In this model, what you are estimating are the slopes of y vs. year in each group during the pre-treatment and post-treatment eras. If the two groups are parallel before treatment, then their pre-treatment slopes will be approximately the same and the coefficient of 1.treatment#pre_treatment will be approximately 0. If they diverge after the start of treatment, then the 1.treatment#post_treatment coefficient will be large, and the two post_treatment slopes will differ appreciably.

        If your time frame also includes years where there are specific shocks (e.g. perhaps a financial crisis in 2008-2010) you can also add specific indicator variables for those years to adjust for that.
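
        A sketch of that shock adjustment, assuming purely for illustration that the shock years are 2008-2010 and that the spline variables from the code above already exist. The variable name crisis is made up, and a single indicator for the whole period is a simplification (separate indicators per year would also work):

        Code:
        gen byte crisis = inrange(year, 2008, 2010) if !missing(year) // 1 in the assumed shock years, 0 otherwise
        xtreg y i.treatment##(c.pre_treatment c.post_treatment) i.crisis, fe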

        Note also that it is not necessary in Stata to calculate interaction terms by multiplying variables. Factor-variable notation (the # and ## operators) does that for you automatically, and enables you to use the -margins- command afterwards to simplify interpretation. See -help fvvarlist-.


        • #5
          Another suggestion is to look at the following paper

          Minimum Wage Effects across State Borders: Estimates using Contiguous Counties; Dube, Lester and Reich, 2010, Review of Economics and Statistics
          http://www.mitpressjournals.org/doi/...9#.V4jgrzVqv9A

          It uses a panel-like approach to pool the effects of many treatments at different moments in time. If you have many treatments, this might be a more interesting framework than standard diff-in-diff.


          • #6
            Hi everyone,

            Thank you for your help.

            I am still struggling with how to account for the fact that different villages received treatment for different numbers of years. That is, the ones that were treated in 2009 received treatment for 5 years, but the ones that were treated in 2010 received treatment for 4 years.
            I want to be able to consider all these villages as treatment groups, but with varying treatment intensities. Essentially, I want to see if the villages that started treatment earlier, perform better than the ones that started treatment later.

            Any insights on how to do this will be greatly appreciated.

            Best,
            Kuhu


            • #7
              P.S. To be even more specific, I want to have a continuous treatment variable, taking values from 0 to 5, that denotes the number of years an individual received treatment.


              • #8
                When I responded earlier, I didn't focus on the fact that you have different people starting treatment at different time periods. For some reason, I misunderstood and thought that treatment began in 2009 for everybody.

                What you have is a difficult situation. This is not a difference-in-differences design and you may not be able to do a DID analysis for it. The problem is that your pre-post variable must also be specified for the control group, who do not receive the treatment. When everybody in treatment starts treatment in 2009, that is easy. But if different people start at different times, then what defines the pre-post variable for the control group? Sometimes there is something in the context of the study that enables you to identify a time point when a control individual "would have started treatment if he/she were part of the treatment group." If you can identify such a time, then the pre-post variable has to be calculated so that it distinguishes before and after that time. If there is no way to do that, then you might be able to salvage it by creating matched pairs of treatment and control individuals and setting the pre-post variable to be the same in both members of the pair. The feasibility of this depends on your ability to come up with a credible matching scheme, and the plausibility of the notion that the matched control "would have started treatment at the same time as the treated member of the pair if he/she were in the treatment group." A third approach, weaker than the other two, is to randomly assign controls to "would have started treatment" dates in the same proportions as the starting dates observed in the actual treatment group. (This last approach is a bit like stratifying the study based on start-date of the treatment group, each stratum then getting corresponding controls, and then pooling the data together.)
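
                A rough sketch of that third approach, under several assumptions that are not in the thread: village is the panel identifier, start_year holds each treated village's first treatment year (2009 or 2010), and treatment is the 0/1 group indicator. All new variable names are made up for illustration:

                Code:
                egen byte first_obs = tag(village)                             // marks one row per village
                gen byte started_2009 = (start_year == 2009) if treatment == 1
                summarize started_2009 if first_obs, meanonly
                local p2009 = r(mean)                                          // share of treated villages starting in 2009
                set seed 12345                                                 // arbitrary seed, for reproducibility only
                gen double u = runiform() if first_obs & treatment == 0        // one random draw per control village
                bysort village (u): replace u = u[1]                           // copy the draw to all rows of the village
                gen int pseudo_start = cond(treatment == 1, start_year, cond(u < `p2009', 2009, 2010))
                gen byte post = (year >= pseudo_start) if !missing(year)
                xtreg y i.post##i.treatment, fe

                Because the pseudo start dates are random, the estimates will vary somewhat with the seed, so it would be sensible to check that the conclusions hold across several seeds.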

                As for wanting your treatment to be a continuous variable, in effect a "dose" variable, you don't have to change much. Just use c.treatment instead of i.treatment in the modeling code.
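
                A sketch of such a dose variable, assuming a hypothetical variable start_year that holds each village's first treatment year and is missing for never-treated villages. Years of treatment are counted here as 2014 minus the start year, so that 2009 starters get 5 and 2010 starters get 4, as in the question; that counting rule is an assumption:

                Code:
                gen byte dose = cond(missing(start_year), 0, 2014 - start_year) // 0 for never-treated villages
                xtreg y c.dose##(c.pre_treatment c.post_treatment), fe

                With -fe-, the main effect of a time-invariant dose is absorbed by the village fixed effects and will be omitted; the interactions with the spline terms are still estimated.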


                • #9
                  Thank you, Clyde! That was very helpful.

                  I think I will go with a dose variable, since the former doesn't seem feasible for me. You're right, it's a pretty difficult situation.

                  Best,
                  Kuhu


                  • #10
                    Hello Kuhu,

                    I think this paper may be interesting for you "http://www.jstor.org/stable/pdf/10.1086/344122.pdf"
