  • Difference in differences with treatment at different points in time

    Hello everyone! I am fairly new to Stata, so I apologize in advance if my questions are too basic.

    I am trying to analyze the effect of a smoking ban on smoking rates in several cities. The issue is that the treatment in one city starts one period earlier than in the other cities. My data set looks like this:
    city year smoking rates
    1 2006 30.11%
    1 2009 18.97%
    1 2012 9.93%
    1 2014 15.53%
    2 2006 18.93%
    2 2009 12.84%
    2 2012 7.15%
    2 2014 11.70%
    3 2006 25.02%
    3 2009 26.30%
    3 2012 17.80%
    3 2014 8.65%
    4 2006 26.34%
    4 2009 31.17%
    4 2012 15.82%
    4 2014 10.32%
    5 2006 22.42%
    5 2009 11.52%
    5 2012 6.39%
    5 2014 2.97%
    6 2006 8.58%
    6 2009 23.66%
    6 2012 19.32%
    6 2014 8.56%
    7 2006 33.16%
    7 2009 21.72%
    7 2012 17.34%
    7 2014 10.99%
    Cities 1, 2, 3 and 4 had the treatment ... for city 1 the treatment started in 2012; for the remaining cities, in 2014.
    Cities 5, 6 and 7 are the control group.

    I tried to follow this example, http://www.princeton.edu/~otorres/DID101.pdf, which works with several periods, but there the treatment starts in the same period for all the countries. Hence, if I generate a variable period = 1 if year >= 2012, and then another variable treatment = 1 if city <= 4, the coding will be wrong for the cities that started treatment in 2014: the interaction will indicate that they started treatment in 2012, which is not the case.

    I hope my explanation was clear, and I really appreciate your help.

    Thank you.


  • #2
    You are trying to apply standard D-in-D analysis to a study that does not have a D-in-D design. It can't be done.

    I see a few ways you can move forward. The simplest is to drop City 1 from the analysis. If you do that, you will have a standard D-in-D design where the treatment starts at the same time in all treated groups.
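
    To make the drop-City-1 option concrete, here is a minimal sketch in plain Python (not Stata) that uses the rates from the original post. It assumes cities 2 to 4 are treated, cities 5 to 7 are controls, and the post period is year >= 2014. The names `rates`, `TREATED`, `POST_YEAR` and `cell_mean` are made up for illustration. With a balanced panel like this one, the difference of the four cell means equals the interaction coefficient from a regression on treated, post and treated x post, but the sketch gives no standard errors.

```python
from statistics import mean

# Smoking rates (%) copied from the original post: city -> {year: rate}.
# City 1 is deliberately left out (it started treatment earlier, in 2012).
rates = {
    2: {2006: 18.93, 2009: 12.84, 2012: 7.15, 2014: 11.70},
    3: {2006: 25.02, 2009: 26.30, 2012: 17.80, 2014: 8.65},
    4: {2006: 26.34, 2009: 31.17, 2012: 15.82, 2014: 10.32},
    5: {2006: 22.42, 2009: 11.52, 2012: 6.39, 2014: 2.97},
    6: {2006: 8.58, 2009: 23.66, 2012: 19.32, 2014: 8.56},
    7: {2006: 33.16, 2009: 21.72, 2012: 17.34, 2014: 10.99},
}

TREATED = {2, 3, 4}   # assumption: remaining treated cities
POST_YEAR = 2014      # assumption: common treatment start after dropping City 1


def cell_mean(treated, post):
    """Mean rate in one of the four treated/control x pre/post cells."""
    return mean(
        rate
        for city, by_year in rates.items()
        for year, rate in by_year.items()
        if (city in TREATED) == treated and (year >= POST_YEAR) == post
    )


# Change in the treated group minus change in the control group.
did = (cell_mean(True, True) - cell_mean(True, False)) - (
    cell_mean(False, True) - cell_mean(False, False)
)
print(f"D-in-D estimate: {did:.2f} percentage points")
```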

    If you really can't do that for some reason, you may be able to salvage the situation with a different approach that relies on a non-standard D-in-D analysis. This approach requires imputing a "would have started treatment date if we were in the treatment group" to each control city. That imputed would-have-started date would be either 2012 or 2014 and your period variable for control cities is then calculated as period = (year >= would_have_started_date), rather than (year >= some constant year). The problem with this approach is that you need some credible basis for saying that some cities "would have started" in 2012 and others in 2014. If, for example, they considered such legislation in one of those years but rejected it, that would be a way to do it.
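
    The city-specific period variable described above could be built roughly as follows. This is only a sketch in plain Python, not Stata. The start years for cities 1 to 4 come from the original post, while the "would have started" years for control cities 5 to 7 are placeholders invented purely for illustration; in a real analysis they need the kind of credible basis discussed above. The names `start_year`, `period` and `rows` are likewise hypothetical.

```python
# Year in which each city started treatment, or is assumed to
# "would have started" it if it is a control city.
start_year = {
    1: 2012, 2: 2014, 3: 2014, 4: 2014,  # actual start years (from the post)
    5: 2012, 6: 2014, 7: 2014,           # HYPOTHETICAL imputed years for controls
}
TREATED = {1, 2, 3, 4}
YEARS = [2006, 2009, 2012, 2014]


def period(city, year):
    """1 if the observation is at or after the city's own start year, else 0."""
    return int(year >= start_year[city])


# One row per city-year with the treated dummy, the city-specific period
# dummy, and their interaction (the D-in-D term).
rows = [
    {
        "city": city,
        "year": year,
        "treated": int(city in TREATED),
        "period": period(city, year),
        "did": int(city in TREATED) * period(city, year),
    }
    for city in sorted(start_year)
    for year in YEARS
]

for row in rows:
    print(row)
```

    With this coding, City 1 is marked as post-treatment from 2012 and cities 2 to 4 only from 2014, which avoids the miscoding that a single cutoff year would produce.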

    Finally, you can consider abandoning D-in-D analysis and just try to include as many potential confounding variables as possible in a simple treated vs control group regression (omitting the pre-treatment data in the treatment cities, and probably including only data from 2012 on in the controls.)
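
    For this last option, the sample restriction alone might look like the sketch below (plain Python, not Stata). It keeps treated-city observations only from each city's own start year onward and control-city observations only from 2012 onward. The confounding variables themselves are not in the posted data, so they are not shown, and the names `TREATMENT_START`, `CONTROL_FROM` and `keep` are illustrative assumptions.

```python
# Treatment start years from the original post (treated cities only).
TREATMENT_START = {1: 2012, 2: 2014, 3: 2014, 4: 2014}
CONTROL_FROM = 2012  # assumption: keep control-city data from 2012 on
YEARS = [2006, 2009, 2012, 2014]


def keep(city, year):
    """True if this city-year belongs in the treated vs control regression sample."""
    if city in TREATMENT_START:
        # Treated city: omit its pre-treatment observations.
        return year >= TREATMENT_START[city]
    # Control city: keep only the later years.
    return year >= CONTROL_FROM


sample = [(city, year) for city in range(1, 8) for year in YEARS if keep(city, year)]
print(sample)
```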



    • #3
      Thank you for your help, Mr. Schechter! I think I'll go for the 1st or 3rd option.

      Best regards,
