  • Dif in Dif method and assumptions.

    Hello all,
    I have looked around this forum and could not find a similar question, so here goes:


    For my thesis I am estimating a dif in dif model of the effect of train stations on housing prices. I have repeated cross-sectional (transaction) data from 7 municipalities (2006-2018), of which 2 got a brand new train station in mid-December 2012 (this is the treatment, of course). The effect should then be visible from 2013 onward, hence salesyear>2012 in the code below.
    My thesis covers only one of these municipalities (Dronten), and I want to see the effect of the station on that town. The first question is what I should do with the other treated town. Any suggestions? Right now it is in the control group, which does not really make sense (it puts a treated unit in the control group, since it got a second station). This is a bit of a side question, though.

    Then this is my current code:
    generate after2012 = (salesyear>2012)
    generate dummydronten = (GEM_ID == 303)

    gen drontenafter2012 = after2012*dummydronten

    didregress (lnprijs lnage age2 lnm2 lnperceel NKAMERS Distance i.SOORTWONING i.salesyear i.ONBI i.ONBU i.GARAGE i.MONUMENTAAL i.ZOLDER i.ZWEMBAD) (drontenafter2012), group(GEM_ID) time(salesyear)

    I checked the parallel trends assumption with -estat trendplots-, and I think it is just about met, so the model is usable.
    [Image: dif in dif plot.png]


    But because I was not sure whether the assumption was met, I tried to get a numerical output on this with the following command:
    reg lnprijs i.dummydronten##c.salesyear
    margins dummydronten, dydx(salesyear)

    However, the trend for the treatment group could not be estimated, so that did not really work (I could not compare). Should I just assume that the dif in dif assumption is met?

    Then another question, which is actually my main one, concerns the other assumptions: what model underlies a dif in dif? If it is OLS, as I assume, then I suppose the OLS assumptions must be met as well? How do I check that, and how do I detect or solve, for example, heteroskedasticity (-hettest- for Breusch-Pagan, -estat imtest, white-, and -rvfplot, yline(0)- do not work...)? I have the same questions for endogeneity (error term correlated with the independent variables) and multicollinearity. Do I need to check these for a simple dif in dif?

    Quite a few questions for now, but I hope someone can help me here.

    Best,
    Jesse
    Last edited by Jesse Luimes; 14 May 2024, 05:07.

  • #2
    reg lnprijs c.dummydronten#c.salesyear salesyear dummydronten if salesyear<=2012

    The first coefficient is a test of differences in trends. If it won't estimate, you've got a bigger problem.

    If it's a panel, I'd estimate

    areg lnprijs c.dummydronten#c.salesyear salesyear if salesyear<=2012 , absorb(panelid)

    As for the standard errors, how many cross sections do you have and how many are treated? 7 municipalities and 2 treated?

    If so, then I'd cluster the standard errors and use boottest for hypothesis testing. With so few clusters, and only a couple of them treated, the standard SEs are invalid (all of them).
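
    A minimal sketch of that suggestion, reusing the variable names from #1. It assumes the user-written boottest package is installed from SSC. The covariate list is abbreviated here, and the Webb weights, replication count, and seed are illustrative choices, not something specified in this thread.

    ```stata
    * One-time install of the user-written command (assumption: SSC is reachable)
    ssc install boottest, replace

    * DiD regression with municipality and year fixed effects,
    * standard errors clustered on municipality (only 7 clusters)
    regress lnprijs drontenafter2012 i.GEM_ID i.salesyear lnage age2 lnm2, ///
        vce(cluster GEM_ID)

    * Wild cluster bootstrap test of H0: coefficient on drontenafter2012 = 0
    * (clustering is inherited from the regression above)
    boottest drontenafter2012, weighttype(webb) reps(9999) seed(12345)
    ```

    The bootstrap p-value and confidence interval that boottest reports would then replace the conventional clustered ones when judging the treatment effect.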



    • #3
      Dear Mr. Ford,
      Thank you very much for your reply, I'll definitely apply this.


      Originally posted by George Ford
      reg lnprijs c.dummydronten#c.salesyear salesyear dummydronten if salesyear<=2012

      The first coefficient is a test of differences in trends. If it won't estimate, you've got a bigger problem.
      Thank you, I'll definitely give it a shot. That's really helpful.

      If it's a panel, I'd estimate

      areg lnprijs c.dummydronten#c.salesyear salesyear if salesyear<=2012 , absorb(panelid)

      As for the standard errors, how many cross sections do you have and how many are treated? 7 municipalities and 2 treated?
      I have 7 municipalities in total, of which 2 are treated. However, I am only interested in the treatment effect for one of those two municipalities, namely Dronten. The total number of observations (housing transactions) is 80,000+, spread over those 7 municipalities.

      If so, then I'd cluster the standard errors and use boottest for hypothesis testing. With so few clusters, and only a couple of them treated, the standard SEs are invalid (all of them).
      Okay, that's actually interesting. I had been using vce(cluster), but I guess I'll switch now, because this makes sense.


      Thank you very much, explains a lot!
      Best,
      Jesse



      • #4
        Dear Mr. Ford,

        A quick follow-up: I found that the equal trends assumption does not hold for the model in my first post. Both estat ptrends and estat granger reject H0. However, the line of code you suggested does estimate. This is what I got:
        Linear regression
        lnprijs            Coef.   St.Err.   t-value   p-value [95% Conf Interval]   Sig
        c                  -.005      .004     -1.45      .148     -.012      .002
        salesyear          -.019      .001    -18.47         0     -.021     -.017   ***
        dummydronten      10.555      7.26      1.45      .146    -3.674    24.784
        Constant          49.734     2.032     24.48         0    45.752    53.716   ***

        Mean dependent var    12.221       SD dependent var        0.315
        R-squared             0.017        Number of obs           25552
        F-test                143.710      Prob > F                0.000
        Akaike crit. (AIC)    13127.801    Bayesian crit. (BIC)    13160.395
        *** p<.01, ** p<.05, * p<.1

        I just wanted to share this extra information, as I thought it might be good to know.

        Best,
        Jesse



        • #5
          #4 does not include the interaction of dummydronten and salesyear. That's the coefficient you are interested in, as it quantifies the difference in the trends. I'd also center the cross sections using fixed effects, and you should restrict the sample to the pre-treatment period.

          Or is that c? If so, you have not rejected equal trends (p = .148).

          If you have x's, then include those, as you're interested in the conditional trends.
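
          Putting those points together, a rough sketch using the variable names from #1. The covariate list is abbreviated, and clustering on GEM_ID is an assumption carried over from #2. With municipality fixed effects in the model, dummydronten itself is absorbed and is therefore left out.

          ```stata
          * Pre-treatment sample only (salesyear <= 2012):
          * does Dronten follow a different linear trend, conditional on covariates?
          regress lnprijs c.dummydronten#c.salesyear c.salesyear i.GEM_ID ///
              lnage age2 lnm2 lnperceel NKAMERS ///
              if salesyear <= 2012, vce(cluster GEM_ID)

          * The interaction coefficient is the estimated difference in trends;
          * failing to reject H0 here is consistent with parallel pre-trends
          test c.dummydronten#c.salesyear
          ```

          Given only 7 clusters, the p-value from this test is subject to the same few-clusters caveat raised in #2.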
          Last edited by George Ford; 16 May 2024, 08:14.

