  • #16
    Thank you. I did not fully understand your code and what it does. The reason I need a DID analysis is that I am trying to replicate the analysis of Giroud (2012), "Proximity and investment: evidence from plant-level data", only for a different country. He used DID like this:

    To examine the effects on plant-level investment and productivity, I use a difference-in-differences approach. I estimate:

    \( Y_{ijlt} = a_i + a_t + B \times \text{treatment}_{it} + c\,X_{ijt} + e \)

    where \(i\) indexes plants, \(j\) indexes firms, \(l\) indexes plant location, \(t\) indexes years, \(Y_{ijlt}\) is the dependent variable of interest (plant investment or productivity), \(a_i\) and \(a_t\) are plant and year fixed effects, treatment is a dummy variable that equals 1 if a new airline route that reduces the travel time between plant \(i\) and its headquarters has been introduced by time \(t\), \(X\) is a vector of control variables, and \(e\) is the error term. The main coefficient of interest is \(B\), which measures the effect of the introduction of new airline routes.

    I appreciate any help I have received and can get =)


    • #17
      I admit I'm now a little confused. The equation you have above, without \(X_{ijt}\), is exactly what I said to do in the first place. The \(a_i\) are the fixed effects, the \(a_t\) are the year effects, and \(\text{treatment}_{it}\) is the treatment variable. This is not a true "DID" because you cannot write the estimator as a difference in differences of means, but it is still called that. Well before DID became popular, this was a common way of using panel data for policy analysis. \(a_i\) controls for differences across firms, \(a_t\) controls for secular changes across time, and \(X_{ijt}\) controls for observed differences that might also be related to treatment assignment. Your mistake was looking at a very specific, two-period DID analysis, which is a special case of the fixed effects approach.
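
      To make the "special case" remark concrete, here is a sketch of the two-period algebra for the equation in #16 without the controls. It assumes that nobody is treated in period 1; the group labels \(T\) (treated) and \(C\) (control) and the bars for group means are notation introduced here for illustration only.

      ```latex
      \begin{aligned}
      Y_{it} &= a_i + a_t + B\,\text{treatment}_{it} + e_{it}, \qquad t \in \{1, 2\},\\
      \Delta Y_i &= Y_{i2} - Y_{i1} = (a_2 - a_1) + B\,\Delta\text{treatment}_{i} + \Delta e_i,\\
      \hat{B} &= \overline{\Delta Y}_{T} - \overline{\Delta Y}_{C}
               = \left(\bar{Y}_{T,2} - \bar{Y}_{T,1}\right) - \left(\bar{Y}_{C,2} - \bar{Y}_{C,1}\right).
      \end{aligned}
      ```

      Differencing removes \(a_i\), and because \(\Delta\text{treatment}_i\) is 1 for treated units and 0 for controls, OLS on the differenced equation is just a comparison of mean changes. With more than two periods, or with treatment starting at different times, the fixed effects estimator no longer reduces to four group means.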


      • #18
        Hehe, I appreciate getting the right answer right away; my problem is converting the regression I want into Stata code and understanding what each part of the code does. What I first didn't understand was the B x treatment part, which I now understand and have run regressions on (without getting statistically significant results).

        The code you gave me, does that do everything needed to estimate the equation for Yijlt? Or do I need to run different analyses to get the firm and year fixed effects? I do not fully understand the different elements of your code... =)


        • #19
          It would have been so much easier if you started off by showing us this equation. What you have is referred to as a two-way fixed effects model.

          Code:
          Yijlt = ai + at + B x treatmentit + c Xijt + e
          As Jeff points out:

          When T = 2 and there is a pre-treatment period for all units, the simple DID is the same as fixed effects estimation with a time dummy and the so-called interaction (the treatment dummy).
          I will show this using the example and the Stata code. I strongly recommend that you read about fixed effects and understand what you are doing.

          Some notes: Estimating the model above amounts to including N-1 firm dummies and T-1 time dummies to estimate the time-invariant (firm) and firm-invariant (year) effects.


          Code:
          input ROA id  str8 route  year  
          1.775622   1   "A TO C"   1998
          3.83331    2   "A TO C"   1998
          -5.210526  3   "A TO C"   1998
          1.478725   4   "A TO C"   1998
          -13.73461  5   "A TO C"   1998  
          -8.754751  6   "A TO C"   1998
          -2.822808  7   "A TO C"   1998
          -.3456052  8   "A TO C"   1998
          4.453937   9   "A TO C"   1998
          -6.672886  10  "A TO C"   1998    
          9.433187   1   "A TO B"   2000
          8.438064   2   "A TO B"   2000
          9.211845   3   "A TO B"   2000
          8.478987   4   "A TO B"   2000
          17.824034   5   "A TO B"   2000
          1.470532   6   "A TO C"   2000
          -8.817384  7   "A TO C"   2000
          2.556952   8   "A TO C"   2000
          -4.566959  9   "A TO C"   2000
          3.711421   10  "A TO C"   2000
          end
          
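          . * interaction = treated x post, as in #12: plants 1-5 switch to route "A TO B" in 2000
          . gen byte interaction = inlist(id, 1, 2, 3, 4, 5) & year == 2000
          
          . reg  ROA i.id i.year  interaction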
          
                Source |       SS       df       MS              Number of obs =      20
          -------------+------------------------------           F( 11,     8) =    1.29
                 Model |  714.428766    11  64.9480696           Prob > F      =  0.3669
              Residual |  402.570519     8  50.3213148           R-squared     =  0.6396
          -------------+------------------------------           Adj R-squared =  0.1440
                 Total |  1116.99928    19   58.789436           Root MSE      =  7.0938
          
          ------------------------------------------------------------------------------
                   ROA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                    id |
                    2  |   .5312825   7.093752     0.07   0.942    -15.82694     16.8895
                    3  |  -3.603745   7.093752    -0.51   0.625    -19.96197    12.75448
                    4  |  -.6255484   7.093752    -0.09   0.932    -16.98377    15.73267
                    5  |  -3.559692   7.093752    -0.50   0.629    -19.91791    12.79853
                    6  |  -3.571822   7.770816    -0.46   0.658    -21.49136    14.34771
                    7  |  -5.749808   7.770816    -0.74   0.480    -23.66934    12.16973
                    8  |   1.175961   7.770816     0.15   0.883    -16.74357    19.09549
                    9  |   .0137767   7.770816     0.00   0.999    -17.90576    17.93331
                   10  |  -1.410445   7.770816    -0.18   0.860    -19.32998    16.50909
                       |
                  year |
                 2000  |   1.699335   4.486483     0.38   0.715    -8.646512    12.04518
                       |
           interaction |   11.34938   6.344845     1.79   0.111    -3.281854    25.98062
                 _cons |  -.9199552   5.494797    -0.17   0.871    -13.59098    11.75107
          ------------------------------------------------------------------------------
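
          As a cross-check on the output above, the coefficient on interaction can be recovered from four group means. This is only a sketch: the variable names treated and post and the scalars m00, m01, m10, m11 and did are made up here, and the treated group is assumed to be the plants whose route changes to "A TO B" in 2000 (plants 1-5).

          ```stata
          * Rebuild the example data so the sketch runs on its own
          clear
          input ROA id str8 route year
          1.775622   1   "A TO C"   1998
          3.83331    2   "A TO C"   1998
          -5.210526  3   "A TO C"   1998
          1.478725   4   "A TO C"   1998
          -13.73461  5   "A TO C"   1998
          -8.754751  6   "A TO C"   1998
          -2.822808  7   "A TO C"   1998
          -.3456052  8   "A TO C"   1998
          4.453937   9   "A TO C"   1998
          -6.672886  10  "A TO C"   1998
          9.433187   1   "A TO B"   2000
          8.438064   2   "A TO B"   2000
          9.211845   3   "A TO B"   2000
          8.478987   4   "A TO B"   2000
          17.824034  5   "A TO B"   2000
          1.470532   6   "A TO C"   2000
          -8.817384  7   "A TO C"   2000
          2.556952   8   "A TO C"   2000
          -4.566959  9   "A TO C"   2000
          3.711421   10  "A TO C"   2000
          end

          * Assumption: a plant is treated if its route in the last year is "A TO B"
          bysort id (year): gen byte treated = route[_N] == "A TO B"
          gen byte post = year == 2000
          gen byte interaction = treated * post

          * Mean of ROA in each treated-by-post cell, stored as m<treated><post>
          foreach g in 0 1 {
              foreach p in 0 1 {
                  quietly summarize ROA if treated == `g' & post == `p', meanonly
                  scalar m`g'`p' = r(mean)
              }
          }

          * Change for the treated minus change for the controls
          scalar did = (m11 - m10) - (m01 - m00)
          display "DID from group means:      " did

          * The same number from the dummy-variable regression
          quietly regress ROA i.id i.year interaction
          display "Coefficient on interaction: " _b[interaction]
          ```

          Both displayed numbers should agree with the 11.34938 in the table above, and the change in the control group's mean is the 1.699335 reported for year 2000.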

          Using a fixed effects estimator, the firm effects are wiped out (you do not have to worry about them). Unfortunately, Stata does not have a true two-way fixed effects estimator, so you have to add the time dummies manually.

          Code:
          . xtset id year
                 panel variable:  id (strongly balanced)
                  time variable:  year, 1998 to 2000, but with gaps
                          delta:  1 unit
          
          . xtreg ROA i.year interaction, fe
          
          Fixed-effects (within) regression               Number of obs      =        20
          Group variable: id                              Number of groups   =        10
          
          R-sq:  within  = 0.5181                         Obs per group: min =         2
                 between = 0.6677                                        avg =       2.0
                 overall = 0.5552                                        max =         2
          
                                                          F(2,8)             =      4.30
          corr(u_i, Xb)  = 0.0547                         Prob > F           =    0.0539
          
          ------------------------------------------------------------------------------
                   ROA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                  year |
                 2000  |   1.699335   4.486483     0.38   0.715    -8.646512    12.04518
                       |
           interaction |   11.34938   6.344845     1.79   0.111    -3.281854    25.98062
                 _cons |  -2.599959   2.243241    -1.16   0.280    -7.772883    2.572964
          -------------+----------------------------------------------------------------
               sigma_u |  2.2924625
               sigma_e |  7.0937518
                   rho |  .09456093   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0:     F(9, 8) =     0.21                Prob > F = 0.9848
          Bear in mind that 11.34938 is the coefficient we obtained with the two-period DID estimator. Note that at the moment you do not have any control variables specified. If you add them later on, they must vary both over time and across firms; otherwise the two-way fixed effects model cannot estimate their impact, because time-invariant variables are absorbed by the firm effects and variables common to all firms in a year are absorbed by the year effects. For example, if you have the following 2 control variables, population and income, you add them to the regression as follows:

          Code:
          xtreg ROA i.year interaction population income, fe cluster(id)
          
          *and you have estimated the following equation:  Yit = ai + at + B x treatmentit + c Xit + e
          Last edited by Andrew Musau; 11 May 2015, 09:38.


          • #20
            Thank you so much, both of you! It is true that I do not yet grasp all the finer nuances of statistical analysis (and I am not a native English speaker), but your help has been invaluable in fitting all the pieces together to understand it. Thank you!


            • #21
              Hello everybody. My research is quite similar to Susanne's. I am trying to quantify the effect of the announcement and the effective construction of a new transit line (called RER) on real estate prices. I have quarterly panel data from 1990 to 2014 and I am using a difference-in-differences methodology. The new transit line will go through 30 counties (treatment group), and I have 62 other counties (control group). I thus have 25 years of quarterly data on (1) house prices, which is my dependent variable, and (2) a lot of control variables about the counties, such as the density of foreigners, the density of inhabitants per kilometer, the average wages of a county, ... (I have all this data for all years). The announcement took place in 1999 and the effective construction in 2004. I therefore created a dummy variable RER, which is equal to one (in all years) for counties that the new transit line will go through, two other dummy variables, Announcement (=1 from 1999 to 2014) and Construction (=1 from 2004 to 2014), and two interaction variables, RER*Announcement and RER*Construction. I want to take into account time fixed effects and county fixed effects.
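
              For readers following along, the dummies described above could be built roughly as in the sketch below. It runs on a made-up panel, not the poster's data: it assumes that counties 1-30 are the treated ones, that the announcement and construction periods start in the first quarter of 1999 and 2004 (the post gives only the years), and that time is stored as a Stata quarterly date in a variable called qdate, which is a hypothetical name.

              ```stata
              * Toy panel: 92 counties x 100 quarters (1990q1-2014q4)
              clear
              set obs 92
              gen int ID = _n
              gen byte RER = ID <= 30              // assumption: first 30 counties get the line
              expand 100
              bysort ID: gen int qdate = tq(1990q1) + _n - 1
              format qdate %tq

              * Period dummies (start quarters are assumptions, see above)
              gen byte Announcement = qdate >= tq(1999q1)
              gen byte Construction = qdate >= tq(2004q1)

              * Interactions: switch on only for treated counties after each event
              gen byte Announcerer = Announcement * RER
              gen byte Constrer    = Construction * RER

              * Declare the panel with a quarterly time variable
              xtset ID qdate
              ```

              Storing time as a quarterly date rather than an encoded string lets xtset recognise the quarterly spacing directly.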

              What model/estimator should I use?

              option 1: A Fixed effect Model

              In Stata 14.0 I have to write:

              gen Announcerer = Announcement*RER
              gen Constrer = Construction*RER
              encode time, gen(time2)
              xtset ID time2, quarterly

              xtreg lprices RER Announcement Construction Announcerer Constrer wages densityperkm foreignersdensity i.time2, fe vce(cluster ID)

              option 2: A Random effect Model

              xtreg lprices RER Announcement Construction Announcerer Constrer wages densityperkm foreignersdensity i.time2, re vce(cluster ID)

              option 3: A simple DID Model

              reg lprices RER Announcement Construction Announcerer Constrer wages densityperkm foreignersdensity i.time2, r


              Questions:

              -In Wooldridge's book (Introduction to Econometrics, 2014), it says on p. 398: "Fixed effects allows arbitrary correlation between ai and xitj while random effects does not". The Hausman test tells me to choose option 2, the random effects model. However, I have reasons to believe that the ai and xitj are correlated. (For example, the distance to Brussels, the capital and biggest city in the country, is time invariant and thus in the ai, and it might be correlated with my explanatory variable density of population.) Should I believe the Hausman test or my intuition?
              -What about option 3? In Wooldridge's book, there is a similar example with the construction of an incinerator on p. 366, where he used a difference-in-differences estimator. However, that example has only 2 years of observations, whereas I have 25...
              -I have some trouble with econometric terminology. If I am using option 1, should I say that I am using a difference-in-differences model, a fixed effects model, or a fixed effects model with a difference-in-differences methodology?

              Thank you


              • #22
                Hi Andrew,

                I happened to read this post and I'm doing something similar. I read the "Proximity and investment: evidence from plant-level data" paper, and I think their model
                \( Y_{ijlt} = a_i + a_t + B \times \text{treatment}_{it} + c\,X_{ijt} + e \)
                is different from the one in #19
                xtreg ROA i.year interaction population income, fe cluster(id)
                *and you have estimated the following equation: Yit = ai + at + B x treatmentit + c Xit + e
                The interaction variable in #19 is treatment*post (as you wrote in #12), so it differs from the \(B \times \text{treatment}_{it}\) term the paper uses. Which one is correct?

                Thanks!


                • #23
                  The treatment indicator is always defined as equal to one if unit \(i\) at time \(t\) was subject to the treatment and zero otherwise. Therefore, for treated units, it is zero in pre-treatment years and turns on (changes to one) once the treatment is initiated. So in TWFE, the treatment indicator is not an indicator for being a treated unit. If it were, you would not be able to identify its coefficient, as it would be collinear with the unit fixed effects. This simple example illustrates the point: suppose that firms 1, 3 and 5 were treated in the Grunfeld dataset and we define the treatment indicator as a dummy for the treated firms. We would not be able to get a coefficient on the treatment indicator, as its identification relies on the existence of a pre-treatment period for the treated firms.

                  Code:
                  webuse grunfeld, clear
                  gen treatment= inlist(company, 1, 3, 5)
                  xtreg invest mvalue kstock treatment i.time, fe
                  Res.:

                  Code:
                  . xtreg invest mvalue kstock treatment i.time, fe
                  note: treatment omitted because of collinearity
                  
                  Fixed-effects (within) regression               Number of obs     =        200
                  Group variable: company                         Number of groups  =         10
                  
                  R-sq:                                           Obs per group:
                       within  = 0.7985                                         min =         20
                       between = 0.8143                                         avg =       20.0
                       overall = 0.8068                                         max =         20
                  
                                                                  F(21,169)         =      31.90
                  corr(u_i, Xb)  = -0.3250                        Prob > F          =     0.0000
                  
                  ------------------------------------------------------------------------------
                        invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                        mvalue |   .1177158   .0137513     8.56   0.000     .0905694    .1448623
                        kstock |   .3579163    .022719    15.75   0.000     .3130667    .4027659
                     treatment |          0  (omitted)
                               |
                          time |
                            2  |  -19.19741   23.67586    -0.81   0.419    -65.93593    27.54112
                            3  |  -40.69001   24.69541    -1.65   0.101    -89.44122    8.061213
                            4  |   -39.2264   23.23594    -1.69   0.093    -85.09647    6.643667
                            5  |  -69.47029   23.65607    -2.94   0.004    -116.1698   -22.77083
                            6  |  -44.23507   23.80979    -1.86   0.065      -91.238     2.76785
                            7  |  -18.80446     23.694    -0.79   0.429     -65.5788    27.96987
                            8  |  -21.13979   23.38163    -0.90   0.367    -67.29748    25.01789
                            9  |  -42.97762   23.55287    -1.82   0.070    -89.47334    3.518104
                           10  |  -43.09876    23.6102    -1.83   0.070    -89.70766    3.510134
                           11  |  -55.68303   23.89561    -2.33   0.021    -102.8554   -8.510689
                           12  |  -31.16928   24.11598    -1.29   0.198    -78.77665    16.43809
                           13  |  -39.39223   23.78368    -1.66   0.100    -86.34361    7.559141
                           14  |  -43.71651   23.96965    -1.82   0.070    -91.03501    3.601991
                           15  |   -73.4951   24.18292    -3.04   0.003    -121.2346   -25.75559
                           16  |  -75.89611   24.34553    -3.12   0.002    -123.9566    -27.8356
                           17  |   -62.4809   24.86425    -2.51   0.013    -111.5654   -13.39637
                           18  |  -64.63233    25.3495    -2.55   0.012    -114.6748   -14.58987
                           19  |  -67.71796   26.61108    -2.54   0.012    -120.2509   -15.18501
                           20  |  -93.52622   27.10786    -3.45   0.001    -147.0399   -40.01257
                               |
                         _cons |  -32.83631   18.87533    -1.74   0.084     -70.0981    4.425483
                  -------------+----------------------------------------------------------------
                       sigma_u |  91.798268
                       sigma_e |  51.724523
                           rho |  .75902159   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------
                  F test that all u_i=0: F(9, 169) = 52.36                     Prob > F = 0.0000
                  
                  .
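
                  For contrast, here is a sketch of the time-varying definition described at the top of this post. The start year 1945 is invented purely for illustration (the Grunfeld data contain no actual treatment), and treated and post are hypothetical helper names.

                  ```stata
                  webuse grunfeld, clear

                  * Hypothetical: firms 1, 3 and 5 become treated from 1945 onward
                  gen byte treated   = inlist(company, 1, 3, 5)
                  gen byte post      = year >= 1945
                  gen byte treatment = treated * post   // 0 before 1945, 1 afterwards for treated firms

                  * Same TWFE specification as above, now with a time-varying indicator
                  xtreg invest mvalue kstock treatment i.time, fe
                  ```

                  With this definition treatment is no longer omitted, because the treated firms now contribute pre-treatment years and the indicator varies within firm.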
                  Last edited by Andrew Musau; 27 Mar 2022, 07:55.
