  • Diff vs xtreg results

    I am trying to estimate a difference-in-differences (DiD) model with three different methods. Below is an excerpt of my dataset (the complete data are attached), where "regime" is the treatment, which starts in 2007 and runs until 2019:

    Code:
    code_7    regime    ano    cap_1
    1100015    0    2002    1
    1100015    0    2004    0
    1100023    0    2000    5
    1100023    0    2001    4
    1100023    0    2002    2
    1100023    0    2003    4
    1100023    0    2004    1
    1100023    0    2005    1
    1100023    0    2006    1
    1100023    0    2007    0
    1100023    0    2008    4
    1100023    0    2009    1
    1100023    0    2010    1
    1100023    0    2011    3
    1100023    0    2012    1
    1100023    0    2013    2
    1100023    0    2014    1
    1100023    0    2015    0
    1100031    0    2000    1
    1100031    0    2001    0
    1100031    0    2002    0
    1100031    0    2003    
    1100031    0    2004    0
    1100031    0    2005    
    1100031    0    2006    0
    1100031    0    2007    
    1100031    0    2008    0
    1100031    0    2009    0
    1100031    0    2010    0
    1100031    0    2011    0
    1100031    0    2012    0
    1100031    0    2013    
    1100031    0    2014    0
    1100031    0    2015    0
    1100056    0    2000    1
    1100056    0    2001    0
    1100056    0    2002    0
    1100056    0    2003    3
    1100056    0    2004    0
    1100056    0    2005    0
    1100056    0    2006    0
    1100056    0    2007    1
    1100056    0    2008    1
    1100056    0    2009    0
    1100056    0    2010    0
    1100056    0    2011    0
    1100056    0    2012    0
    1100056    0    2013    0
    1100056    0    2014    1
    1100056    0    2015    
    1100064    0    2000    0
    1100064    0    2001    0
    1100064    0    2002    0
    1100064    0    2003    0
    1100064    0    2004    1
    1100064    0    2005    0
    1100064    0    2006    1
    1100064    0    2007    0
    1100064    0    2008    0
    1100064    0    2009    0
    1100064    0    2010    0
    1100064    0    2011    2
    1100064    0    2012    0
    1100064    0    2013    0
    1100064    0    2014    0
    1100064    0    2015    0
    1100072    0    2006    0
    1100072    0    2007    
    1100072    0    2008    
    1100072    0    2009    0
    1100072    0    2010    0
    1100072    0    2011    1
    1100072    0    2012    
    1100072    0    2013    0
    1100072    0    2014    0
    1100072    0    2015    
    1100080    0    2000    1
    1100080    0    2001    0
    1100080    0    2002    1
    1100080    0    2003    1
    1100080    0    2004    2
    1100080    0    2005    1
    1100080    0    2006    0
    1100080    0    2007    1
    1100080    0    2008    0
    1100080    0    2009    0
    1100080    0    2010    1
    1100080    0    2011    0
    1100080    0    2012    0
    1100080    0    2013    0
    1100080    0    2014    2
    1100080    0    2015    0
    1100098    0    2000    1
    1100098    0    2001    1
    1100098    0    2002    1
    1100098    0    2003    1
    1100098    0    2004    1
    1100098    0    2005    0
    1100098    0    2006    1
    1100098    0    2007    1
    1100098    0    2008    1
    1100098    0    2009    0
    1100098    0    2010    0
    1100098    0    2011    1
    1100098    0    2012    1
    1100098    0    2013    0
    1100098    0    2014    2
    1100098    0    2015    0
    1100114    0    2000    2
    1100114    0    2001    1
    1100114    0    2002    2
    1100114    0    2003    2
    1100114    0    2004    1
    1100114    0    2005    2
    1100114    0    2006    6
    1100114    0    2007    3
    1100114    0    2008    1
    1100114    0    2009    0
    1100114    0    2010    1
    1100114    0    2011    1
    1100114    0    2012    1
    1100114    0    2013    1
    1100114    0    2014    2
    1100114    0    2015    0
    1100122    0    2000    6
    1100122    0    2001    2
    1100122    0    2002    5
    1100122    0    2003    3
    1100122    0    2004    3
    1100122    0    2005    2
    1100122    0    2006    1
    1100122    0    2007    2
    1100122    0    2008    2
    1100122    0    2009    2
    1100122    0    2010    1
    1100122    0    2011    6
    1100122    0    2012    0
    1100122    0    2013    4
    1100122    0    2014    1
    1100122    0    2015    1
    1100130    0    2000    0
    1100130    0    2001    0
    1100130    0    2002    0
    1100130    0    2003    0
    1100130    0    2004    2
    1100130    0    2005    0
    1100130    0    2006    2
    1100130    0    2007    1
    1100130    0    2008    2
    1100130    0    2009    2

    1) For the first exercise, I followed https://www.princeton.edu/~otorres/DID101.pdf.

    Code:
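    * Declare the panel structure
    xtset code_7 ano

    * Post-period indicator (2007 onwards)
    gen time = (ano>=2007) & !missing(ano)

    * Treatment indicator derived from regime
    gen treated = (regime>0) & !missing(regime)

    * DiD interaction term
    gen did = time*treated

    * lncap_1 is not created in this snippet (presumably the log of cap_1)
    xtreg lncap_1 did regime, fe

    I got the following result: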

    Code:
    Fixed-effects (within) regression               Number of obs     =      8,865
    Group variable: code_7                          Number of groups  =      2,203
    
    R-sq:                                           Obs per group:
         within  = 0.0315                                         min =          1
         between = 0.0018                                         avg =        4.0
         overall = 0.0124                                         max =         19
    
                                                    F(2,6660)         =     108.20
    corr(u_i, Xb)  = -0.0639                        Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
         lncap_1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             did |  -.2287338   .0697029    -3.28   0.001    -.3653738   -.0920938
          regime |  -.1961217   .0680113    -2.88   0.004    -.3294456   -.0627977
           _cons |   .4644935   .0054163    85.76   0.000     .4538758    .4751111
    -------------+----------------------------------------------------------------
         sigma_u |  .39174852
         sigma_e |  .48458941
             rho |  .39523397   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(2202, 6660) = 3.19                  Prob > F = 0.0000

    2) The second exercise follows the next procedure described in https://www.princeton.edu/~otorres/DID101.pdf:

    Code:
    * time##treated expands to both main effects plus their interaction
    xtreg lncap_1 time##treated regime, fe

    The output is:

    Code:
    Fixed-effects (within) regression               Number of obs     =      8,865
    Group variable: code_7                          Number of groups  =      2,203
    
    R-sq:                                           Obs per group:
         within  = 0.0720                                         min =          1
         between = 0.0004                                         avg =        4.0
         overall = 0.0191                                         max =         19
    
                                                    F(3,6659)         =     172.15
    corr(u_i, Xb)  = -0.1177                        Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
         lncap_1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          1.time |  -.2273705   .0133374   -17.05   0.000    -.2535161   -.2012248
       1.treated |  -.2152926   .0665886    -3.23   0.001    -.3458275   -.0847576
                 |
    time#treated |
            1 1  |   -.035354   .0691715    -0.51   0.609    -.1709524    .1002443
                 |
          regime |          0  (omitted)
           _cons |    .529124   .0065182    81.18   0.000     .5163463    .5419017
    -------------+----------------------------------------------------------------
         sigma_u |  .39738564
         sigma_e |  .47438456
             rho |  .41235892   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(2202, 6659) = 3.42                  Prob > F = 0.0000

    3) Finally, the last exercise uses the command diff:

    Code:
    diff lncap_1, treated(treated) period(time) id(code_7)
    The output was:

    Code:
    DIFFERENCE-IN-DIFFERENCES ESTIMATION RESULTS
    Number of observations in the DIFF-IN-DIFF: 8865
                Before         After    
       Control: 5924           2407        8331
       Treated: 80             454         534
                6004           2861
    --------------------------------------------------------
     Outcome var.   | lncap_1 | S. Err. |   |t|   |  P>|t|
    ----------------+---------+---------+---------+---------
    Before          |         |         |         |
       Control      | 0.493   |         |         |
       Treated      | 0.214   |         |         |
       Diff (T-C)   | -0.278  | 0.068   | -4.12   | 0.000***
    After           |         |         |         |
       Control      | 0.374   |         |         |
       Treated      | 0.165   |         |         |
       Diff (T-C)   | -0.209  | 0.031   | 6.81    | 0.000***
                    |         |         |         |
    Diff-in-Diff    | 0.069   | 0.074   | 0.93    | 0.352
    --------------------------------------------------------
    R-square:    0.02
    * Means and Standard Errors are estimated by linear regression
    **Inference: *** p<0.01; ** p<0.05; * p<0.1

    As you can see, I obtained three different values for the DiD estimator, and I cannot understand why. I hope someone here can help me.
    Last edited by Mateus Maciel; 31 Mar 2021, 08:44.

  • #2
    Hello Mateus Maciel. I am not an expert in DiD, but I think this could be because you have an unbalanced panel.


    • #3
      Hello, Joe!

      Well, I thought that xtreg would deal with unbalanced panels. However, I don't know if the command diff can deal with this kind of data.


      • #4
        Joe is correct. The two procedures are not equivalent when the panel is unbalanced. There are some additional problems. First, you use lncap_1 when cap_1 is very often zero. The log of zero is undefined, so those observations become missing and you're losing over 75% of your data this way. This is actually a good candidate for fixed effects Poisson regression, but you have to properly compute the treatment indicator.
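
        For reference, the three commands in #1 can be written out as regression equations. This is a sketch read off the posted commands and output, not the documented internals of any command. The symbols are notation introduced here: $y_{it}$ is lncap_1, $\alpha_i$ is the unit fixed effect, $T_t$ is time, and $D_{it}$ is treated.

        ```latex
        \begin{align*}
        \text{(1) xtreg, fe:}\quad & y_{it} = \alpha_i + \delta\,(T_t \times D_{it}) + \gamma\,\mathrm{regime}_{it} + \varepsilon_{it} \\
        \text{(2) xtreg, fe:}\quad & y_{it} = \alpha_i + \beta\,T_t + \theta\,D_{it} + \delta\,(T_t \times D_{it}) + \varepsilon_{it} \\
        \text{(3) diff:}\quad & y_{it} = \beta_0 + \beta\,T_t + \theta\,D_{it} + \delta\,(T_t \times D_{it}) + \varepsilon_{it}
        \end{align*}
        ```

        Equation (1) has no main effect for $T_t$, so its $\delta$ is not comparable with the one in (2). In (2), regime is reported as omitted because treated was generated from it. Equation (3) assumes that -diff- fits a pooled linear regression without unit effects, which is what the note under its output suggests; its estimate is the difference of the two reported gaps, $-0.209 - (-0.278) = 0.069$.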

        Plus, something else seems off because the "regime" variable appears to change over time. With a common intervention time, which I assume is 2007 for all units, the "treated" variable should be an indicator of whether the unit was ever treated. You don't seem to have a variable like that, and I'm not sure what "regime" indicates.
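
        A minimal Stata sketch of both points in this reply, assuming a common intervention year of 2007 and that "regime" equals 1 in the years a unit is under treatment. The names ever_treated, post, and did_ever are hypothetical and do not appear in the original post.

        ```stata
        * Sketch only; ever_treated, post, and did_ever are hypothetical names.
        capture drop ever_treated post did_ever

        * How many observations have a zero count (and so no logarithm)?
        count if cap_1 == 0
        count if !missing(cap_1)

        * Time-invariant indicator: 1 if the unit is ever under the regime.
        bysort code_7: egen ever_treated = max(regime)

        * Post-period indicator (assumes treatment starts in 2007 for every unit).
        gen post = (ano >= 2007) if !missing(ano)

        * DiD term; the ever_treated main effect is absorbed by the unit effects.
        gen did_ever = post * ever_treated

        * Fixed-effects Poisson on the raw count, with year dummies and
        * robust standard errors.
        xtset code_7 ano
        xtpoisson cap_1 did_ever i.ano, fe vce(robust)
        ```

        Units whose cap_1 is zero in every year drop out of the fixed-effects Poisson estimation, so the estimation sample can still be smaller than the full panel.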


        • #5
          Thanks, Professor Wooldridge. I was discussing the possibility of working with Poisson regression with my supervisor. However, I could not understand your advice concerning my treatment variable. Regime is 0 if the unit was not treated and 1 if it was. What do you mean by "the 'regime' variable appears to change over time"?


          • #6
            Mateus: That might work, but my understanding is that the -diff- command is expecting the "treatment" variable to not vary over time: it indicates eventual treatment. It didn't give you an error, so maybe not. But in the examples in the help file, it seems that you are not supposed to create the time-varying treatment variable.
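
            A hedged sketch of what this could look like: pass -diff- a time-invariant ever-treated indicator and let it form the interaction itself. It assumes treatment starts in 2007 for all units. The names ever_treated and post are hypothetical, and -diff- is a community-contributed command that has to be installed first.

            ```stata
            * Sketch only; ever_treated and post are hypothetical names.
            * ssc install diff      // if not already installed
            capture drop ever_treated post
            bysort code_7: egen ever_treated = max(regime)
            gen post = (ano >= 2007) if !missing(ano)

            * -diff- builds the period x treated interaction internally.
            diff lncap_1, treated(ever_treated) period(post)

            * The same pooled comparison as an ordinary regression, for checking.
            regress lncap_1 i.post##i.ever_treated
            ```

            If -diff- is indeed a pooled linear regression, the coefficient on 1.post#1.ever_treated should be close to the Diff-in-Diff row of its output.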


            • #7
              Now I got your point, professor.

              Thanks again for the help!


              • #8
                Hi Mateus! I guess that you cannot use "traditional" DiD with your "regime" variable. I am not sure, but it seems to me that you will need to add a proportion measure to your code in order to analyze the municipalities(?) that do or do not adopt a regime before and after this event.


                • #9
                  Yes, Fernanda Almeida! That is exactly what Professor Wooldridge meant, and you really clarified his point; it all makes sense now. I will work out a way to deal with it with my supervisor.
