  • Difference in Difference (DiD) within GMM

    Hello everyone,

    I am trying to work out how to complete a DiD (difference in difference) analysis with Difference GMM and System GMM using Panel Data. Because my T is small and I am working with a cluster structure, I am using xtabond2 for the estimates.

    My data set contains 8,232 students in panel format with T=5. For each student, I have the test scores (depvar) and a list of observed variables over the time period (indepvar). Moreover, I have time, school, and class fixed effects.

    During the sample period (2003-2008), a policy change was implemented in state schools in 2007. Students from state schools are therefore my treatment group, and students from municipal schools are the control group. My DiD variable equals 1 if the student is enrolled in a state school (treated = 1) in the post-treatment period (time = 1), and 0 otherwise.
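
    A minimal sketch of how the three indicators described above could be built. The names year and state_school are hypothetical (they do not appear in the post), 2007 is assumed to be the first post-treatment year, and the sketch assumes treated, time and DiD do not already exist in memory:

    ```stata
    * Hypothetical inputs: year (calendar year of the wave) and
    * state_school (1 = state school, 0 = municipal school).
    generate byte treated = (state_school == 1) if !missing(state_school)
    generate byte time    = (year >= 2007)      if !missing(year)

    * The DiD regressor is the interaction of the two indicators.
    generate byte DiD = treated * time

    * Sanity check: the mean of DiD should be 1 only in the treated/post cell.
    tabulate treated time, summarize(DiD) means
    ```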

    In the model, I treat L1.profic_mat as endogenous, the control variables as predetermined, and the fixed effects and DiD as exogenous. For the system GMM, I then estimate the following model (PS: coefficients for the control variables and fixed effects are not shown, to save space):
    Code:
    xi: xtabond2 L(0/1).profic_mat DiD time treated $controlvar i.wave i.IDescola i.IDturma, ///
    gmm(L1.profic_mat,lag(1 1)) ///
    gmmstyle($controlvar) ///
    iv(DiD time treated i.wave i.IDescola i.IDturma, equation(level)) ///
    cluster(IDescola) twostep small orthogonal
    i.wave            _Iwave_1-5          (naturally coded; _Iwave_1 omitted)
    i.IDescola        _IIDescola_35018348-35924957(naturally coded; _IIDescola_35018348 omitted)
    i.IDturma         _IIDturma_269-3809  (naturally coded; _IIDturma_269 omitted)
    Favoring speed over space. To switch, type or click on mata: mata set matafavor space, perm.
    Warning: Two-step estimated covariance matrix of moments is singular.
      Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
      Difference-in-Sargan/Hansen statistics may be negative.
    
    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: IDaluno                         Number of obs      =      4056
    Time variable : wave                            Number of groups   =      1713
    Number of instruments = 755                     Obs per group: min =         1
    F(644, 31)    =  49235.39                                      avg =      2.37
    Prob > F      =     0.000                                      max =         4
                                          (Std. Err. adjusted for clustering on IDescola)
    -------------------------------------------------------------------------------------
                        |              Corrected
             profic_mat |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------------+----------------------------------------------------------------
             profic_mat |
                    L1. |   .3678468   .0810637     4.54   0.000     .2025164    .5331773
                        |
                    DiD |   160.1348   57.99518     2.76   0.010     41.85286    278.4168
                   time |          0  (omitted)
                treated |  -82.91613    87.8292    -0.94   0.352     -262.045     96.2127

    For the difference GMM, I have:

    Code:
    xi: xtabond2 L(0/1).profic_mat DiD time treated $controlvar i.wave i.IDescola i.IDturma, ///
    gmm(L1.profic_mat,lag(1 1)) ///
    gmmstyle($controlvar) ///
    iv(DiD time treated i.wave i.IDescola i.IDturma, equation(level)) ///
    cluster(IDescola) twostep small orthogonal noleveleq

    However, I am still unsure whether this specification is right, because the DiD coefficients from system GMM and difference GMM differ greatly. When I estimate the model with FE (without the lagged dependent variable), the result is also very different.

    Hence my question: is my specification of the DiD in this GMM setup correct?
    I am not sure whether the DiD works in exactly the same way here as in a linear model. I need help with the implementation of the DiD in this GMM setup and with the interpretation of its coefficient.

    I am thankful for any help and information.

  • #2
    It is still a linear model, so the usual DiD procedure should be applicable. That there are observed differences between difference and system GMM may not be specific to the DiD design. It could simply be that the extra assumptions for system GMM are violated.
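
    As a sketch of what "still a linear model" means here, the specification from #1 can be written as below. The notation is an assumption for illustration, not taken from the post: y is profic_mat, x collects the control variables together with the school and class dummies, lambda_t are the wave effects, and eta_i is the student-specific effect.

    ```latex
    \[
    y_{it} = \alpha\, y_{i,t-1}
           + \delta\,(\text{treated}_i \times \text{time}_t)
           + \gamma\,\text{treated}_i
           + x_{it}'\beta + \lambda_t + \eta_i + \varepsilon_{it}
    \]
    ```

    Under this reading, delta is the DiD coefficient, interpreted as the short-run effect conditional on the lagged score. The main effect of time does not appear separately because it is collinear with the wave dummies, which is consistent with time being reported as omitted in the output in #1.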

    Also notice that xtabond2 has severe bugs, in particular when used with forward-orthogonal deviations. Please see slide 81 (and the preceding slides) of my recent London Stata Conference presentation:
    Further discussion on dynamic panel model GMM estimation:
    https://www.kripfganz.de/stata/



    • #3
      First of all: Thanks for your reply.

      Could you please give me more information about these "extra assumptions"?

      To the best of my knowledge, the conventional tests for system GMM are 1) a test of instrument validity and 2) a test for second-order serial correlation.
      Are you saying that there are pre-estimation tests that are relevant only for system GMM (e.g. normality, heteroskedasticity, panel unit root, or panel cointegration tests)?
      As far as I can remember, pre-tests are generally not necessary if the dynamic panel GMM has large N and small T (my case). According to Roodman (2009), with very small T, unit root testing would not typically be applied.


      Possibly the problem arises because I am not specifying the fixed effects (i.wave i.IDescola i.IDturma) correctly in the routine. Could you please give me some suggestions?



      • #4
        The extra assumptions are that the changes in the initial observations are uncorrelated with the unit-specific effects. A sufficient condition is joint mean stationarity of all variables. There is no separate stationarity test needed. You could test this assumption with a (difference-in-)Hansen test for the (additional) moment conditions.
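
        For reference, a sketch of the moment conditions involved, in standard textbook notation that is an assumption here (y is the dependent variable, eta_i the unit-specific effect, epsilon_it the idiosyncratic error). Difference GMM relies on the first line; system GMM adds the second line for the level equation, and that second line is the "extra assumption":

        ```latex
        \begin{align*}
        E\big[\, y_{i,t-s}\,\Delta\varepsilon_{it} \,\big] &= 0, \quad s \ge 2
          && \text{(difference equation)} \\
        E\big[\, \Delta y_{i,t-1}\,(\eta_i + \varepsilon_{it}) \,\big] &= 0
          && \text{(additional, level equation)}
        \end{align*}
        ```

        The difference-in-Hansen statistic compares the overidentification statistics with and without the additional conditions. Under the null that they are valid, it is asymptotically chi-squared with degrees of freedom equal to the number of additional moment conditions:

        ```latex
        \[
        D = J_{\text{system}} - J_{\text{difference}} \;\sim\; \chi^2(q),
        \qquad q = \text{number of additional moment conditions}
        \]
        ```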

        Using dummy variables with the factor variable notation can also lead to incorrect results due to another bug in xtabond2 that does not compute the degrees of freedom for the overidentification tests correctly in this case.
        https://www.kripfganz.de/stata/
