  • XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions

    Dear Statalisters,

    I have made a new estimation command available for installation from my website:
    Code:
    . net install xtdpdgmm, from(http://www.kripfganz.de/stata/)
    xtdpdgmm estimates linear (dynamic) panel data models with the generalized method of moments (GMM). The main value added of the new command is that it allows combining the traditional linear moment conditions with the nonlinear moment conditions suggested by Ahn and Schmidt (1995) under the assumption of serially uncorrelated idiosyncratic errors. These additional nonlinear moment conditions can yield potentially sizeable efficiency gains, and they also improve the finite-sample performance. Given that the absence of serial correlation is usually a prerequisite for other GMM estimators in the presence of a lagged dependent variable as well, the gains from the nonlinear moment conditions essentially come for free.

    The extra moment conditions can help to overcome a weak instruments problem of the Arellano and Bond (1991) difference-GMM estimator when the autoregressive coefficient approaches unity. Furthermore, the Ahn and Schmidt (1995) estimator is also robust to deviations from mean stationarity, a situation that would invalidate the Blundell and Bond (1998) system-GMM approach.

    Without these nonlinear moment conditions, xtdpdgmm replicates the results obtained with the familiar commands xtabond, xtdpd, xtdpdsys, and xtabond2, as well as my other recent command xtseqreg. Collapsing of GMM-type instruments and different initial weighting matrices are supported. The key option of xtdpdgmm that adds the nonlinear moment conditions is called noserial. For example:
    Code:
    . webuse abdata
    
    . xtdpdgmm L(0/1).n w k, noserial gmmiv(L.n, collapse model(difference)) iv(w k, difference model(difference)) twostep vce(robust)
    
    Generalized method of moments estimation
    
    Step 1
    initial:       f(p) =  6.9508498
    alternative:   f(p) =   1.917675
    rescale:       f(p) =  .07590133
    Iteration 0:   f(p) =  .07590133  
    Iteration 1:   f(p) =    .003352  
    Iteration 2:   f(p) =  .00274414  
    Iteration 3:   f(p) =  .00274388  
    Iteration 4:   f(p) =  .00274388  
    
    Step 2
    Iteration 0:   f(p) =  .26774896  
    Iteration 1:   f(p) =  .20397319  
    Iteration 2:   f(p) =   .2011295  
    Iteration 3:   f(p) =  .20109259  
    Iteration 4:   f(p) =  .20109124  
    Iteration 5:   f(p) =   .2010912  
    
    Group variable: id                           Number of obs         =       891
    Time variable: year                          Number of groups      =       140
    
    Moment conditions:     linear =      10      Obs per group:    min =         6
                        nonlinear =       6                        avg =  6.364286
                            total =      16                        max =         8
    
                                         (Std. Err. adjusted for clustering on id)
    ------------------------------------------------------------------------------
                 |              WC-Robust
               n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               n |
             L1. |    .657292   .1381388     4.76   0.000     .3865449    .9280391
                 |
               w |  -.7248798   .0996565    -7.27   0.000    -.9202029   -.5295568
               k |   .2399022   .0737048     3.25   0.001     .0954435    .3843609
           _cons |   2.719216   .4015915     6.77   0.000     1.932111    3.506321
    ------------------------------------------------------------------------------
    The Gauss-Newton technique is used to minimize the GMM criterion function. With vce(robust), the Windmeijer (2005) finite-sample standard error correction is computed for estimators with and without nonlinear moment conditions.

    For details about the syntax, the available options, and the supported postestimation commands, please see the help files:
    Code:
    . help xtdpdgmm
    . help xtdpdgmm postestimation
    Available postestimation commands include the Arellano-Bond test for the absence of serial correlation in the first-differenced errors, estat serial, and the familiar Hansen J-test of the overidentifying restrictions, estat overid. The results of the Arellano-Bond test differ slightly from xtdpd and xtabond2 for two-step robust estimators because I account for the finite-sample Windmeijer (2005) correction when computing the test statistic, while the existing commands do not. estat overid can also be used to perform difference-in-Hansen tests, but it requires that the two models are estimated separately. In that regard, the results differ from the difference-in-Hansen test statistics reported by xtabond2; see footnote 24 in Roodman (2009) for an explanation. An alternative to difference-in-Hansen tests is a generalized Hausman test, implemented in estat hausman for use after xtdpdgmm.
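
    As an illustration, continuing from the abdata example above, the two specification tests can be run right after estimation (output omitted):
    Code:
    . estat serial
    . estat overid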

    Finally, the results with and without nonlinear moment conditions can in principle also be obtained with Stata's official gmm command. However, it is anything but straightforward to do so. While the official gmm command offers a lot of extra flexibility, it does not provide a tailored solution for this particular estimation problem. While xtdpdgmm easily handles unbalanced panel data, gmm tends to have problems in that case. In addition, gmm tends to be very slow, in particular with large data sets. I did not run a sophisticated benchmark comparison, but for a single estimation on a data set with 40,000 observations, it took me 43 minutes (!) to obtain the results with gmm, while xtdpdgmm returned identical results after just 4 seconds!

    I hope you enjoy the new command. As always, comments and suggestions are highly welcome, and an appropriate reference would be very much appreciated if my command proves to be helpful for your own research.

    References:
    • Ahn, S. C., and P. Schmidt (1995). Efficient estimation of models for dynamic panel data. Journal of Econometrics 68: 5-27.
    • Arellano, M., and S. R. Bond (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies 58: 277-297.
    • Blundell, R., and S. R. Bond (1998). Initial conditions and moment restrictions in dynamic panel data models. Journal of Econometrics 87: 115-143.
    • Roodman, D. (2009). How to do xtabond2: An introduction to difference and system GMM in Stata. Stata Journal 9: 86-136.
    • Windmeijer, F. (2005). A finite sample correction for the variance of linear efficient two-step GMM estimators. Journal of Econometrics 126: 25-51.
    Last edited by Sebastian Kripfganz; 01 Jun 2017, 07:15.
    https://twitter.com/Kripfganz

  • #2
    Thanks to Kit Baum, the latest version 1.0.1 of the xtdpdgmm command is now also available for installation from SSC:
    Code:
    ssc install xtdpdgmm
    The package on SSC will be updated less frequently but it can be an alternative if the installation from my own website fails (which could happen due to institutional firewall restrictions etc.).


    • #3
      Hi Sebastian,
      Kindly help me out: could you briefly explain how to run a GMM estimation? I am a beginner; my research is to be estimated with GMM, and I do not know the commands, let alone how to start estimating. In addition, my variables of interest are dichotomous and continuous. I am using GMM because I have two DVs that lead to endogeneity. Thank you



      • #4
        My command xtdpdgmm is not suitable for use with two dependent variables. You might have to use the official gmm command. See help gmm for details about the syntax and some examples.


        • #5
          Dear Sebastian,

          I had a look at the routine you wrote. As far as I understand, it should be able to replicate the estimations done with other routines available for dynamic panel data. I have the following routine:

          Code:
          xi:xtabond2 wlnyw l.wlnyw pc pc2 i.time, gmm(wlnyw pc pc2 lns lnnda2, lag(2 4)) twos cluster(id) nol
          I am trying to replicate it using the following code:

          Code:
          xi:xtdpdgmm wlnyw l.wlnyw pc pc2 i.time, gmmiv(wlnyw pc pc2 lns lnnda2, m(d) lag(2 4)) twos vce(robust)
          Stata runs 16,000 iterations and finally declares that convergence is not achieved. Clearly, the results are totally different.

          Possibly, I am not using correctly the routine. Could you please give me some suggestions?

          Dario
          Last edited by Dario Maimone Ansaldo Patti; 15 Oct 2017, 12:38.



          • #6
            Dear Dario,
            Yes, the xtdpdgmm command should be able to replicate results from xtabond2 (and others). In principle, your syntax is correct but there is a problem with the time dummies. xtdpdgmm currently does not detect the collinearity between the time dummies and thus tries to estimate a model with one time dummy too many. The GMM criterion function thus does not have a unique minimum and the iterative optimization algorithm never converges. I will need to improve the detection of collinear variables in a later version. Thanks for flagging this issue.

            In the meantime, the way around would be to generate new variables for your time dummies first, e.g.
            Code:
            xi i.time
            and then specify only those time dummies in the regression that would not have to be omitted.

            By the way, you should also specify the time dummies as standard instruments.


            • #7
              Dear Sebastian

              Thanks for your reply. So if I understood correctly, I should generate the time dummies separately and then add them, dropping one of them to avoid collinearity. Am I right?



              • #8
                Yes, that is correct. xi will already drop one dummy for the base year. You need to drop another one manually because the lagged dependent variable reduces the time horizon in the estimation sample by one.
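
                Putting the two steps together for the model from #5, a sketch could look as follows. The _Itime_* names are those xi generates by default, and which dummies to keep depends on your sample, so adjust the range to your data:
                Code:
                . xi i.time
                . xtdpdgmm wlnyw l.wlnyw pc pc2 _Itime_3-_Itime_7, gmmiv(wlnyw pc pc2 lns lnnda2, m(d) lag(2 4)) iv(_Itime_3-_Itime_7) twos vce(robust)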


                • #9
                  Hi Sebastian,

                  thanks for your help. I tried the way you suggested and it worked fine.

                  Dario



                  • #10
                    Hi Sebastian,

                    the very last thing. I estimated my model using xtabond2:

                    Code:
                    xi:xtabond2 wlnyw l.wlnyw pc pc2 i.time, gmm(wlnyw pc pc2 lns lnnda2, lag(2 4)) twos r nol
                    Then I estimated the same model using the BB estimator:

                    Code:
                    xi: xtabond2 wlnyw l.wlnyw pc pc2  i.time, gmm(wlnyw pc pc2 lns lnnda2, lag(2 4) eq(diff)) twos r gmm(wlnyw lns, lag (3 3) eq(lev) collapse) nocon
                    I replicate the commands above using:

                    Code:
                    xtdpdgmm wlnyw l.wlnyw pc pc2 time3-time7, gmmiv(wlnyw pc pc2 lns lnnda2, m(d) lag(2 4)) twos vce(robust)
                    and

                    Code:
                    xtdpdgmm wlnyw l.wlnyw pc pc2 time3-time7, gmmiv(wlnyw pc pc2 lns lnnda2, m(d) lag(2 4)) twos vce(robust) gmmiv(wlnyw lns, lag (3 3) m(l) collapse)
                    The autoregressive parameter appears to be different. The estimates for pc and pc2 also differ. Interestingly, I obtain a similar autoregressive parameter if, in xtabond2, I use the passthru option when specifying the instruments in levels, i.e. gmm(wlnyw lns, lag(3 3) eq(lev) passthru). Nonetheless, the point estimates of the other parameters remain different. The Hansen test is again similar, but the difference-in-Hansen test is totally different. Specifically, with xtdpdgmm it turns out to be negative.

                    Any reason for that?

                    Thanks again,

                    Dario
                    Last edited by Dario Maimone Ansaldo Patti; 16 Oct 2017, 08:20.



                    • #11
                      xtabond2 automatically first-differences the instruments for the level equation. xtdpdgmm does not do this automatically. You need to specify the suboption difference explicitly, i.e. gmmiv(wlnyw lns, lag(3 3) m(l) collapse difference). I have decided to implement it this way because it is closer to the WYTIWYG concept ("what you type is what you get").
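
                      Applied to the level-equation command from #10, the corrected call would be:
                      Code:
                      . xtdpdgmm wlnyw l.wlnyw pc pc2 time3-time7, gmmiv(wlnyw pc pc2 lns lnnda2, m(d) lag(2 4)) gmmiv(wlnyw lns, lag(3 3) m(l) collapse difference) twos vce(robust)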

                      Yes, the Difference-in-Hansen tests differ. They are computed in a different way. Please see my comment in the opening post #1:
                      Originally posted by Sebastian Kripfganz:
                      estat overid can also be used to perform difference-in-Hansen tests but it requires that the two models are estimated separately. In that regard, the results differ from the difference-in-Hansen test statistics reported by xtabond2; see footnote 24 in Roodman (2009) for an explanation.
                      xtabond2 computes the difference-in-Hansen test without re-estimating the smaller model. This has the advantage that the resulting test statistic is guaranteed to be non-negative, but it is not the actual difference of the Hansen test statistics (in finite samples). xtdpdgmm computes this test statistic by calculating the actual difference between the two estimated models. In finite samples, this statistic can become negative (the same way the traditional Hausman test statistic can become negative in finite samples). Asymptotically, the two procedures are equivalent.
                      A negative test statistic might indicate that the excluded instruments account for a substantial fraction of the variation (without saying anything about their validity).


                      • #12
                        Thanks Sebastian. Very helpful explanation.



                        • #13
                          Hello,

                          May I ask what to do if the number of instruments reported is very large (107, in my case) while I have only 15 countries in my panel and 28 years?
                          Is there a way around it?



                          • #14
                            The gmmiv() option of xtdpdgmm has the suboptions lagrange() and collapse. Both can be used to reduce the number of instruments. Please see the help file for details.

                            Your data set, with a very small cross-sectional dimension and a larger time dimension, is not ideal for this kind of GMM estimator, which is designed for small-T, large-N situations. If you nevertheless want to use it, I recommend not using the noserial option in this setting.
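
                            For example, using the abdata specification from the opening post, one could both restrict the lag range and collapse the instrument matrix to keep the instrument count small (the lag range shown is purely illustrative):
                            Code:
                            . webuse abdata
                            . xtdpdgmm L(0/1).n w k, gmmiv(L.n, lag(1 3) collapse model(difference)) iv(w k, difference model(difference)) twostep vce(robust)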


                            • #15
                              Dear Sebastian, I am presently using the xtdpdgmm command. My data set fits the small-T, large-N situation. I initially used the xtabond2 command, but I could only pass the Hansen J overidentification test with further lags, which weakens my instrument set. Presently I decided to use the following command:

                              Code:
                              xtdpdgmm roa l.roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare  size leverage rdsales cexp netsalesgrw  lage d2006-d2015 d3-d8 , noserial gmmiv(roa indp cduality bdiversity1 lbsize acindp acfexpr ncindp ccindp lshare  size leverage rdsales cexp netsalesgrw ,lag(3 4) collapse difference) iv(d2006-d2015 d3-d8  lage) twostep vce(robust)
                              My question is: do I have to run a different regression before I can use
                              Code:
                              estat overid
                              with this new command?

                              Regards,

                              Albert
