  • Xtabond2 Estimation: Hansen Test Significant Even If All Are Endogenous

    Hello Statalist!

    I have been trying to use xtabond2 to run a sensitivity analysis for my original FE/RE hybrid (Allison 2009) estimations, and I include my time-invariant variables in the model.

    My data have three waves with gaps (an unbalanced panel), and N is approximately 2500. All the variables below are statistically significant in the FE/RE estimations. I have read Roodman (2009) and Kiviet (2019) to come up with a good specification, especially regarding how to treat these variables: as endogenous, exogenous, or predetermined.

    Kiviet (2019) provides a 10-step sequential specification procedure, and I wanted to follow it. As a first step, I specified all my variables as endogenous.

    The code is as follows (I changed the original variable names to make it easy to read):

    Code:
     xtabond2 DV L.DV X1 X2 X3 X4 X5 X6 X7 wave2 wave3, gmm(DV, lag(2 .)) gmm(X1, lag(2 .)) gmm(X2, lag(2 .)) gmm(X3, lag(2 .)) gmm(X4, lag(2 .)) gmm(X5, lag(2 .)) gmm(X6, lag(2 .)) gmm(X7, lag(2 .)) iv(wave2 wave3, eq(level)) twostep robust small orthogonal
    Yet the Stata output shows that the overall Hansen test, as well as some of the incremental (difference-in-Hansen) tests below, is statistically significant:

    Image 1: The Estimation Output

    Image 2: Difference-in-Hansen Tests of Exogeneity of Instrument Subsets

    Given that I have no further lags available, is it still possible to estimate a dynamic GMM model? Is my specification wrong in some way?

    I hope this post is in line with the general rules, and thank you very much for your help!

  • #2
    Sebastian Kripfganz, I would especially like to hear from you!


    • #3
      An addition:

      How should I specify my code? Like in the first post:

      Code:
       
        xtabond2 DV L.DV X1 X2 X3 X4 X5 X6 X7 wave2 wave3, gmm(DV, lag(2 .)) gmm(X1, lag(2 .)) gmm(X2, lag(2 .)) gmm(X3, lag(2 .)) gmm(X4, lag(2 .)) gmm(X5, lag(2 .)) gmm(X6, lag(2 .)) gmm(X7, lag(2 .)) iv(wave2 wave3, eq(level)) twostep robust small orthogonal
      Or, as follows:

      Code:
       
       xtabond2 DV L.DV X1 L.X1 X2 L.X2 X3 L.X3 X4 L.X4 X5 L.X5 X6 L.X6 X7 L.X7 wave2 wave3, gmm(DV, lag(2 .)) gmm(X1, lag(2 .)) gmm(X2, lag(2 .)) gmm(X3, lag(2 .)) gmm(X4, lag(2 .)) gmm(X5, lag(2 .)) gmm(X6, lag(2 .)) gmm(X7, lag(2 .)) iv(wave2 wave3, eq(level)) twostep robust small orthogonal


      • #4
        Since you only have 3 waves, there is only 1 instrument available per endogenous variable for the first-differenced (or forward-orthogonally transformed) model and 1 more instrument per variable for the level model. Without the extra instruments for the level model, your econometric model would be just identified. That implies that the overall Hansen overidentification test can be interpreted as a test for the validity of the extra instruments for the level model, assuming that all of the assumptions needed for the difference-GMM estimator are satisfied (in particular: no serial error correlation). Your test result therefore suggests that the additionally required mean stationarity assumption for the level model may not be satisfied. The logical consequence would be to only use the instruments from the endogenous variables for the first-differenced model, i.e. to add the suboption eq(diff) to the gmm() options.
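As a rough sketch of that suggestion, the command from post #1 could be rewritten with the eq(diff) suboption inside gmm(). The variable names are the placeholders used in the thread. Two things here are assumptions made for brevity rather than part of the advice above: all endogenous variables are collected in a single gmm() option, and the orthogonal option is left out so that the first-difference transformation is used.

```stata
* Sketch only; run from a do-file (/// continues a command on the next line).
* Lagged levels (lag 2 and deeper) instrument the transformed equation only,
* because eq(diff) switches off the extra instruments for the level equation.
xtabond2 DV L.DV X1 X2 X3 X4 X5 X6 X7 wave2 wave3, ///
    gmm(DV X1 X2 X3 X4 X5 X6 X7, lag(2 .) eq(diff)) ///
    iv(wave2 wave3, eq(level)) ///
    twostep robust small
```

The output header reports the number of instruments; comparing it with the number of estimated coefficients shows how many overidentifying restrictions are left for the Hansen test.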

        Furthermore, there is a severe bug in xtabond2 that produces incorrect estimates with the orthogonal option. See slides 70 and 71 of my 2019 London Stata Conference presentation. I recommend using the xtdpdgmm command instead.
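A minimal sketch of what a comparable xtdpdgmm call might look like is given below, assuming the community-contributed xtdpdgmm package is installed and the panel has been declared with xtset. It uses the first-difference transformation, model(diff), with the same placeholder variable names. The exact option choices are assumptions for illustration and should be checked against the xtdpdgmm help file.

```stata
* Sketch only; assumes xtdpdgmm is installed and the data are xtset.
* model(diff): estimate the first-differenced model by default.
* gmm(..., lag(2 .)): lagged levels from lag 2 onward as instruments
*   for the variables treated as endogenous.
* iv(..., model(level)): wave dummies as instruments for the level model.
xtdpdgmm DV L.DV X1 X2 X3 X4 X5 X6 X7 wave2 wave3, model(diff) ///
    gmm(DV X1 X2 X3 X4 X5 X6 X7, lag(2 .)) ///
    iv(wave2 wave3, model(level)) ///
    twostep vce(robust)

* Hansen overidentification test after estimation
estat overid
```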
        Given the above comments, if you only use instruments for the transformed model, then there are no longer enough instruments available to estimate a model that also includes all the lags of the independent variables. While it may generally be desirable to allow for richer dynamics by including them, you would generally need more waves or need to make stronger (in parts untestable) assumptions, such as treating those regressors as predetermined or strictly exogenous.
        https://twitter.com/Kripfganz


        • #5
          Thank you very much Sebastian for this great comment! I really appreciate it.


          • #6
            One quick question: after I reorganized the model by adding the suboption eq(diff) to the gmm() options, all the Hansen test p-values are now above 0.25, but my lagged dependent variables lose their significance in all three of my models, and the signs of the coefficients on my key independent variables change from positive to negative.

            My questions are:
            (a) Should I worry about the insignificance of the LDVs?
            (b) Should I report both the level and the difference models, even though the former's assumptions are violated?
            (c) Should I use the eq(diff) suboption in my iv() specifications as well as in the gmm() options?

            Thanks a lot!


            • #7
              (a) This might indicate that a dynamic model is not necessary at all, in which case you might be better off simply estimating a static model.
              (b) That depends on the best practice in your research field. You can report the system-GMM estimates if you want to highlight that the level instruments are not valid.
              (c) You are using the iv() option only for the wave/time dummies. They could be specified either for the level model or the transformed model. This is mainly up to you. Asymptotically, both choices are equivalent. (In finite samples, estimates may differ.)
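To illustrate point (c), the two placements of the wave dummies could be written as follows in xtabond2. This is a sketch using the placeholder variable names from the thread; collecting the endogenous variables in a single gmm() option and omitting the orthogonal option are assumptions made here to keep the commands short.

```stata
* Variant A: wave dummies as instruments for the level equation
xtabond2 DV L.DV X1 X2 X3 X4 X5 X6 X7 wave2 wave3, ///
    gmm(DV X1 X2 X3 X4 X5 X6 X7, lag(2 .) eq(diff)) ///
    iv(wave2 wave3, eq(level)) ///
    twostep robust small

* Variant B: wave dummies as instruments for the transformed equation
xtabond2 DV L.DV X1 X2 X3 X4 X5 X6 X7 wave2 wave3, ///
    gmm(DV X1 X2 X3 X4 X5 X6 X7, lag(2 .) eq(diff)) ///
    iv(wave2 wave3, eq(diff)) ///
    twostep robust small
```

Running both and comparing the coefficient tables is one way to see how much the finite-sample estimates differ between the two choices.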
              https://twitter.com/Kripfganz


              • #8
                Thank you very much, again! These are extremely helpful for me!
