
  • SGMM - problem with Sargan/Hansen tests

    I have a question regarding SGMM. I ran the regression 'xtabond2 gdpg lgdpg d pop educ loginv, gmm(L2.(lgdpg d pop educ loginv), collapse) twostep robust nodiffsargan orthogonal' in Stata to examine the relationship between public debt and economic growth (accounting for population, investment, and education). However, I have 111 instruments, and the Sargan/Hansen tests suggest the model is weakened by many instruments. I am unsure why this is and how to resolve the issue.

  • #2
    It depends on the number of observations. If you have, for instance, 200 observations, 111 instruments are surely too many. In this case it is quite likely that the Sargan test is extremely significant.

    As an aside, why did you exclude the difference-in-Sargan test? That test is helpful for checking the validity of the additional instruments implied by the SGMM estimator. I would look at that test too in order to evaluate whether the estimation is reasonable.

    The other point is that you used all the lags from 2 onward as instruments. That, I suspect, is the reason for the proliferation of instruments.
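
    A minimal sketch of both suggestions, assuming the variable names from the original command: the lag range is capped with the lag() suboption, and nodiffsargan is dropped so that the difference tests are reported. The range 2 to 4 is an arbitrary illustration, not a recommendation.

    ```stata
    * Sketch only: lag(2 4) is an illustrative cap, not a recommended choice.
    * nodiffsargan is omitted so the difference-in-Hansen tests are printed.
    xtabond2 gdpg lgdpg d pop educ loginv, ///
        gmm(lgdpg d pop educ loginv, lag(2 4) collapse) ///
        twostep robust orthogonal
    ```

    The "Number of instruments" line in the output header shows whether the count has come down relative to the number of groups.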
    Last edited by Dario Maimone Ansaldo Patti; 22 Dec 2020, 12:06.



    • #3
      In addition to Dario's helpful comments, given that you are using the orthogonal option, I recommend having a look at the following topic, including the follow-up discussion linked therein:
      https://www.statalist.org/forums/for...d-xtdpdsys-gmm

      You might also find the following presentation slides useful:
      https://www.kripfganz.de/stata/



      • #4
        Thank you Dario and Sebastian for your help!

        I have amended the code to 'xtabond2 gdpg lgdpg d pop educ loginv, gmm(lgdpg d pop educ loginv, lag(1 2) collapse) robust orthogonal twostep' in order to include only the first two lags as instruments and to show the difference-in-Sargan test. I get the following results:

        Sargan test of overid. restrictions: chi2(10) = 495.29 Prob > chi2 = 0.000
        (Not robust, but not weakened by many instruments.)
        Hansen test of overid. restrictions: chi2(10) = 27.08 Prob > chi2 = 0.003
        (Robust, but weakened by many instruments.)

        Difference-in-Hansen tests of exogeneity of instrument subsets:
        GMM instruments for levels
        Hansen test excluding group: chi2(5) = 24.54 Prob > chi2 = 0.000
        Difference (null H = exogenous): chi2(5) = 2.54 Prob > chi2 = 0.771

        However, when I do 'xtabond2 gdpg lgdpg d pop educ loginv, gmm(lgdpg d pop educ loginv, lag(1 5) collapse) robust orthogonal twostep' I get:

        Sargan test of overid. restrictions: chi2(25) = 522.80 Prob > chi2 = 0.000
        (Not robust, but not weakened by many instruments.)
        Hansen test of overid. restrictions: chi2(25) = 32.61 Prob > chi2 = 0.141
        (Robust, but weakened by many instruments.)

        Difference-in-Hansen tests of exogeneity of instrument subsets:
        GMM instruments for levels
        Hansen test excluding group: chi2(20) = 30.18 Prob > chi2 = 0.067
        Difference (null H = exogenous): chi2(5) = 2.44 Prob > chi2 = 0.786

        Does this mean I should include up to 5 lags, since the Hansen test is now passed? I think this is because I previously had too few instruments, which weakens the Hansen test (as Roodman (2009) notes).

        Furthermore, why does the Sargan test p-value remain 0.000?
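
        As a rough check on the degrees of freedom in the two outputs above, the instrument counts can be reproduced by hand. This sketch assumes that, with collapse, each of the five variables contributes one instrument per lag to the transformed equation and one to the levels equation, plus the constant, and that six coefficients (five regressors and _cons) are estimated. chi2tail() is Stata's upper-tail chi-squared function.

        ```stata
        * lag(1 2): 5 variables x 2 lags + 5 levels instruments + constant
        display 5*2 + 5 + 1
        * overidentifying restrictions = instruments - 6 coefficients
        display 5*2 + 5 + 1 - 6

        * lag(1 5): 5 variables x 5 lags + 5 levels instruments + constant
        display 5*5 + 5 + 1
        display 5*5 + 5 + 1 - 6

        * Hansen p-values implied by the reported statistics
        display chi2tail(10, 27.08)
        display chi2tail(25, 32.61)
        ```

        The counts come to 16 and 31 instruments, with 10 and 25 overidentifying restrictions, matching chi2(10) and chi2(25). The Hansen statistic rises only from 27.08 to 32.61 while its degrees of freedom rise from 10 to 25, which is what moves the p-value.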



        • #5
          Could you please post the full results you have? Please copy the table and put it under



          • #6
            Thank you very much for your continued help, here are the full results from the two regressions:

            xtabond2 gdpg lgdpg d pop educ loginv, gmm(lgdpg d pop educ loginv, lag(1 2) collapse) robust orthogonal twostep
            Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.

            Dynamic panel-data estimation, two-step system GMM
            ------------------------------------------------------------------------------
            Group variable: c_id                            Number of obs      =       621
            Time variable : year                            Number of groups   =        34
            Number of instruments = 16                      Obs per group: min =         3
            Wald chi2(5)  =    126.40                                      avg =     18.26
            Prob > chi2   =     0.000                                      max =        23
            ------------------------------------------------------------------------------
                         |              Corrected
                    gdpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                   lgdpg |   .0802029   .0784967     1.02   0.307    -.0736478    .2340536
                       d |  -.0695442   .0194186    -3.58   0.000     -.107604   -.0314843
                     pop |  -1.922709   1.769285    -1.09   0.277    -5.390443    1.545025
                    educ |    -.10881   .0920604    -1.18   0.237    -.2892451    .0716251
                  loginv |  -.6351864   1.852578    -0.34   0.732    -4.266173    2.995801
                   _cons |   18.10596   11.38808     1.59   0.112    -4.214269    40.42618
            ------------------------------------------------------------------------------
            Instruments for orthogonal deviations equation
              GMM-type (missing=0, separate instruments for each period unless collapsed)
                L(1/2).(lgdpg d pop educ loginv) collapsed
            Instruments for levels equation
              Standard
                _cons
              GMM-type (missing=0, separate instruments for each period unless collapsed)
                D.(lgdpg d pop educ loginv) collapsed
            ------------------------------------------------------------------------------
            Arellano-Bond test for AR(1) in first differences: z =  -4.16  Pr > z =  0.000
            Arellano-Bond test for AR(2) in first differences: z =  -1.19  Pr > z =  0.234
            ------------------------------------------------------------------------------
            Sargan test of overid. restrictions: chi2(10)   = 495.29  Prob > chi2 =  0.000
              (Not robust, but not weakened by many instruments.)
            Hansen test of overid. restrictions: chi2(10)   =  27.08  Prob > chi2 =  0.003
              (Robust, but weakened by many instruments.)

            Difference-in-Hansen tests of exogeneity of instrument subsets:
              GMM instruments for levels
                Hansen test excluding group:     chi2(5)    =  24.54  Prob > chi2 =  0.000
                Difference (null H = exogenous): chi2(5)    =   2.54  Prob > chi2 =  0.771


            . xtabond2 gdpg lgdpg d pop educ loginv, gmm(lgdpg d pop educ loginv, lag(1 5) collapse) robust orthogonal twostep
            Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.

            Dynamic panel-data estimation, two-step system GMM
            ------------------------------------------------------------------------------
            Group variable: c_id                            Number of obs      =       621
            Time variable : year                            Number of groups   =        34
            Number of instruments = 31                      Obs per group: min =         3
            Wald chi2(5)  =    339.12                                      avg =     18.26
            Prob > chi2   =     0.000                                      max =        23
            ------------------------------------------------------------------------------
                         |              Corrected
                    gdpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                   lgdpg |   .1175239   .0511877     2.30   0.022     .0171979    .2178499
                       d |  -.0558115   .0132975    -4.20   0.000    -.0818741   -.0297489
                     pop |  -1.596829   1.096073    -1.46   0.145    -3.745094    .5514353
                    educ |   -.066075   .0548352    -1.20   0.228      -.17355       .0414
                  loginv |   .3404805   1.443238     0.24   0.813    -2.488215    3.169176
                   _cons |   10.00505   8.009784     1.25   0.212    -5.693837    25.70394
            ------------------------------------------------------------------------------
            Instruments for orthogonal deviations equation
              GMM-type (missing=0, separate instruments for each period unless collapsed)
                L(1/5).(lgdpg d pop educ loginv) collapsed
            Instruments for levels equation
              Standard
                _cons
              GMM-type (missing=0, separate instruments for each period unless collapsed)
                D.(lgdpg d pop educ loginv) collapsed
            ------------------------------------------------------------------------------
            Arellano-Bond test for AR(1) in first differences: z =  -4.27  Pr > z =  0.000
            Arellano-Bond test for AR(2) in first differences: z =  -1.07  Pr > z =  0.283
            ------------------------------------------------------------------------------
            Sargan test of overid. restrictions: chi2(25)   = 522.80  Prob > chi2 =  0.000
              (Not robust, but not weakened by many instruments.)
            Hansen test of overid. restrictions: chi2(25)   =  32.61  Prob > chi2 =  0.141
              (Robust, but weakened by many instruments.)

            Difference-in-Hansen tests of exogeneity of instrument subsets:
              GMM instruments for levels
                Hansen test excluding group:     chi2(20)   =  30.18  Prob > chi2 =  0.067
                Difference (null H = exogenous): chi2(5)    =   2.44  Prob > chi2 =  0.786


            .



            • #7
              Please use
              Code:
              your results
              This way your results can be read much more easily.

              From what I see in your post, you are estimating a growth equation. If I am correct, it is rather strange that no variable is significant apart from d. By the way, what does d stand for?



              • #8

                Apologies, d stands for public debt as I am investigating the effects of public debt on economic growth in 35 OECD countries between 1995 and 2019.

                Furthermore, I am having issues with that code: it allows me to paste the results clearly, but as soon as I post, the layout reverts to the one above.
                Last edited by Eliot Wilson; 23 Dec 2020, 17:55.



                • #9
                  Ok. Never mind. A couple of things:

                  1) Why did you not include time dummies in your model? They could capture period-specific shocks that affect the economies in your sample. When using time dummies, remember to include them in the iv( ) option.

                  2) In your estimation you assume that ALL the variables are endogenous and that ALL of them should be used as instruments. Recall that you can use only a few variables as instruments; it is not necessary to use all of them.

                  3) The second estimation is more convincing in terms of the diagnostic tests. Regarding the point estimates, it is somewhat surprising that human capital (educ) contributes negatively (although not significantly) to economic growth. In addition, investment is not significant, nor is population (although it displays the correct sign). It seems that none of the standard growth regressors contributes to economic development in your estimation. In my view, this is surprising and concerning.

                  4) Have you tried standard approaches first, such as pooled OLS and/or the F.E. panel estimator? I know that they can produce biased estimates (due to the presence of the lagged dependent variable and the potential endogeneity of some or all regressors), but they could offer you an initial view of your model. I would start with them and then move to more sophisticated models.
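
                  Points 1, 2 and 4 can be sketched as follows, using the variable names from the thread. The dummy names yr1, yr2, ... are hypothetical, and the split between endogenous variables (lgdpg and d, in gmm()) and exogenous controls (pop, educ and loginv, in iv()) is purely an illustrative assumption, not a claim about the right specification.

                  ```stata
                  * Hypothetical do-file sketch; adapt names and assumptions to your data.
                  xtset c_id year

                  * (4) Benchmarks: pooled OLS and fixed effects, clustered by country
                  regress gdpg lgdpg d pop educ loginv, vce(cluster c_id)
                  xtreg gdpg lgdpg d pop educ loginv, fe vce(cluster c_id)

                  * (1) Time dummies, created as yr1, yr2, ... (hypothetical names)
                  tabulate year, generate(yr)

                  * (1)+(2) Time dummies enter as regressors and as standard instruments;
                  * only lgdpg and d are treated as endogenous here (an assumption)
                  xtabond2 gdpg lgdpg d pop educ loginv yr*, ///
                      gmm(lgdpg d, lag(1 2) collapse)        ///
                      iv(pop educ loginv)                    ///
                      iv(yr*, equation(level))               ///
                      robust orthogonal twostep
                  ```

                  A commonly cited rule of thumb is that pooled OLS tends to bias the coefficient on the lagged dependent variable upward and F.E. tends to bias it downward, so a GMM estimate lying between the two benchmarks is reassuring.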



                  • #10
                    Thanks Dario, these comments have been really helpful.

                    Adding the time dummies made all my variables (explanatory and control) insignificant, so I chose not to use them.

                    Adding the control variables (education, population and investment) to iv() has significantly improved my model, and now investment and population are significant. The Wald and AB tests suggest no problems, and the Hansen test p-value is now just above 0.025 but below 0.05, which I believe is a slight improvement. Education remains negative but insignificant, although I think this may be because the OECD database only provides data on secondary education and not primary, so the data isn't as reliable?

                    Furthermore, when I add additional controls such as inflation and trade openness, investment becomes insignificant. I wondered whether this could be down to slight overspecification, or perhaps another reason?
