  • How do you correct for AR(2) in 2-Step Difference GMM?

    Dear all,
    Using xtabond2 in Stata 13, I have a dynamic panel model and I'm testing for crime persistence, with T = 13 and n = 134. My variables are: dependent variable: log(homicide); explanatory variables: lag of log(homicide), log(Gini), log(GDP per capita), unemployment rate, males aged 15-24, primary education, secondary education, rule of law, corruption, and death penalty. The lag of log(homicide) is endogenous and the other explanatory variables are weakly exogenous.

    Having read both Roodman's papers on GMM specifications and over-identifying instruments, I used this command:

    Code:
    xtabond2 lnhom l.lnhom lngini lngdppc m1524 xune_m rol corrupt dtpen pry_educ sec_educ yr3-yr13, gmm(l3.lnhom) iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) nodiffsargan noleveleq twostep robust orthogonal small

    ... and the results are: N = 1463, n = 134, instruments = 66, lags = 3, AR(1) = 0.009, AR(2) = 0.068, Hansen = 0.101.

    Attempts to classify some variables as endogenous produced worse results, with the Hansen test reaching the 'unacceptable' 1.000 mark, so I classified them as 'weakly exogenous' instead.

    Is it ok to accept this result and justify it by saying that: 'I cannot reject AR(2) at 5% significance level' or is there a way of correcting for AR(2)?

    I have attached the Stata output and will greatly appreciate all contributions.
    Ngozi
    Attached Files

  • #2
    Ngozi: You are misinterpreting the result from the AR(2) test. The null hypothesis is: "NO autocorrelation of order 2". Because the p-value is 0.068 > 0.05, you cannot reject the null hypothesis of no autocorrelation at the 5% significance level.
    https://www.kripfganz.de/stata/



    • #3
      Sebastian,

      thanks for your response, but I feel we're on the same page... saying the same thing in different ways, right? So, now that you have clarified this, is it OK to accept this result?



      • #4
        Since there is no evidence (at the 5% significance level) for second-order serial correlation of the error term, there is no need to correct for it and you could use lagged levels of the dependent variable from lag 2 onwards as instruments for the transformed equation. This would be gmm(L.lnhom) or gmm(lnhom, lag(2 .)), both of which are equivalent because xtabond2 by default lags the specified variable once for GMM-style instruments. Why did you choose gmm(L3.lnhom)?
        https://www.kripfganz.de/stata/
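
        For concreteness, a sketch of that equivalence, reusing the command and variable names from #1 (illustrative only, not a definitive re-run):

        ```stata
        * Sketch: this specification builds the same GMM-style instrument
        * set as gmm(lnhom, lag(2 .)), because xtabond2 by default lags
        * the specified instrument variable once.
        xtabond2 lnhom l.lnhom lngini lngdppc m1524 xune_m rol corrupt dtpen ///
            pry_educ sec_educ yr3-yr13, gmm(l.lnhom) ///
            iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) ///
            noleveleq twostep robust orthogonal small

        * Equivalent: replace gmm(l.lnhom) with gmm(lnhom, lag(2 .)).
        ```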



        • #5
          I disagree with Sebastian. I wouldn't take much comfort in a p-value of 0.068. This is saying that if there is no second-order serial correlation in differences--removing a threat to the validity of twice-lagged instruments--there's only a 6.8% chance you'd get an Arellano-Bond z statistic that large. So you are right to use deeper lags--that's how you work around the bad AR() test result. But you should also include an artests(3) option to check for longer serial correlation in the same way.
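
          A sketch of what that could look like, reusing the command from #1 (illustrative, with the instrument depth and test order as discussed above):

          ```stata
          * Sketch: instruments for the transformed equation from lag 3
          * onward, plus Arellano-Bond AR tests up to order 3 via artests(3).
          xtabond2 lnhom l.lnhom lngini lngdppc m1524 xune_m rol corrupt dtpen ///
              pry_educ sec_educ yr3-yr13, gmm(lnhom, lag(3 .)) ///
              iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) ///
              noleveleq twostep robust orthogonal small artests(3)
          ```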



          • #6
            I agree that the AR(2) result is anything but comfortable, but the "degree of comfort" is given by the confidence level which has to be chosen by the researcher. By using deeper lags there is a trade-off between the validity of the instruments (robustness to serial correlation) and the strength of the instruments (decreasing correlation with larger distance in time). There is no black and white here, and it might be good to report the results of different specifications in applied work.
            https://www.kripfganz.de/stata/



            • #7
              Hi Sebastian/David,

              Following your guidelines, I ran the test again using gmm(lnhom, lag(2 .)) and adding artests(4) to the earlier options. With 86 instruments, I got the following results:

              Code:
              Arellano-Bond test for AR(1) in first differences: z = -3.02  Pr > z = 0.002
              Arellano-Bond test for AR(2) in first differences: z =  2.06  Pr > z = 0.040
              Arellano-Bond test for AR(3) in first differences: z = -2.22  Pr > z = 0.027
              Arellano-Bond test for AR(4) in first differences: z =  1.37  Pr > z = 0.170
              ------------------------------------------------------------------------------
              Sargan test of overid. restrictions: chi2(65) = 214.77  Prob > chi2 = 0.000
                (Not robust, but not weakened by many instruments.)
              Hansen test of overid. restrictions: chi2(65) =  72.99  Prob > chi2 = 0.232
                (Robust, but weakened by many instruments.)

              This outcome looks acceptable: the point estimate on the lagged dependent variable lies in the expected range (between the FE and OLS point estimates), and the standard error seems reasonable. Is it okay to report just the AR(4) and Hansen statistics?

              Thanks a lot for helping!
              Ngozi
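
              On the bracketing check mentioned in #7: a common rough diagnostic (Bond, 2002) is that the GMM coefficient on the lagged dependent variable should lie between the pooled-OLS estimate (biased upward) and the within/FE estimate (biased downward). A minimal sketch, assuming the panel is already xtset and using a hypothetical panel identifier `id` plus a subset of the regressors from #1:

              ```stata
              * Upper benchmark: pooled OLS (coefficient on l.lnhom biased upward).
              regress lnhom l.lnhom lngini lngdppc xune_m, vce(cluster id)

              * Lower benchmark: fixed effects (biased downward for short T).
              xtreg lnhom l.lnhom lngini lngdppc xune_m, fe vce(cluster id)
              ```

              If the GMM point estimate falls outside this interval, that is usually a warning sign about the instruments rather than a pass.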



              • #8
                With these results, using gmm(lnhom, lag(2 .)) is indeed no longer valid because you cannot reject serial correlation up to the third order. I would thus follow David's advice and use only lags starting at depth 4, that is gmm(lnhom, lag(4 .)), or gmm(L3.lnhom) as you specified in your initial post. But maybe the "truth" lies in between. I would also suggest trying gmm(lnhom, lag(3 .)) and checking whether the p-values for the AR(3) and AR(4) tests are "comfortable" in this case.

                An alternative and probably more elegant solution than playing around with the instruments might be to add a further lag of the dependent variable (and maybe also of the exogenous regressors) to the estimation equation, which is the common way in the time-series literature to account for serial correlation.
                https://www.kripfganz.de/stata/
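
                A sketch of that second route (adding another lag of the dependent variable; same variable names as in #1, with the instrument depth here purely illustrative):

                ```stata
                * Sketch: l2.lnhom enters as a regressor so the remaining error
                * should be serially uncorrelated; instruments for the transformed
                * equation then start at a correspondingly deeper lag.
                xtabond2 lnhom l.lnhom l2.lnhom lngini lngdppc m1524 xune_m rol ///
                    corrupt dtpen pry_educ sec_educ yr3-yr13, gmm(lnhom, lag(3 .)) ///
                    iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) ///
                    noleveleq twostep robust orthogonal small artests(4)
                ```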



                • #9
                  Thanks Sebastian. I actually tried gmm(lnhom, lag(3 .)) and gmm(lnhom, lag(4 .)), but the results weren't any better: both the AR(2) and Hansen test p-values were below 0.10. I will, however, explore all the specifications suggested.

                  Once again, thanks for taking the time to teach. I appreciate it.



                  • #10
                    I have a similar issue. The AR(2) test is statistically significant (p = 0.000). In this case, do I have to add the second lag of the dependent variable as a regressor, or do I just specify gmm(y, lag(2 .))? When I use the second lag of the dependent variable as a regressor, its coefficient is statistically significant (p = 0.00), but AR(2) is not (p > 0.5). I am using system GMM. Any help will be greatly appreciated.
                    Last edited by Chandan Jha; 26 Jul 2015, 13:11.



                    • #11
                      Hi all,
                      Could you please help me with these issues:

                      (1) When I try deeper lags for GMM-style IVs, the AR(n) tests are passed (i.e., we do not reject the null of no serial correlation). At lag(3) the Hansen test p-value is > 0.05, so we do not reject the joint validity of the instruments; but at lag(4) the Hansen test p-value is < 0.05, so we reject it. The same happens at lag(5) (do not reject) and lag(6) (reject).
                      How can we decide which lags to choose in this case?

                      (2) What is the optimal lag length? As we increase the lag length, we weaken the Hansen test (too many instruments), as explained by Roodman (2009). Is using collapse, say gmmstyle(X Y, collapse), the best way? Could you explain this more clearly for me?



                      Thanks a lot in advance!
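
                      On (2), a minimal sketch of the collapse suboption (with hypothetical placeholder variables y and x): collapsing replaces the one-instrument-per-lag-and-period scheme with one instrument per lag distance, so the instrument count no longer grows with T, which helps keep the Hansen test from being weakened.

                      ```stata
                      * Sketch: collapsed GMM-style instruments for lags 2-4 of y.
                      * y and x are hypothetical placeholders, not variables from this thread.
                      xtabond2 y l.y x, gmm(y, lag(2 4) collapse) iv(x) twostep robust small
                      ```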



                      • #12
                        Dear all!
                        I'm Hoang Luong, PhD student at The University of Greenwich, London.
                        I have a problem with AR(2) too, but I don't really understand the correction process here. Do I have to use deeper lags for all the endogenous variables, including the control variables? In my case, if I use lags from depth 3 for my main variables only, the results are acceptable; but if I apply this to all variables, including the controls, I cannot obtain significant results (please see the attached file below).

                        My second question: if I have the squared term of an endogenous variable, should I also put it in gmmstyle(), or should I treat it as exogenous, on the grounds that instrumenting the endogenous variable itself already accounts for its correlation with the error term, so the squared term needs no separate treatment?
                        I'm looking forward to any suggestion or guide here.
                        Thank you in advance for reading.

                        Best regards!
                        Hoang Luong

                        Attached Files



                        • #13
                          I have run system GMM and the AR(2) test is reported as a dot (missing). Is this a problem, and if so, how do I solve it?



                          • #14
                            Dear all
                            Code:
                            Group variable: id_e                 Number of obs      =       185
                            Time variable : year                 Number of groups   =        37
                            Number of instruments = 99           Obs per group: min =         1
                            Wald chi2(6) = 80062.48                             avg =      5.00
                            Prob > chi2  = 0.000                                max =         7

                            Arellano-Bond test for AR(1) in first differences: z =  0.77  Pr > z = 0.440
                            Arellano-Bond test for AR(2) in first differences: z = -1.14  Pr > z = 0.256

                            Can someone tell me how to reduce AR(1)?




                            • #15
                              Dear Sebastian,

                              I have a similar problem. In my model, the AR(2) p-value is 0.0685, while AR(3) and AR(4) give good results. If I follow your quote below (use only lags from depth 3), all AR() statistics look great. All results stay the same, except that the lag of the dependent variable becomes insignificant.

                              Code:
                              Arellano-Bond test for autocorrelation of the first-differenced residuals
                              H0: no autocorrelation of order 1:     z =  -13.5887   Prob > |z|  =    0.0000
                              H0: no autocorrelation of order 2:     z =    1.8217   Prob > |z|  =    0.0685
                              H0: no autocorrelation of order 3:     z =   -0.7416   Prob > |z|  =    0.4584
                              H0: no autocorrelation of order 4:     z =    0.4485   Prob > |z|  =    0.6538

                              Originally posted by Sebastian Kripfganz View Post
                              use only lags starting at depth 4, that is gmm(lnhom, lag(4 .)), or gmm(L3.lnhom) as you specified in your initial post. But maybe the "truth" lies in between. I would also suggest trying gmm(lnhom, lag(3 .)) and checking whether the p-values for the AR(3) and AR(4) tests are "comfortable" in this case.
                              Given that the results change across specifications, is it safe to use lags from depth 2?

                              Best regards,
                              Nursena

