  • How do you correct for AR(2) in 2-Step Difference GMM?

    Dear all,
    Using xtabond2 in Stata 13, I have a dynamic panel model and I'm testing for crime persistence, with T = 13 and n = 134. My variables are: dependent variable: log(homicide); explanatory variables: lag of log(homicide), log(Gini), log(GDP per capita), unemployment rate, males aged 15-24, primary education, secondary education, rule of law, corruption, and death penalty. The lag of log(homicide) is endogenous and the other explanatory variables are weakly exogenous.

    Having read both Roodman's papers on GMM specifications and over-identifying instruments, I used this command:

    Code:
    xtabond2 lnhom l.lnhom lngini lngdppc m1524 xune_m rol corrupt dtpen pry_educ sec_educ yr3-yr13, gmm(l3.lnhom) iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) nodiffsargan noleveleq twostep robust orthogonal small

    ... and the results are: N = 1463, n = 134, instruments = 66, lags = 3, AR(1) = 0.009, AR(2) = 0.068, Hansen = 0.101.

    Attempts to classify some variables as endogenous produced worse results, with the Hansen test reaching the 'unacceptable' 1.000 mark, so I classified them as 'weakly exogenous' instead.

    Is it ok to accept this result and justify it by saying that: 'I cannot reject AR(2) at 5% significance level' or is there a way of correcting for AR(2)?

    I have attached the Stata output and will greatly appreciate all contributions.
    Ngozi
    Attached Files

  • #2
    Ngozi: You are misinterpreting the result from the AR(2) test. The null hypothesis is: "NO autocorrelation of order 2". Because the p-value is 0.068 > 0.05, you cannot reject the null hypothesis of no autocorrelation at the 5% significance level.
    https://www.kripfganz.de/stata/



    • #3
      Sebastian,

      thanks for your response, but I feel we're on the same page... saying the same thing in different ways, right? So, now that you have clarified this, is it OK to accept this result?



      • #4
        Since there is no evidence (at the 5% significance level) for second-order serial correlation of the error term, there is no need to correct for it and you could use lagged levels of the dependent variable from lag 2 onwards as instruments for the transformed equation. This would be gmm(L.lnhom) or gmm(lnhom, lag(2 .)), both of which are equivalent because xtabond2 by default lags the specified variable once for GMM-style instruments. Why did you choose gmm(L3.lnhom)?
        https://www.kripfganz.de/stata/
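
        For concreteness, a sketch of that equivalence, reusing the command and variable names from #1 (illustrative only, not a definitive re-run):

        ```stata
        * Sketch: this specification builds the same GMM-style instrument
        * set as gmm(lnhom, lag(2 .)), because xtabond2 by default lags
        * the specified instrument variable once.
        xtabond2 lnhom l.lnhom lngini lngdppc m1524 xune_m rol corrupt dtpen ///
            pry_educ sec_educ yr3-yr13, gmm(l.lnhom) ///
            iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) ///
            noleveleq twostep robust orthogonal small

        * Equivalent: replace gmm(l.lnhom) with gmm(lnhom, lag(2 .)).
        ```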



        • #5
          I disagree with Sebastian. I wouldn't take much comfort in a p-value of 0.068. This is saying that if there is no second-order serial correlation in differences--removing a threat to the validity of twice-lagged instruments--there's only a 6.8% chance you'd get an Arellano-Bond z statistic that large. So you are right to use deeper lags--that's how you work around the bad AR() test result. But you should also include an artests(3) option to check for longer serial correlation in the same way.
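
          A sketch of what that could look like, reusing the command from #1 (illustrative, with the instrument depth and test order as discussed above):

          ```stata
          * Sketch: instruments for the transformed equation from lag 3
          * onward, plus Arellano-Bond AR tests up to order 3 via artests(3).
          xtabond2 lnhom l.lnhom lngini lngdppc m1524 xune_m rol corrupt dtpen ///
              pry_educ sec_educ yr3-yr13, gmm(lnhom, lag(3 .)) ///
              iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) ///
              noleveleq twostep robust orthogonal small artests(3)
          ```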



          • #6
            I agree that the AR(2) result is anything but comfortable, but the "degree of comfort" is given by the confidence level which has to be chosen by the researcher. By using deeper lags there is a trade-off between the validity of the instruments (robustness to serial correlation) and the strength of the instruments (decreasing correlation with larger distance in time). There is no black and white here, and it might be good to report the results of different specifications in applied work.
            https://www.kripfganz.de/stata/



            • #7
              Hi Sebastian/David,

              Following your guidelines, I ran the test again using gmm(lnhom, lag(2 .)) and adding artests(4) to the earlier options. With 86 instruments, I got the following results:

              Code:
              Arellano-Bond test for AR(1) in first differences: z = -3.02  Pr > z = 0.002
              Arellano-Bond test for AR(2) in first differences: z =  2.06  Pr > z = 0.040
              Arellano-Bond test for AR(3) in first differences: z = -2.22  Pr > z = 0.027
              Arellano-Bond test for AR(4) in first differences: z =  1.37  Pr > z = 0.170
              ------------------------------------------------------------------------------
              Sargan test of overid. restrictions: chi2(65) = 214.77  Prob > chi2 = 0.000
                (Not robust, but not weakened by many instruments.)
              Hansen test of overid. restrictions: chi2(65) =  72.99  Prob > chi2 = 0.232
                (Robust, but weakened by many instruments.)

              This outcome looks acceptable: the point estimate on the lagged dependent variable lies in the expected range (between the FE and OLS point estimates), and the standard error seems reasonable. Is it okay to report just the AR(4) and Hansen statistics?

              Thanks a lot for helping!
              Ngozi
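
              On the bracketing check mentioned in #7: a common rough diagnostic (Bond, 2002) is that the GMM coefficient on the lagged dependent variable should lie between the pooled-OLS estimate (biased upward) and the within/FE estimate (biased downward). A minimal sketch, assuming the panel is already xtset and using a hypothetical panel identifier `id` plus a subset of the regressors from #1:

              ```stata
              * Upper benchmark: pooled OLS (coefficient on l.lnhom biased upward).
              regress lnhom l.lnhom lngini lngdppc xune_m, vce(cluster id)

              * Lower benchmark: fixed effects (biased downward for short T).
              xtreg lnhom l.lnhom lngini lngdppc xune_m, fe vce(cluster id)
              ```

              If the GMM point estimate falls outside this interval, that is usually a warning sign about the instruments rather than a pass.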



              • #8
                With these results, using gmm(lnhom, lag(2 .)) is indeed no longer valid because you cannot reject serial correlation up to the third order. I would thus follow David's advice and use only lags starting at depth 4, that is gmm(lnhom, lag(4 .)), or gmm(L3.lnhom) as you specified in your initial post. But maybe the "truth" lies in between. I would also suggest trying gmm(lnhom, lag(3 .)) and checking whether the p-values for the AR(3) and AR(4) tests are "comfortable" in this case.

                An alternative and probably more elegant solution than playing around with the instruments might be to add a further lag of the dependent variable (and maybe also of the exogenous regressors) to the estimation equation, which is the common way in the time-series literature to account for serial correlation.
                https://www.kripfganz.de/stata/
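
                A sketch of that second route (adding another lag of the dependent variable; same variable names as in #1, with the instrument depth here purely illustrative):

                ```stata
                * Sketch: l2.lnhom enters as a regressor so the remaining error
                * should be serially uncorrelated; instruments for the transformed
                * equation then start at a correspondingly deeper lag.
                xtabond2 lnhom l.lnhom l2.lnhom lngini lngdppc m1524 xune_m rol ///
                    corrupt dtpen pry_educ sec_educ yr3-yr13, gmm(lnhom, lag(3 .)) ///
                    iv(m1524 lngini lngdppc xune_m rol corrupt pry_educ sec_educ yr3-yr13) ///
                    noleveleq twostep robust orthogonal small artests(4)
                ```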



                • #9
                  Thanks Sebastian. I actually tried gmm(lnhom, lag(3 .)) and gmm(lnhom, lag(4 .)), but the results weren't any better: both the AR(2) and Hansen test p-values were below 0.10. I will, however, explore all the specifications suggested.

                  Once again, thanks for taking the time to teach. I appreciate it.



                  • #10
                    I have a similar issue. The AR(2) test is statistically significant (p = 0.000). In this case, do I have to add the second lag of the dependent variable as a regressor, or do I just specify gmm(y, lag(2 .))? When I use the second lag of the dependent variable as a regressor, its coefficient is statistically significant (p = 0.00), but AR(2) is not (p > 0.5). I am using system GMM. Any help will be greatly appreciated.
                    Last edited by Chandan Jha; 26 Jul 2015, 13:11.



                    • #11
                      Hi all,
                      Could you please help me with these issues:

                      (1) When I try deeper lags for GMM-style IVs, the AR(n) tests are passed (i.e., we do not reject the null of no serial correlation). At lag(3) the Hansen test p-value is > 0.05, so we do not reject the joint validity of the instruments; but at lag(4) the Hansen test p-value is < 0.05, so we reject it. The same happens at lag(5) (do not reject) and lag(6) (reject).
                      How can we decide which lags to choose in this case?

                      (2) What is the optimal lag length? As we increase the lag length, we weaken the Hansen test (too many instruments), as explained by Roodman (2009). Is using collapse, say gmmstyle(X Y, collapse), the best way? Could you explain this more clearly for me?



                      Thanks a lot in advance!
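
                      On (2), a minimal sketch of the collapse suboption (with hypothetical placeholder variables y and x): collapsing replaces the one-instrument-per-lag-and-period scheme with one instrument per lag distance, so the instrument count no longer grows with T, which helps keep the Hansen test from being weakened.

                      ```stata
                      * Sketch: collapsed GMM-style instruments for lags 2-4 of y.
                      * y and x are hypothetical placeholders, not variables from this thread.
                      xtabond2 y l.y x, gmm(y, lag(2 4) collapse) iv(x) twostep robust small
                      ```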



                      • #12
                        Dear all!
                        I'm Hoang Luong, PhD student at The University of Greenwich, London.
                        I have a problem with AR(2) too, but I don't really understand the correction process here. Do I have to use deeper lags for all the endogenous variables, including the control variables? In my case, if I use lags from depth 3 for my main variables only, the results are acceptable; but if I apply this to all variables, including the controls, I cannot obtain significant results (please see the attached file below).

                        My second question: if I have the squared term of an endogenous variable, should I also put it in gmmstyle(), or should I treat it as exogenous, on the grounds that instrumenting the endogenous variable itself already accounts for its correlation with the error term, so the squared term needs no separate treatment?
                        I'm looking forward to any suggestion or guide here.
                        Thank you in advance for reading.

                        Best regards!
                        Hoang Luong

                        Attached Files



                        • #13
                          I have run system GMM and the AR(2) test is reported as a dot (missing). Is this a problem, and if so, how do I solve it?



                          • #14
                            Dear all
                            Code:
                            Group variable: id_e                 Number of obs      =       185
                            Time variable : year                 Number of groups   =        37
                            Number of instruments = 99           Obs per group: min =         1
                            Wald chi2(6) = 80062.48                             avg =      5.00
                            Prob > chi2  = 0.000                                max =         7

                            Arellano-Bond test for AR(1) in first differences: z =  0.77  Pr > z = 0.440
                            Arellano-Bond test for AR(2) in first differences: z = -1.14  Pr > z = 0.256

                            Can someone tell me how to reduce AR(1)?




                            • #15
                              Dear Sebastian,

                              I have a similar problem. In my model, the AR(2) p-value is 0.0685, while AR(3) and AR(4) give good results. If I follow your quote below (use only lags from depth 3), all AR() statistics look great. All results stay the same, except that the lag of the dependent variable becomes insignificant.

                              Code:
                              Arellano-Bond test for autocorrelation of the first-differenced residuals
                              H0: no autocorrelation of order 1:     z =  -13.5887   Prob > |z|  =    0.0000
                              H0: no autocorrelation of order 2:     z =    1.8217   Prob > |z|  =    0.0685
                              H0: no autocorrelation of order 3:     z =   -0.7416   Prob > |z|  =    0.4584
                              H0: no autocorrelation of order 4:     z =    0.4485   Prob > |z|  =    0.6538

                              Originally posted by Sebastian Kripfganz View Post
                              use only lags starting at depth 4, that is gmm(lnhom, lag(4 .)), or gmm(L3.lnhom) as you specified in your initial post. But maybe the "truth" lies in between. I would also suggest trying gmm(lnhom, lag(3 .)) and checking whether the p-values for the AR(3) and AR(4) tests are "comfortable" in this case.
                              Given that the results change across specifications, is it safe to use lags from depth 2?

                              Best regards,
                              Nursena

