How to conduct GMM-style regression analysis

Abdullah Ijaz

Join Date: Sep 2017

Posts: 97
#1

How to conduct GMM-style regression analysis

18 Feb 2022, 03:49

Hi.

I know this question has been asked numerous times. I have also read Roodman and Blundell and searched for answers on the Stata forum. However, I am still confused about the model. I don't have an econometrics background, so maybe that's why. I am writing down the command here and asking my queries, in case someone is kind enough to explain it to me.

So I am using two-step system GMM for my model.

xtabond2 Y Y(t-1) X control vars i.Year, gmm (Y(t-1)) iv ( control vars i.Year, equation(level)) nodiffsargan twostep robust orthogonal small

Here are the questions:

1) In gmm(), what sort of variables can we put there? Endogenous to what? In this case, I am using the first lag of my dependent variable. Does that make sense?
2) In iv(), can we put the control vars and i.Year? Again, these vars are endogenous to what?

Also, is the xtabond2 command used correctly here? I know this may not be the place to ask such a question, because it may look like an assignment-type question, but it is more about understanding the dynamics of the model in simple terms for research. I know theory has to play a big role in identifying endogenous variables, but I just want to understand the basics of the model.

Any comments and feedback shall be appreciated.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2575
#2

18 Feb 2022, 04:12

1) There can be many reasons why a variable is endogenous, e.g. simultaneity (i.e. feedback from the dependent variable to the regressors). In panel data, a common issue is the potential correlation of the regressors with the unobserved group-specific effects. For the lagged dependent variable, this correlation exists by construction of the model, which is why it is usually instrumented with GMM-style instruments.

2) Any variables you put in your iv() option for the level equation are assumed to be uncorrelated with the unobserved group-specific effects (and exogenous with regard to any other unobserved variables). This is comparable to a "random effects" assumption for these variables.

If your control variables are strictly exogenous and uncorrelated with the unobserved group-specific effects, then your specification could be correct.
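
To illustrate points 1) and 2), here is a minimal sketch of how the pseudo-syntax from post #1 might be written as an actual xtabond2 call. It assumes the community-contributed xtabond2 package is installed, and it uses the abdata example dataset (variables n, w, k, and the yr* year dummies) purely as a stand-in for Y, X, the control variables, and i.Year. It shows the syntax only and is not a recommended specification.

```stata
* Assumption: xtabond2 is installed (ssc install xtabond2).
* Assumption: abdata stands in for the poster's data; n plays the role of Y,
* w and k play the role of X and the controls, yr* the role of i.Year.
webuse abdata, clear
xtset id year

* gmm(L.n): the lagged dependent variable is correlated with the
*           group-specific effects by construction, so it receives
*           GMM-style instruments.
* iv(..., equation(level)): these regressors are used as instruments in the
*           level equation, i.e. they are assumed to be uncorrelated with
*           the group-specific effects ("random effects"-type assumption).
xtabond2 n L.n w k yr*, gmm(L.n) iv(w k yr*, equation(level)) ///
    twostep robust orthogonal small
```

If the assumption behind iv(..., equation(level)) is doubtful for some regressors, they would need to be treated differently; which treatment is appropriate depends on the application.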

More on the GMM estimation of linear dynamic panel data models in Stata:
Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.

https://www.kripfganz.de/stata/
Abdullah Ijaz

Join Date: Sep 2017

Posts: 97
#3

15 Apr 2022, 05:30

Many thanks for your valuable feedback.
Adam Abdulrahman

Join Date: May 2022

Posts: 5
#4

26 May 2022, 06:01

Hello! I'm currently running GMM tests in Stata but I keep getting both a very high number of instruments (100-200+) and an even higher Wald Chi2 score (1.71e+08).

Below is the code that I ran, followed by its output:

xtdpdsys PFAGDP LNGDPPcap SDGDum LNGDPPcapxSDGDum DependencyRatio Inflation CMReturns PopGrowth LFParticRate, lags(1) twostep artests(2)
System dynamic panel-data estimation            Number of obs     =        603
Group variable: Country                         Number of groups  =         35
Time variable: Year
                                                Obs per group:
                                                              min =          1
                                                              avg =   17.22857
                                                              max =         20

Number of instruments =    218                  Wald chi2(9)      =   1.71e+08
                                                Prob > chi2       =     0.0000
Two-step results
----------------------------------------------------------------------------------
          PFAGDP |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
          PFAGDP |
             L1. |   .9695742    .003495   277.42   0.000     .9627241    .9764244
                 |
       LNGDPPcap |   35.98271     1.9354    18.59   0.000     32.18939    39.77602
          SDGDum |   16.02427   5.916328     2.71   0.007     4.428483    27.62006
LNGDPPcapxSDGDum |  -3.615439   1.266003    -2.86   0.004    -6.096759   -1.134119
 DependencyRatio |   3.062755   .2688706    11.39   0.000     2.535778    3.589732
       Inflation |  -.1901946   .0337445    -5.64   0.000    -.2563326   -.1240565
       CMReturns |   11.93403   .2071649    57.61   0.000       11.528    12.34007
       PopGrowth |  -2.534041   .3077528    -8.23   0.000    -3.137226   -1.930857
    LFParticRate |  -.3305291   .0430299    -7.68   0.000    -.4148661    -.246192
           _cons |  -152.4423   8.000879   -19.05   0.000    -168.1238   -136.7609
----------------------------------------------------------------------------------
Warning: gmm two-step standard errors are biased; robust standard
         errors are recommended.
Instruments for differenced equation
        GMM-type: L(2/.).PFAGDP
        Standard: D.LNGDPPcap D.SDGDum D.LNGDPPcapxSDGDum D.DependencyRatio
                  D.Inflation D.CMReturns D.PopGrowth D.LFParticRate
Instruments for level equation
        GMM-type: LD.PFAGDP
        Standard: _cons

When I tried running the following code, I was able to lower the number of instruments and the Wald chi2 score, but both are still relatively high:
xtdpdsys PFAGDP LNGDPPcap SDGDum LNGDPPcapxSDGDum DependencyRatio Inflation CMReturns PopGrowth LFParticRate, lags(1) maxldep(1) maxlags(1) pre(LNGDPPcap, lagstruct(1,1)) artests(2)

note: LNGDPPcap dropped because of collinearity

System dynamic panel-data estimation            Number of obs     =        602
Group variable: Country                         Number of groups  =         35
Time variable: Year
                                                Obs per group:
                                                              min =          1
                                                              avg =       17.2
                                                              max =         20

Number of instruments =     85                  Wald chi2(10)     =    5043.31
                                                Prob > chi2       =     0.0000
One-step results
----------------------------------------------------------------------------------
          PFAGDP |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
          PFAGDP |
             L1. |   1.000401   .0301854    33.14   0.000     .9412387    1.059563
                 |
       LNGDPPcap |
             --. |   19.03887   27.25736     0.70   0.485    -34.38458    72.46231
             L1. |  -13.77005   26.78573    -0.51   0.607    -66.26912    38.72902
                 |
          SDGDum |  -8.781053   47.83211    -0.18   0.854    -102.5303    84.96817
LNGDPPcapxSDGDum |   1.667439   10.34791     0.16   0.872    -18.61409    21.94897
 DependencyRatio |  -1.056532   1.346547    -0.78   0.433    -3.695715    1.582651
       Inflation |  -.1494649   .2983697    -0.50   0.616    -.7342587    .4353289
       CMReturns |   9.951293     2.3603     4.22   0.000      5.32519     14.5774
       PopGrowth |  -1.691797   1.676526    -1.01   0.313    -4.977727    1.594133
    LFParticRate |   .0269194   .3792645     0.07   0.943    -.7164253    .7702641
           _cons |  -18.29174   33.14363    -0.55   0.581    -83.25207    46.66859
----------------------------------------------------------------------------------
Instruments for differenced equation
        GMM-type: L(2/2).PFAGDP L(1/1).L.LNGDPPcap
        Standard: D.LNGDPPcap D.SDGDum D.LNGDPPcapxSDGDum D.DependencyRatio
                  D.Inflation D.CMReturns D.PopGrowth D.LFParticRate
Instruments for level equation
        GMM-type: LD.PFAGDP LD.LNGDPPcap
        Standard: _cons

Any ideas on how to fix this, and on how to specify that only 2 variables (the lag of the dependent variable PFAGDP, and LNGDPPcap) are used as instruments?

Thank you so much!