System GMM with two step

Pidi Apriyadi

Join Date: Jan 2024

Posts: 7
#1

29 Jan 2024, 20:57

Hi there

I have a problem which I show below:

Warning: gmm two-step standard errors are biased; robust standard
errors are recommended.
Instruments for differenced equation
GMM-type: L(2/2).tr
Standard: D.msm2 D.birate D.geii D.gii D.ginflasi D.gkurs D.gsaving
D.gnab D.cf D.usia D.gfixedin
Instruments for level equation
GMM-type: LD.tr
Standard: _cons

This is the output from two-step system GMM. Stata warns that robust standard errors are recommended. But when I add vce(robust) to the command, Stata cannot calculate the Sargan test:

Sargan test of overidentifying restrictions
H0: overidentifying restrictions are valid
cannot calculate Sargan test with vce(robust)

chi2(208) = .
Prob > chi2 = .

How should I solve this problem?

Last edited by Pidi Apriyadi; 29 Jan 2024, 21:12.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#2

02 Feb 2024, 02:43

Without seeing at least your full command line and further information about the number of observations and instruments, there is not much we can do to help. My xtdpdgmm command might be a suitable workaround for your problems:
Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.

https://www.kripfganz.de/stata/
Pidi Apriyadi

Join Date: Jan 2024

Posts: 7
#3

14 Feb 2024, 05:05

Thank you for the response and the recommendation.

Let me show the result of the regression.

xtdpdsys price lnvol gbirate gms gsaving gcicil ginf gkurs gihsg cce1 eci5, lags(1) maxldep(2) two vce(robust)

System dynamic panel-data estimation Number of obs = 7,894
Group variable: id Number of groups = 72
Time variable: period
Obs per group:
min = 85
avg = 109.6389
max = 113

Number of instruments = 344 Wald chi2(11) = 205.25
Prob > chi2 = 0.0000
Two-step results

Sargan test after applying vce(robust):

Sargan test of overidentifying restrictions
H0: overidentifying restrictions are valid
cannot calculate Sargan test with vce(robust)

chi2(332) = .
Prob > chi2 = .

But if I do not apply vce(robust) in the command, I get the warning shown below:

Warning: gmm two-step standard errors are biased; robust standard
errors are recommended.

Sargan test without vce(robust) in two-step system GMM:
Sargan test of overidentifying restrictions
H0: overidentifying restrictions are valid

chi2(332) = 59.67847
Prob > chi2 = 1.0000

Last edited by Pidi Apriyadi; 14 Feb 2024, 05:11.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#4

15 Feb 2024, 07:50

The number of instruments is way too large relative to the sample size. You need to reduce their number; see my presentation linked above.
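To illustrate why the instrument count grows so quickly with many time periods, here is a minimal Python sketch. It is a stylized count, not a reproduction of what xtdpdsys or xtdpdgmm do internally: the function name gmm_instrument_count and its arguments are hypothetical, it considers a single variable in a single transformed equation, and it will not reproduce the 344 instruments reported above.

```python
def gmm_instrument_count(n_periods, min_lag, max_lag, collapse=False):
    """Stylized count of GMM-type instruments contributed by ONE variable.

    Assumptions (hypothetical, for illustration only):
    - periods are numbered 1..n_periods;
    - lag l of the variable can serve as an instrument for the equation
      at period t only if the lagged observation exists, i.e. t - l >= 1.
    """
    if collapse:
        # Collapsed: one instrument column per lag that is available
        # in at least one period.
        return sum(1 for lag in range(min_lag, max_lag + 1) if lag < n_periods)

    # Uncollapsed: a separate instrument column for every (period, lag) pair.
    count = 0
    for t in range(1, n_periods + 1):
        for lag in range(min_lag, max_lag + 1):
            if t - lag >= 1:
                count += 1
    return count


if __name__ == "__main__":
    # 113 periods is the maximum number of observations per group above.
    print(gmm_instrument_count(113, 2, 4))                 # uncollapsed
    print(gmm_instrument_count(113, 2, 4, collapse=True))  # collapsed
```

Under these assumptions, one variable with lags 2 to 4 over 113 periods yields 330 uncollapsed instruments but only 3 collapsed ones, which is the kind of reduction meant by limiting the number of instruments.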

Pidi Apriyadi

Join Date: Jan 2024

Posts: 7
#5

17 Feb 2024, 09:09

Thank you, Prof. Sebastian, for your response. I have used your xtdpdgmm command in my work. The result has made me more confident than before, but I want to make sure that my command is correct.

Would you mind evaluating my command below, please?

xtdpdgmm L(0/1).price lnvol gbirate gms gcei giec gconsump gcicil gkurs ginf gihsg goblig10, model(fodev) collapse gmm(price gprice lnvol gihsg, lag(2 4)) gmm(lnvol gihsg gbirate gms gcei giec gconsump gcicil gkurs ginf goblig10, lag(1 4)) gmm(price gprice lnvol gihsg, lag(1 1) diff model(level)) gmm(lnvol gihsg gbirate gms gcei giec gconsump gcicil gkurs ginf goblig10, lag(0 0) diff model(level)) two vce(r)

Generalized method of moments estimation

Fitting full model:
Step 1 f(b) = .82163568
Step 2 f(b) = .86132161

Group variable: id Number of obs = 7894
Time variable: period Number of groups = 72

Moment conditions: linear = 66 Obs per group: min = 85
nonlinear = 0 avg = 109.6389
total = 66 max = 113

(Std. Err. adjusted for 72 clusters in id)

Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid

2-step moment functions, 2-step weighting matrix chi2(53) = 62.0152
Prob > chi2 = 0.1856

2-step moment functions, 3-step weighting matrix chi2(53) = 68.0022
Prob > chi2 = 0.0804

Arellano-Bond test for autocorrelation of the first-differenced residuals
H0: no autocorrelation of order 1 z = -6.2725 Prob > |z| = 0.0000
H0: no autocorrelation of order 2 z = -0.4966 Prob > |z| = 0.6195

Honestly, I have concerns about the following part of the command (the options after the comma):

......................, collapse gmm(price gprice lnvol gihsg, lag(2 4)) gmm(lnvol gihsg gbirate gms gcei giec gconsump gcicil gkurs ginf goblig10, lag(1 4)) gmm(price gprice lnvol gihsg, lag(1 1) diff model(level)) gmm(lnvol gihsg gbirate gms gcei giec gconsump gcicil gkurs ginf goblig10, lag(0 0) diff model(level))

1. May I use several variables in the gmm() options shown above? These variables are also independent variables in the model.
2. I used a variable (gprice) as an instrument that is not included in the model as a regressor, because it has an impact on the dependent variable. Is that acceptable?
3. Am I allowed to include all independent variables as instruments?

Last edited by Pidi Apriyadi; 17 Feb 2024, 09:20.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#6

19 Feb 2024, 02:17

You can include as many variables as you want in the set of instruments, as long as they are valid and relevant. Note that, with model(fodev) (but not with model(diff)), you could also use lag 1 as an instrument in the first gmm() option; i.e., lag(1 4) would be okay there as well. Assuming the variables in the second gmm() option are predetermined, with model(fodev) you could use lag(0 4) for them. This would yield stronger instruments.

You can include further variables as instruments that are not included as regressors. However, they should not have a direct effect on the dependent variable (after controlling for all other regressors); this would make them invalid.

As before, you can include any instrument that is valid and relevant.
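The lag choices discussed above can be summarized in a small lookup. This is only a sketch in Python, not part of xtdpdgmm: the names FIRST_VALID_LAG and lag_option are hypothetical, the model(fodev) entries follow the advice in this post, and the model(diff) entries are the usual textbook convention for first-differenced equations rather than something stated in this thread.

```python
# First lag that is a valid instrument, by model transformation and
# variable classification. Hypothetical helper for illustration only.
FIRST_VALID_LAG = {
    ("fodev", "predetermined"): 0,  # lag(0 4) suggested in this post
    ("fodev", "endogenous"): 1,     # lag(1 4) suggested in this post
    ("diff", "predetermined"): 1,   # textbook convention (assumption)
    ("diff", "endogenous"): 2,      # textbook convention (assumption)
}


def lag_option(model, classification, max_lag):
    """Return the text of a lag() suboption such as 'lag(0 4)'."""
    first = FIRST_VALID_LAG[(model, classification)]
    if max_lag < first:
        raise ValueError("max_lag is smaller than the first valid lag")
    return f"lag({first} {max_lag})"


if __name__ == "__main__":
    print(lag_option("fodev", "predetermined", 4))
    print(lag_option("fodev", "endogenous", 4))
```

Whether a variable is predetermined or endogenous remains a modeling decision; the lookup only translates that decision into a lag range.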

To get started, you might want to have a look at the xtdpdgmmfe command, which is part of the xtdpdgmm package. This may help you find a good syntax based on a set of assumptions you specify.

You seem to have a large number of time periods in your data set. I am a bit concerned that this might have an adverse effect on inference with the two-step estimator, because the estimated weighting matrix becomes very large. Maybe have a look at Jan Ditzen's xtdcce2 command, which might be better suited for this type of data.

Pidi Apriyadi

Join Date: Jan 2024

Posts: 7
#7

24 Feb 2024, 03:12

Thanks, Prof. Sebastian.

Indeed, I have 72 firms and 120 monthly time periods. Is my data not well suited to xtdpdgmm even if I run the regression at the industry level? (For example, for the banking industry I have 30 banks and 120 time periods.)

I have tried to run the xtdcce2 command as you suggested. However, I have a problem running xtdcce2 with my Stata version.

Would you mind suggesting an estimator that would suit my data?
Pidi Apriyadi

Join Date: Jan 2024

Posts: 7
#8

07 May 2024, 01:11

Dear Prof. Sebastian Kripfganz

I am currently using your xtdpdgmm command in my paper. However, I am not sure how to identify which variables are endogenous, strictly exogenous, or predetermined when specifying the instruments.

This is my command based on your xtdpdgmm:

xtdpdgmm L(0/1).zakat ikk5 dist consump inf, model(fodev) collapse gmm(zakat,lag(1 2)) gmm(ikk5 dist, lag(0 2)) gmm(zakat, lag(1 2) diff model(level)) gmm(ikk5 dist, lag(0 2) diff model(level)) two vce(r)

Would you mind explaining which variables are treated as endogenous, strictly exogenous, and predetermined in the command above? I could not find notes about that in your presentation file.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#9

07 May 2024, 03:57

Strictly exogenous variables are those that are uncorrelated with the time-varying error component at any point in time. Predetermined variables are those that might be correlated with past errors but are uncorrelated with contemporaneous and future errors. Endogenous variables (in this context) are those that might be correlated with past and contemporaneous errors but are uncorrelated with future errors. [Formal definition on slide 9 of my 2019 London Stata Conference presentation.]
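The three definitions above can be written as moment conditions. The notation here is an assumption for illustration and need not match the slides: $x_{is}$ is the regressor for unit $i$ in period $s$, and $u_{it}$ is the time-varying error component in period $t$.

```latex
\begin{align*}
\text{strictly exogenous:} \quad & E[x_{is}\, u_{it}] = 0 \quad \text{for all } s, t \\
\text{predetermined:}      \quad & E[x_{is}\, u_{it}] = 0 \quad \text{for all } s \le t \\
\text{endogenous:}         \quad & E[x_{is}\, u_{it}] = 0 \quad \text{for all } s < t
\end{align*}
```

Each line weakens the previous one: a predetermined variable may be correlated with past errors ($s > t$), and an endogenous variable may additionally be correlated with the contemporaneous error ($s = t$).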

Initially, you should consider what the underlying economic theory suggests. Is there reason to assume dynamic feedback from the dependent variable (past errors) to the regressor? Then this would call for the assumption of predetermined variables. If the dependent variable and regressor are simultaneously determined, or if you suspect reverse causality, then the regressor should be treated as endogenous.

You could then follow an empirical strategy to determine the variable classification as outlined on slides 90 and following in my presentation.

If you find it difficult to specify the instruments with the xtdpdgmm options once you have decided on the variable classification, you could also use the xtdpdgmmfe command, which is part of the xtdpdgmm package. This allows you to directly specify which variables are endogenous, predetermined, or exogenous. It also shows what the implied xtdpdgmm command line looks like, which you can then adapt further. Please see the xtdpdgmmfe help file for details and examples.

Pidi Apriyadi

Join Date: Jan 2024

Posts: 7
#10

14 Jun 2024, 19:27

Thank you, Prof. Sebastian