XTDPDGMM: new Stata command for efficient GMM estimation of linear (dynamic) panel models with nonlinear moment conditions

Tiyo Ardiyono

Join Date: Mar 2021

Posts: 8
#391

17 Apr 2022, 23:52

Sorry, this is the error
Attached Files
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2571
#392

18 Apr 2022, 01:32

I am unable to replicate this error message. I do not see how the option teffects can cause such a problem. Would it be possible for you to send me your data set by e-mail, so that I can investigate further?

https://www.kripfganz.de/stata/
Comment
Sarah Magd

Join Date: Feb 2022

Posts: 60
#393

25 Apr 2022, 05:44

Dear Prof. Sebastian Kripfganz,
I have grouped my variables into endogenous and predetermined variables. However, I have been asked to test for the exogeneity and predeterminedness of these variables (in other words, to provide evidence that my variables are correctly specified and grouped). Could you please guide us on how to test whether the grouping of variables is correct with xtdpdgmm? We would highly appreciate it if you could provide an example with code.
Comment
Sarah Magd

Join Date: Feb 2022

Posts: 60
#394

26 Apr 2022, 02:56

Dear Prof. Sebastian Kripfganz,
I am estimating a two-step system GMM model with l.Y X2 X3 X4 being my endogenous variables. The results of this model are consistent with the literature. However, when I estimate a two-step difference GMM model, the results become inconsistent, especially for the lagged dependent variable. Could you please help me figure out the problem?
My sample ranges from 2005 to 2017.

##################################################
# Two-step system GMM model
##################################################
. xtdpdgmm L(0/1).Y X1 X2 X3 X4 X5 X6, model(diff) collapse gmm(l.Y X2 X3 X4, lag(1 3)) gmm( X1 X5 X6, lag(1 3)) gmm(Y X2 X3 X4, lag(1 1) diff model(level)) gmm(X1 X5 X6, lag(0 0) diff model (level)) two vce(r) overid noconstant

Moment conditions:     linear =      28     Obs per group:    min =         10
                    nonlinear =       0                       avg =   11.85714
                        total =      28                       max =         12

                               (Std. Err. adjusted for 28 clusters in iso_num)
------------------------------------------------------------------------------
             |              WC-Robust
           Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           Y |
         L1. |   .8335899   .0610488    13.65   0.000     .7139365    .9532433
             |
          X1 |  -.0820382   .0266326    -3.08   0.002    -.1342371   -.0298393
          X2 |   .0649644   .0245804     2.64   0.008     .0167876    .1131411
          X3 |  -.0943125   .0459489    -2.05   0.040    -.1843707   -.0042544
          X4 |  -.0172625   .0110394    -1.56   0.118    -.0388993    .0043742
          X5 |  -.0656854   .0300032    -2.19   0.029    -.1244907   -.0068801
          X6 |   .1170248   .0508016     2.30   0.021     .0174555    .2165942
------------------------------------------------------------------------------
Instruments corresponding to the linear moment conditions:
1, model(diff):
L1.L.Y L2.L.Y L3.L.Y L1.X2 L2.X2 L3.X2 L1.X3 L2.X3 L3.X3 L1.X4 L2.X4 L3.X4
2, model(diff):
L1.X1 L2.X1 L3.X1 L1.X5 L2.X5 L3.X5 L1.X6 L2.X6 L3.X6
3, model(level):
L1.D.Y L1.D.X2 L1.D.X3 L1.D.X4
4, model(level):
D.X1 D.X5 D.X6

. estat overid

Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid

2-step moment functions, 2-step weighting matrix chi2(21) = 27.4217
Prob > chi2 = 0.1573

2-step moment functions, 3-step weighting matrix chi2(21) = 28.0000
Prob > chi2 = 0.1402

. estat serial, ar(1/3)

Arellano-Bond test for autocorrelation of the first-differenced residuals
H0: no autocorrelation of order 1: z = -3.7914 Prob > |z| = 0.0001
H0: no autocorrelation of order 2: z = 1.6378 Prob > |z| = 0.1015
H0: no autocorrelation of order 3: z = 0.9998 Prob > |z| = 0.3174

##################################################
# Difference GMM model
##################################################

. xtdpdgmm L(0/1).Y X1 X2 X3 X4 X5 X6, model(diff) collapse gmm(l.Y X2 X3 X4, lag(1 5)) gmm( X1 X5 X6, lag(1 3)) two vce(r) overid noconstant

Generalized method of moments estimation

Fitting full model:
Step 1 f(b) = .00222498
Step 2 f(b) = .86843655

Fitting reduced model 1:
Step 1 f(b) = .00147315

Fitting reduced model 2:
Step 1 f(b) = .70379905

Group variable: iso_num                      Number of obs         =       332
Time variable: year                          Number of groups      =        28

Moment conditions:     linear =      29     Obs per group:    min =         10
                    nonlinear =       0                       avg =   11.85714
                        total =      29                       max =         12

                               (Std. Err. adjusted for 28 clusters in iso_num)
------------------------------------------------------------------------------
             |              WC-Robust
           Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           Y |
         L1. |   .0188738   .1174797     0.16   0.872    -.2113823    .2491298
             |
          X1 |  -.1169446   .0221783    -5.27   0.000    -.1604133    -.073476
          X2 |   .7521403   .1463045     5.14   0.000     .4653888    1.038892
          X3 |  -.0511113   .1211055    -0.42   0.673    -.2884738    .1862512
          X4 |  -.0011518   .0082276    -0.14   0.889    -.0172776     .014974
          X5 |   .0371346   .0451025     0.82   0.410    -.0512648    .1255339
          X6 |  -.3401888   .1308523    -2.60   0.009    -.5966546   -.0837231
------------------------------------------------------------------------------
Instruments corresponding to the linear moment conditions:
1, model(diff):
L1.L.Y L2.L.Y L3.L.Y L4.L.Y L5.L.Y L1.X2 L2.X2 L3.X2 L4.X2 L5.X2 L1.X3
L2.X3 L3.X3 L4.X3 L5.X3 L1.X4 L2.X4 L3.X4 L4.X4 L5.X4
2, model(diff):
L1.X1 L2.X1 L3.X1 L1.X5 L2.X5 L3.X5 L1.X6 L2.X6 L3.X6

. estat overid

Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid

2-step moment functions, 2-step weighting matrix chi2(22) = 24.3162
Prob > chi2 = 0.3309

2-step moment functions, 3-step weighting matrix chi2(22) = 28.0000
Prob > chi2 = 0.1757

. estat serial, ar(1/3)

Arellano-Bond test for autocorrelation of the first-differenced residuals
H0: no autocorrelation of order 1: z = -2.2317 Prob > |z| = 0.0256
H0: no autocorrelation of order 2: z = -0.8144 Prob > |z| = 0.4154
H0: no autocorrelation of order 3: z = 1.2586 Prob > |z| = 0.2082

Last edited by Sarah Magd; 26 Apr 2022, 02:58.
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2571
#395

26 Apr 2022, 05:40

Originally posted by Sarah Magd

I have grouped my variables into endogenous and predetermined variables. However, I have been asked to test for the exogeneity and predeterminedness of these variables (in other words, to provide evidence that my variables are correctly specified and grouped). Could you please guide us on how to test whether the grouping of variables is correct with xtdpdgmm? We would highly appreciate it if you could provide an example with code.

In my 2019 London Stata Conference presentation, I demonstrate starting on slide 90 how to empirically classify the variables as endogenous, predetermined, or strictly exogenous. You would begin with classifying all variables as endogenous. Then you can add further instruments which are valid if the variables are predetermined. Incremental overidentification tests can be used to check whether those additional instruments are valid, and therefore if the variables are indeed predetermined. Later, you can do the same for strict exogeneity, by adding the respective additional instruments and checking the incremental overidentification tests again.
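The procedure described above could be sketched as follows. This is only an illustration and is not copied from the presentation: it assumes Stata's example data set abdata (dependent variable n, regressors w and k), and the lag ranges are arbitrary choices. The candidate instruments are placed in their own gmm() option so that the incremental test can be read off for that group.

```stata
* Run from a do-file (the /// line continuation does not work interactively).
webuse abdata, clear

* Step 1: baseline treats w and k as endogenous (lags 2 to 4 for the
* first-differenced model). The separate gmm(w k, lag(1 1)) adds the
* instruments that are valid only if w and k are predetermined.
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(2 4)) gmm(w k, lag(1 1)) twostep vce(robust) overid
estat overid, difference

* Step 2 (only if step 1 does not reject): keep the predetermined
* classification and add the contemporaneous instruments that would be
* valid under strict exogeneity, again as a separate group.
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 4)) gmm(w k, lag(0 0)) twostep vce(robust) overid
estat overid, difference
```

In each step, the incremental (difference) test for the added group is the one to look at; a rejection suggests that the stronger classification is not supported.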

In your example with the system and difference GMM estimators, I would recommend using the same lag orders for both. Using lag(1 3) for the system GMM estimator but lag(1 5) for the difference GMM estimator gives a reader of your analysis the impression that you are cherry-picking results. Furthermore, it is highly unusual not to include a regression constant when using the system GMM estimator. This might explain the observed differences. Aside from that, the difference GMM estimator might suffer from an identification problem if the true value (not the estimated value) of the lagged dependent variable's coefficient is close to 1. You could try adding the nl(noserial) option to the difference GMM estimator. This gives you the Ahn-Schmidt estimator, which could help avoid this identification problem of the difference GMM estimator.
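As a rough sketch of these suggestions, again assuming the abdata example data set and illustrative lag ranges rather than your own variables, the three specifications below share the same lag orders for the differenced model and keep the default regression constant:

```stata
* Run from a do-file (the /// line continuation does not work interactively).
webuse abdata, clear

* Difference GMM
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) twostep vce(robust)

* Ahn-Schmidt: same instruments plus the nonlinear moment conditions
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) nl(noserial) twostep vce(robust)

* System GMM: same instruments for the differenced model, plus
* differenced instruments for the level model
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) gmm(n, lag(1 1) diff model(level)) ///
    gmm(w k, lag(0 0) diff model(level)) twostep vce(robust)
```

Comparing the coefficient of L.n across the three sets of estimates gives a first impression of whether the difference GMM estimator is affected by weak identification.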

https://www.kripfganz.de/stata/
Comment
Sarah Magd

Join Date: Feb 2022

Posts: 60
#396

26 Apr 2022, 08:15

Dear Prof. Sebastian Kripfganz,
Thank you very much for your reply. I have two other questions.

1. Could you please clarify what the difference is between the following two commands?

1.1. xtdpdgmm L(0/1).ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr ln_fd_1 hc ln_Total_2 , model(diff) collapse gmm(l.ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 3)) gmm( ln_fd_1 hc ln_Total_2 , lag(1 3)) gmm(ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 1) diff model(level)) gmm( ln_fd_1 hc ln_Total_2 , lag(0 0) diff model (level)) two vce(r) overid noconstant

1.2. xtdpdgmm L(0/1).ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr ln_fd_1 hc ln_Total_2, collapse gmm (l.ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 3) model(diff)) gmm(ln_fd_1 hc ln_Total_2, lag(1 3) model(diff)) gmm(l.ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 1) diff) gmm(ln_fd_1 hc ln_Total_2, lag(0 0) diff) two vce(r)

#########################
2. My sample has 332 observations covering the period 2005 to 2017. When I estimate the two-step GMM model, it passes the Hansen test. However, I get the following result for the Hansen test from the one-step GMM model (i.e., the system GMM). I do not understand why I get this result with the one-step model. Could you please guide me on this?

Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid

1-step moment functions, 1-step weighting matrix chi2(21) = 99.5061
note: * Prob > chi2 = 0.0000

1-step moment functions, 2-step weighting matrix chi2(21) = 28.0000
note: * Prob > chi2 = 0.1402
###########################

Also, given my sample size, should I use the difference GMM model for robustness checks?

Last edited by Sarah Magd; 26 Apr 2022, 08:21.
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2571
#397

26 Apr 2022, 08:52

1. As far as I can tell from glancing over the two command lines, the only difference seems to be that the first command has the additional options overid and noconstant. Thus, it calculates additional incremental overidentification statistics that can be displayed with the postestimation command estat overid, difference, and it constrains the constant to be zero.

2. The Sargan-Hansen tests after the one-step system GMM estimation are based on a weighting matrix which is not optimal. Those tests are asymptotically invalid. I would not expect too much from them either with your very small number of groups.
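To see point 2 in practice, the following sketch (using the abdata example data set as a stand-in, with illustrative lag ranges) estimates the same system GMM specification once with the one-step and once with the two-step estimator and calls estat overid after each:

```stata
* Run from a do-file (the /// line continuation does not work interactively).
webuse abdata, clear

* One-step estimation: the weighting matrix is not optimal, so the
* reported overidentification tests are not asymptotically valid.
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) gmm(n, lag(1 1) diff model(level)) ///
    gmm(w k, lag(0 0) diff model(level)) onestep vce(robust)
estat overid

* Two-step estimation: tests based on the optimal weighting matrix
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) gmm(n, lag(1 1) diff model(level)) ///
    gmm(w k, lag(0 0) diff model(level)) twostep vce(robust)
estat overid
```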

Normally, you would start with the difference GMM estimator and then use the system GMM estimator as a robustness check, or to improve on the estimates if you suspect identification problems with the difference GMM estimator.

https://www.kripfganz.de/stata/
Comment
Sarah Magd

Join Date: Feb 2022

Posts: 60
#398

28 Apr 2022, 05:05

Dear Prof. Sebastian Kripfganz,
1- I have included the first difference of a variable in the main equation of my regression. Can I include this variable in levels as an instrumental variable in the first-differenced equation?
2- If we include the logarithmic transformation of a variable, can we include its level form as an instrument?
So both questions come down to whether we can use a different variable as an instrument, or whether we have to use the same variable as an instrument in the GMM estimation.
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2571
#399

28 Apr 2022, 05:10

You can use as an instrument whatever variable you have. The only requirements are that they are sufficiently strongly correlated with the instrumented variable and that they are uncorrelated with the error term.
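For illustration only, and assuming the abdata example data set rather than the poster's data, the sketch below uses the first difference D.w as a regressor while lagged levels of w serve as instruments for the first-differenced model. Whether such instruments are relevant and uncorrelated with the error term is an assumption that has to be justified for the application at hand.

```stata
* Run from a do-file (the /// line continuation does not work interactively).
webuse abdata, clear
xtset id year

* Regressor: D.w. Instruments: lags 2 to 4 of the level of w.
* The lag ranges are arbitrary choices for this illustration.
xtdpdgmm L(0/1).n D.w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w, lag(2 4)) gmm(k, lag(1 3)) twostep vce(robust) overid
estat overid, difference
```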

https://www.kripfganz.de/stata/
Comment
Sarah Magd

Join Date: Feb 2022

Posts: 60
#400

28 Apr 2022, 05:17

Dear Prof. Sebastian Kripfganz,
Thank you very much for your prompt reply.
This means that I can use any instrument as long as the AR(2) and Hansen test statistics are statistically insignificant. Am I right?
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2571
#401

28 Apr 2022, 06:22

In short, yes. Ideally, the statistical tests should be complemented by theoretical reasoning.
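A short sketch of the two checks mentioned in the question, using the same postestimation commands that appear earlier in this thread (the abdata example data set and the lag ranges are assumptions made for illustration):

```stata
* Run from a do-file (the /// line continuation does not work interactively).
webuse abdata, clear

xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) twostep vce(robust) overid

* Arellano-Bond test: first-order autocorrelation of the differenced
* residuals is expected; higher orders should not be significant.
estat serial, ar(1/3)

* Sargan-Hansen test of the overidentifying restrictions
estat overid
```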

https://www.kripfganz.de/stata/
Comment
Anuradha Saikia

Join Date: Aug 2020

Posts: 153
#402

20 May 2022, 08:36

Hello Sebastian Kripfganz. How should I decide between xtabond2 and xtdpdgmm as the suitable command?
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2571
#403

20 May 2022, 08:59

For many situations, xtdpdgmm and xtabond2 are equivalent. Both can compute the conventional difference-GMM and system-GMM estimators. For many people, the choice between the commands may be a matter of taste. The syntax of the commands can become quite complicated given all the available options. xtdpdgmm follows the approach that you get what you type, which requires the user to have a good understanding of the econometric methods. (It is encouraged in any case that you understand the econometric methods you are using and do not just rely on some default software settings.) With xtabond2, you might have to be a bit more careful - at least in my opinion - that you really get what you want when specifying the options.

xtdpdgmm has a few extensions which you do not find in xtabond2, such as nonlinear moment conditions, the flexibility to use the within-groups transformation (analogously to the traditional fixed-effects estimator), the iterated GMM estimator, and a few additional postestimation statistics.

I am obviously not impartial. You might just give both commands a try and see how comfortable you are with them. (Feel free to feed back your experience here.) An advantage of having both commands is that you can double check your results. If implemented correctly, both commands should give you the same results.

https://www.kripfganz.de/stata/
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2571
#404

30 May 2022, 09:42

While I was doing some code improvements (which led to some speed improvement, especially when the command is called for the first time during a Stata session), I discovered a bug in the prediction of scores - postestimation command predict with option score - when xtdpdgmm computed robust standard errors with the Windmeijer correction. The WC-robust standard errors displayed in the regression output were correct, but the postestimation commands relying on the predicted scores - such as estat serial, estat hausman, suest, etc. - would produce slightly incorrect results. (The differences are not expected to be large.)

This bug has been corrected in the latest version 2.3.11, which is now available on my website:

Code:

net install xtdpdgmm, from(http://www.kripfganz.de/stata) replace

Note for replication purposes: This update might lead to slightly different results when trying to replicate previous results for the Arellano-Bond test and the generalized Hausman test.
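For replication work, it can help to record which version is installed before and after updating. A small sketch using Stata's built-in which command, which displays the location of the installed ado-file together with any version comment lines in its header:

```stata
* Shows where xtdpdgmm.ado is installed and its version comment lines
which xtdpdgmm
```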

https://www.kripfganz.de/stata/
Comment
Sarah Magd

Join Date: Feb 2022

Posts: 60
#405

31 May 2022, 02:55

Dear Prof. Kripfganz,

I am regressing the log of GDP per capita (ln_GDPc) on renewable energy consumption (REN), using the two-step sys GMM model for a sample of 26 countries over the period 2005 to 2017. The panel unit root analysis shows that ln_GDPc is stationary in first difference, whereas REN is stationary in level. This leads to two cases:
- Case 1: If I regress ln_GDPc in level on REN: the coefficient of ln_GDPc is 0.3 and statistically significant.
- Case 2: If I regress ln_GDPc in the first difference (Δln_GDPc) on REN: the coefficient of Δln_GDPc becomes 0.9 and statistically significant.

In this context, are the two cases equivalent? Should I be concerned about stationarity given the short time period I have?
Comment
