Two-Step System GMM with xtabond2 command

Malika Ennajeh

Join Date: Sep 2022

Posts: 6
#1

12 Sep 2022, 17:48

Dear Stata Experts,

I am dealing with a dynamic panel model in which the sample size is 100 firms and the time period is 5 years (2015-2020).

The model is presented below as:

ROA_it = β_0 + β_1 ROA_i(t-1) + β_3 X_it + β_4 C_it + α_i + λ_t + ε_it   (1)

WHERE:

DEPENDENT: ROA
ENDOGENOUS: l.ROA
EXPLANATORY (X): CCC, DPO, DIO, DSO
CONTROL(C): debt_rat, curr_rat, firm_size, wcreq, grwce

All specifications of Eq. (1) are estimated with the system GMM estimator, using the Stata command xtabond2. I estimate four models, one for each explanatory variable; the results for the first model are presented below:
Model 1

ROA = ROA_(t-1) + DSO + CURR_RAT + DEBT_RAT + FIRM_SIZE + WCREQ + GRWCE + λ_t

CODE: xtabond2 l(0/1).roa dso curr_rat debt_rat firm_size wcreq grwce yr2015-yr2020, gmm(roa, lag(1 5) collapse eq(diff)) gmm(dio dpo ccc curr_rat debt_rat firm_size wcreq grwce, lag(2 3) collapse equation(diff)) gmm(l2.(d.(dio dpo ccc curr_rat debt_rat firm_size wcreq grwce)), collapse equation(level)) iv( yr2015-yr2020, eq(level)) twostep nodiffsargan small

RESULTS:

xtabond2 l(0/1).roa dso curr_rat debt_rat firm_size wcreq grwce yr2015-yr2020, gmm(roa, lag(1 5)
> collapse eq(diff)) gmm(dio dpo ccc curr_rat debt_rat firm_size wcreq grwce, lag(2 3) collapse equ
> ation(diff)) gmm(l2.(d.(dio dpo ccc curr_rat debt_rat firm_size wcreq grwce)), collapse equation(l
> evel)) iv( yr2015-yr2020, eq(level)) twostep nodiffsargan small
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
yr2015 dropped due to collinearity
yr2019 dropped due to collinearity
Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: idcompany Number of obs = 500
Time variable : t Number of groups = 100
Number of instruments = 34 Obs per group: min = 5
F(11, 99) = 58911.41 avg = 5.00
Prob > F = 0.000 max = 5
------------------------------------------------------------------------------
roa | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
roa |
L1. | -.0143849 .0044405 -3.24 0.002 -.0231958 -.0055739
|
dso | .0265805 .0148492 1.79 0.077 -.0028835 .0560445
curr_rat | .0694694 .0726951 0.96 0.342 -.0747734 .2137121
debt_rat | .4320269 .2307227 1.87 0.064 -.0257769 .8898307
firm_size | -.9144968 .3667379 -2.49 0.014 -1.642184 -.1868092
wcreq | .9245246 1.918 0.48 0.631 -2.881203 4.730252
grwce | -2.497403 .8746998 -2.86 0.005 -4.232997 -.7618084
yr2016 | 1.28574 .8517638 1.51 0.134 -.4043443 2.975824
yr2017 | .0122819 .6098236 0.02 0.984 -1.19774 1.222304
yr2018 | -.969782 .5085728 -1.91 0.059 -1.978901 .0393367
yr2020 | -.5374178 .4408665 -1.22 0.226 -1.412193 .337357
_cons | 17.26591 8.101386 2.13 0.036 1.190997 33.34081
------------------------------------------------------------------------------
Warning: Uncorrected two-step standard errors are unreliable.

Instruments for first differences equation
GMM-type (missing=0, separate instruments for each period unless collapsed)
L(2/3).(dio dpo ccc curr_rat debt_rat firm_size wcreq grwce) collapsed
L(1/5).roa collapsed
Instruments for levels equation
Standard
yr2015 yr2016 yr2017 yr2018 yr2019 yr2020
_cons
GMM-type (missing=0, separate instruments for each period unless collapsed)
DL(1/4).(L2D.dio L2D.dpo L2D.ccc L2D.curr_rat L2D.debt_rat L2D.firm_size
L2D.wcreq L2D.grwce) collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -1.66 Pr > z = 0.096
Arellano-Bond test for AR(2) in first differences: z = -0.38 Pr > z = 0.706
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(22) = 21.39 Prob > chi2 = 0.497
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(22) = 12.85 Prob > chi2 = 0.937
(Robust, but weakened by many instruments.)

I have not included the robust option in my syntax. Does this compromise the results obtained? Can the output still be accepted without the robust option?

Please note: when I include the robust option, I get invalid results from the post-estimation diagnostic tests and for my endogenous as well as my explanatory variables. I would therefore like to know whether I can keep the syntax used.

I would really appreciate your feedback on the results obtained with the syntax above.

Last edited by Malika Ennajeh; 12 Sep 2022, 18:02.
Tags: dynamic panel model, gmm, panel data, regression, xtabond2
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17561
#2

12 Sep 2022, 23:57

Malika:
welcome to this forum.
Please see Sebastian Kripfganz's and David Roodman's posts on dynamic panel regression.
This does not mean that you should contact them personally (as per the FAQ, personal contacts are reserved for well-defined issues), but that you should wait for them to chime in if they are interested in your thread. Thanks.

Kind regards,
Carlo
(StataNow 18.5)
Malika Ennajeh

Join Date: Sep 2022

Posts: 6
#3

13 Sep 2022, 05:16

Hi Carlo,
Thanks very much for your response, and I do apologise for not being aware of the FAQ instructions. In the meantime, I hope to receive a helpful response to my post above.

Best Regards,

Malika
Sebastian Kripfganz

Join Date: May 2014

Posts: 2555
#4

13 Sep 2022, 11:07

Non-robust standard errors for two-step system GMM estimators can be severely biased in finite samples. This also affects postestimation test statistics. The Windmeijer correction or doubly corrected standard errors are highly recommended.
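
As a minimal sketch of this recommendation: in xtabond2, adding robust to a twostep estimation requests Windmeijer-corrected standard errors. The command below is simply the one from post #1 with robust added; the instrument choices are copied unchanged for illustration only and are not being endorsed here.

```stata
* Sketch only: command from post #1 with the robust option added.
* With twostep, robust requests Windmeijer-corrected standard errors.
* The instrument specification is copied as posted, not recommended.
xtabond2 l(0/1).roa dso curr_rat debt_rat firm_size wcreq grwce yr2015-yr2020, ///
    gmm(roa, lag(1 5) collapse eq(diff))                                       ///
    gmm(dio dpo ccc curr_rat debt_rat firm_size wcreq grwce,                   ///
        lag(2 3) collapse equation(diff))                                      ///
    gmm(l2.(d.(dio dpo ccc curr_rat debt_rat firm_size wcreq grwce)),          ///
        collapse equation(level))                                              ///
    iv(yr2015-yr2020, eq(level))                                               ///
    twostep robust nodiffsargan small
```

With robust specified, the coefficient table header reads "Corrected std. err." and the warning about uncorrected two-step standard errors no longer appears.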

You might also find the following presentation useful:
Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.

https://www.kripfganz.de/stata/
Malika Ennajeh

Join Date: Sep 2022

Posts: 6
#5

13 Sep 2022, 19:51

Dear Prof Sebastian,

Thanks very much for the valuable input. Having checked the materials, I have redesigned my syntax by inserting the robust option. I would really appreciate your feedback on the extracted output.

The code used: xtabond2 l(0/1).roa dso curr_rat debt_rat firm_size wcreq grwce yr2015-yr2020, gmm(roa, lag(1 2) eq(diff)) gmm(firm_size curr_rat, lag(4 .) collapse equation(diff)) gmm(l3.(d.(firm_size dpo wcreq)), collapse equation(level)) iv(yr2015-yr2020, eq(level)) twostep nodiffsargan robust small

Including the robust option makes my lagged endogenous variable statistically insignificant, and I am not sure what adjustments I can make so that it becomes significant. Where can I change my model?
Sebastian Kripfganz

Join Date: May 2014

Posts: 2555
#6

14 Sep 2022, 05:55

We should not manipulate our estimators with the primary aim to obtain statistically significant regressors. If there are indications that the model is misspecified or the estimator is inconsistent/inefficient, then we should look for appropriate remedies. Sometimes, this indeed means that estimates turn statistically significant (or the other way round). But statistical insignificance by itself is not a reason to change the model. Absence of evidence of an effect can be a valuable result on its own.

Regarding your model specification: It is a bit odd to only start with the third or fourth lag for some of your instruments. For an endogenous variable, usually the second lag is already a valid instrument. Higher-order lags tend to become weak instruments.

https://www.kripfganz.de/stata/
Malika Ennajeh

Join Date: Sep 2022

Posts: 6
#7

16 Sep 2022, 07:44

Dear Prof. Sebastian,

I have redesigned my model following your advice: I have reduced the number of instruments and started with the second lag. The results are as follows:

xtabond2 l(0/1).roa dso dpo dio ccc curr_rat debt_rat firm_size wcreq grwce yr2015-yr
> 2020, gmm(roa, lag (2 4) collapse eq(diff)) gmm(dso dpo firm_size wcreq grwce, lag(2
> 3) collapse equation(diff)) gmm(l2.(d.(dso dpo firm_size wcreq grwce)), collapse equa
> tion(level)) iv( yr2015-yr2020, eq(level)) twostep nodiffsargan robust small
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, p
> erm.
yr2015 dropped due to collinearity
yr2017 dropped due to collinearity
Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate optimal weighting matrix for two-step estimat
> ion.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: id Number of obs = 500
Time variable : t Number of groups = 100
Number of instruments = 23 Obs per group: min = 5
F(14, 99) = 8.74 avg = 5.00
Prob > F = 0.000 max = 5
------------------------------------------------------------------------------
| Corrected
roa | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
roa |
L1. | -.0123236 .0056476 -2.18 0.031 -.0235297 -.0011174
|
dso | 1.556837 .5453198 2.85 0.005 .4748039 2.638869
dpo | -1.561013 .5468682 -2.85 0.005 -2.646118 -.4759078
dio | 1.520793 .5344246 2.85 0.005 .4603787 2.581207
ccc | -1.518232 .5364455 -2.83 0.006 -2.582656 -.4538078
curr_rat | -.4057959 .6963625 -0.58 0.561 -1.78753 .9759384
debt_rat | .3142142 1.111338 0.28 0.778 -1.890921 2.51935
firm_size | .0871483 .9170151 0.10 0.924 -1.732409 1.906705
wcreq | -1.970289 1.654254 -1.19 0.236 -5.252687 1.31211
grwce | -.2944963 1.982256 -0.15 0.882 -4.227723 3.63873
yr2016 | .1748829 1.305667 0.13 0.894 -2.415843 2.765609
yr2018 | -1.277724 .9920142 -1.29 0.201 -3.246095 .6906473
yr2019 | .9212488 1.230091 0.75 0.456 -1.519518 3.362015
yr2020 | -.4294958 .922071 -0.47 0.642 -2.259085 1.400093
_cons | .7146384 14.04899 0.05 0.960 -27.1616 28.59088
------------------------------------------------------------------------------
Instruments for first differences equation
GMM-type (missing=0, separate instruments for each period unless collapsed)
L(2/3).(dso dpo firm_size wcreq grwce) collapsed
L(2/4).roa collapsed
Instruments for levels equation
Standard
yr2015 yr2016 yr2017 yr2018 yr2019 yr2020
_cons
GMM-type (missing=0, separate instruments for each period unless collapsed)
DL(1/4).(L2D.dso L2D.dpo L2D.firm_size L2D.wcreq L2D.grwce) collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -3.40 Pr > z = 0.001
Arellano-Bond test for AR(2) in first differences: z = 0.52 Pr > z = 0.602
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(8) = 4.04 Prob > chi2 = 0.854
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(8) = 5.85 Prob > chi2 = 0.664
(Robust, but weakened by many instruments.)

Could you please assess the validity and specification of my model? Is there anything misspecified?
Sebastian Kripfganz

Join Date: May 2014

Posts: 2555
#8

16 Sep 2022, 10:12

It is still unconventional to have l2.d in the gmm() option for the level model. xtabond2 automatically lags and differences those instruments. It also appears a bit arbitrary that for your first set of instruments, you are using lags up to the 4th, while in your second set of instruments only up to the 3rd. Aside from that, the model diagnostics look okay.
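
To make the two points concrete, here is a hedged sketch of how the command from post #7 might look with the time-series operators removed from the level-model gmm() option and with the same lag range in both sets of difference-equation instruments. The regressors are copied unchanged; the choices lag(2 3) and lag(1 1) are assumptions made for illustration, not a specification taken from this thread.

```stata
* Illustrative revision of the command in post #7 (assumed lag choices).
* The level-equation gmm() lists plain variable names because xtabond2
* applies the differencing and lagging itself.
xtabond2 l(0/1).roa dso dpo dio ccc curr_rat debt_rat firm_size wcreq grwce ///
        yr2015-yr2020,                                                      ///
    gmm(roa, lag(2 3) collapse eq(diff))                                    ///
    gmm(dso dpo firm_size wcreq grwce, lag(2 3) collapse eq(diff))          ///
    gmm(dso dpo firm_size wcreq grwce, lag(1 1) collapse eq(level))         ///
    iv(yr2015-yr2020, eq(level))                                            ///
    twostep robust nodiffsargan small
```

The instrument listing printed below the coefficient table shows which lags and differences were actually used, so it is worth checking that listing against what was intended.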

https://www.kripfganz.de/stata/
Malika Ennajeh

Join Date: Sep 2022

Posts: 6
#9

17 Sep 2022, 12:33

Thanks for your response, Prof Sebastian. Should the number of lags be equal in both sets of instruments? Also, do you mean that I should have started with l.d in the gmm() option?
Malika Ennajeh

Join Date: Sep 2022

Posts: 6
#10

20 Sep 2022, 06:58

Dear Prof Sebastian,

Could you please assess the validity of the instruments used as well as the syntax? I have set the lag limit for the two sets of instruments to 3.

xtabond2 l(0/1).roa dso curr_rat debt_rat firm_size wcreq grwce yr2015-yr20
> 20, gmm(roa, lag(1 3) eq(diff)) gmm(ccc curr_rat grwce, lag(1 3) collapse
> equation(diff)) gmm(l.(d.(wcreq grwce curr_rat dio debt_rat ccc)), collapse
> equation(level)) iv(yr2015-yr2020, eq(level))twostep nodiffsargan robust sma
> ll
Favoring space over speed. To switch, type or click on mata: mata set matafavor
> speed, perm.
yr2015 dropped due to collinearity
yr2019 dropped due to collinearity
Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate optimal weighting matrix for two-ste
> p estimation.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: id Number of obs = 500
Time variable : t Number of groups = 100
Number of instruments = 37 Obs per group: min = 5
F(11, 99) = 41.17 avg = 5.00
Prob > F = 0.000 max = 5
------------------------------------------------------------------------------
| Corrected
roa | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
roa |
L1. | -.2585195 .1341465 -1.93 0.057 -.5246952 .0076562
|
dso | .0251723 .0099595 2.53 0.013 .0054105 .0449341
curr_rat | -.1265664 .0861503 -1.47 0.145 -.2975074 .0443745
debt_rat | -.6848794 .5761135 -1.19 0.237 -1.828014 .4582547
firm_size | -.9238737 .5457658 -1.69 0.094 -2.006791 .1590441
wcreq | 2.263362 .7989603 2.83 0.006 .6780515 3.848673
grwce | -.36334 .5643155 -0.64 0.521 -1.483064 .7563845
yr2016 | -.3969509 .4044627 -0.98 0.329 -1.199493 .405591
yr2017 | -.2364141 .4917754 -0.48 0.632 -1.212203 .7393749
yr2018 | .135768 .4597839 0.30 0.768 -.776543 1.048079
yr2020 | -.4557374 .6312617 -0.72 0.472 -1.708297 .7968227
_cons | 14.83068 10.97284 1.35 0.180 -6.941815 36.60318
------------------------------------------------------------------------------
Instruments for first differences equation
GMM-type (missing=0, separate instruments for each period unless collapsed)
L(1/3).(ccc curr_rat grwce) collapsed
L(1/3).roa
Instruments for levels equation
Standard
yr2015 yr2016 yr2017 yr2018 yr2019 yr2020
_cons
GMM-type (missing=0, separate instruments for each period unless collapsed)
DL(1/4).(LD.wcreq LD.grwce LD.curr_rat LD.dio LD.debt_rat LD.ccc)
collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = 0.80 Pr > z = 0.421
Arellano-Bond test for AR(2) in first differences: z = -1.21 Pr > z = 0.225
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(25) = 578.00 Prob > chi2 = 0.000
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(25) = 26.05 Prob > chi2 = 0.405
(Robust, but weakened by many instruments.)

Your feedback is much needed.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2555
#11

20 Sep 2022, 09:13

You would normally still not include the time-series operators LD in your last gmm() option for the level model because xtabond2 is doing that automatically. If you prefer a command with syntax following the what-you-type-is-what-you-get approach, I can (a bit selfishly) recommend my xtdpdgmm command; see the presentation slides I referenced in an earlier post above. As you are using instruments for the level model, I also recommend having a look at the difference-in-Hansen test statistics.
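
As a rough sketch of both suggestions with xtabond2: the nodiffsargan option suppresses the difference-in-Sargan/Hansen tests, so omitting it makes xtabond2 report them for the instrument subsets. The command below is the one from post #10 with nodiffsargan dropped and the l.(d.()) operators removed from the level-model gmm() option; everything else is copied as posted for illustration and is not a recommended specification.

```stata
* Sketch only: post #10 command without nodiffsargan, so the
* difference-in-Hansen tests are reported after estimation.
* Level-model instruments are listed without LD operators.
xtabond2 l(0/1).roa dso curr_rat debt_rat firm_size wcreq grwce yr2015-yr2020, ///
    gmm(roa, lag(1 3) eq(diff))                                                ///
    gmm(ccc curr_rat grwce, lag(1 3) collapse equation(diff))                  ///
    gmm(wcreq grwce curr_rat dio debt_rat ccc, collapse equation(level))       ///
    iv(yr2015-yr2020, eq(level))                                               ///
    twostep robust small
```

The additional output appears after the Hansen test, under a heading for difference-in-Hansen tests of exogeneity of instrument subsets, with one block for the level-equation GMM instruments.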

https://www.kripfganz.de/stata/