Xtabond2 for system GMM. Please help me for coding

Annur Wijayakusuma

Join Date: Mar 2020
Posts: 31

#46

09 Oct 2020, 06:09

Hi Sebastian,

I need to confirm with you. Please correct me if I am wrong.
When I need to make sure that my instrumental variables are strictly exogenous, I need to check the difference-in-Hansen test. But when I want to check whether my model is overidentified, I can check the Hansen test of overidentifying restrictions. If this is true, it means your concern above relates to the difference-in-Hansen test. I need your confirmation about this.

Below, I use equation(level) for the IV-style instruments. Here is the code:

Code:

xtabond2 ROATotAsset L.RDI L.ROATotAsset Leverage FirmsSIZE Y2007 Y2009-Y2017, gmm(RDI, lag(2 4) equation(diff)) gmm(ROATotAsset, lag(1 1) equation(level)) iv(Leverage FirmsSIZE Y2007 Y2009-Y2017, equation(level)) small twostep robust

The result is below:

Code:

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: ASX_ID                          Number of obs      =      5349
Time variable : Year                            Number of groups   =       487
Number of instruments = 50                      Obs per group: min =         7
F(14, 486)    =      6.85                                      avg =     10.98
Prob > F      =     0.000                                      max =        11
------------------------------------------------------------------------------
             |              Corrected
 ROATotAsset |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         RDI |
         L1. |   .0002143   .0002117     1.01   0.312    -.0002016    .0006302
             |
 ROATotAsset |
         L1. |   .3185845   .1487461     2.14   0.033     .0263197    .6108492
             |
    Leverage |  -.0040793   .0016696    -2.44   0.015    -.0073598   -.0007989
   FirmsSIZE |   .0166592   .0090541     1.84   0.066    -.0011308    .0344492
       Y2007 |   .0513919   .0396925     1.29   0.196    -.0265982     .129382
       Y2009 |  -.0796341   .0400902    -1.99   0.048    -.1584057   -.0008625
       Y2010 |  -.0067393   .0285693    -0.24   0.814    -.0628738    .0493952
       Y2011 |   .0256532   .0241647     1.06   0.289     -.021827    .0731334
       Y2012 |  -.0054812   .0228277    -0.24   0.810    -.0503343     .039372
       Y2013 |   -.056461   .0311257    -1.81   0.070    -.1176186    .0046965
       Y2014 |  -.1078993   .0348192    -3.10   0.002     -.176314   -.0394846
       Y2015 |  -.2262245    .104225    -2.17   0.030    -.4310118   -.0214372
       Y2016 |  -.1658626   .1111863    -1.49   0.136    -.3843278    .0526025
       Y2017 |  -.0825125    .132348    -0.62   0.533    -.3425573    .1775324
       _cons |  -.1751213   .0624856    -2.80   0.005    -.2978967    -.052346
------------------------------------------------------------------------------
Instruments for first differences equation
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(2/4).RDI
Instruments for levels equation
  Standard
    Leverage FirmsSIZE Y2007 Y2009 Y2010 Y2011 Y2012 Y2013 Y2014 Y2015 Y2016
    Y2017
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    DL.ROATotAsset
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -2.71  Pr > z =  0.007
Arellano-Bond test for AR(2) in first differences: z =  -0.40  Pr > z =  0.687
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(35)   = 281.42  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(35)   =  39.04  Prob > chi2 =  0.293
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(25)   =  23.94  Prob > chi2 =  0.523
    Difference (null H = exogenous): chi2(10)   =  15.10  Prob > chi2 =  0.129
  gmm(RDI, eq(diff) lag(2 4))
    Hansen test excluding group:     chi2(8)    =   7.82  Prob > chi2 =  0.451
    Difference (null H = exogenous): chi2(27)   =  31.22  Prob > chi2 =  0.262
  gmm(ROATotAsset, eq(level) lag(1 1))
    Hansen test excluding group:     chi2(25)   =  23.94  Prob > chi2 =  0.523
    Difference (null H = exogenous): chi2(10)   =  15.10  Prob > chi2 =  0.129
  iv(Leverage FirmsSIZE Y2007 Y2009 Y2010 Y2011 Y2012 Y2013 Y2014 Y2015 Y2016 Y2017, eq(level))
    Hansen test excluding group:     chi2(23)   =  21.12  Prob > chi2 =  0.574
    Difference (null H = exogenous): chi2(12)   =  17.92  Prob > chi2 =  0.118



I want the instrumental variables to be in levels, not in growth rates or differences.

I appreciate your help.

Comment

Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#47

09 Oct 2020, 06:57

These results already look better. Loosely speaking, both the Hansen test and the Difference-in-Hansen test are tests of the validity of the (extra) instruments. The Hansen test checks the validity of all instruments (maintaining the assumption that there are at least as many valid instruments available as needed to identify all coefficients). The Difference-in-Hansen tests check a particular subset of the instruments.
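
As a rough arithmetic illustration of how the two tests relate, each "Difference" line in the output of #46 is the Hansen statistic for all instruments minus the Hansen statistic that excludes the subset, with the degrees of freedom differenced the same way. A minimal sketch using the reported (two-decimal) numbers for the levels GMM instruments. The scalar names are purely illustrative:

```stata
* Numbers copied from the output in #46 (rounded to two decimals there)
* Hansen test with all instruments: chi2(35) = 39.04
scalar hansen_full = 39.04
* Hansen test excluding the GMM instruments for levels: chi2(25) = 23.94
scalar hansen_excl = 23.94

* Difference statistic and its degrees of freedom
scalar diff_stat = hansen_full - hansen_excl
scalar diff_df   = 35 - 25

* chi2tail(df, x) is the upper-tail chi-squared probability
display "Difference statistic: chi2(" diff_df ") = " %5.2f diff_stat
display "p-value: " %5.3f chi2tail(diff_df, diff_stat)
```

This should give chi2(10) = 15.10 and a p-value close to the reported 0.129. Any small gap in the last digit would come from working with the rounded statistics.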

https://www.kripfganz.de/stata/
Comment
Annur Wijayakusuma

Join Date: Mar 2020

Posts: 31
#48

09 Oct 2020, 07:35

Hi Sebastian,

Thank you very much for your explanation. It helps me a lot, as I was confused while reading about and trying to understand the Hansen test.
Comment
Abu Hamza

Join Date: Jan 2021

Posts: 1
#49

04 Jan 2021, 06:15

Originally posted by Eric de Souza

This may help you start:
Stata comes with a built-in command called xtabond for dynamic panel data modelling. The command that we shall use was developed by David Roodman of the Center for Global Development. It is called xtabond2 and can be downloaded from within Stata with the command ssc install xtabond2.
1. Before using xtabond2 do not forget to xtset your data:
xtset panelid timeseriesid
where panelid is the variable identifying the “individual” and timeseriesid is the variable identifying the date.
2. xtabond2 first requires the name of the dependent variable followed by the list of explanatory variables. For instance (the data come from abdata.dta, loaded with -webuse abdata-):
xtabond2 n L.n L2.n w L.w L(0/2).(k ys) yr*
n is the dependent variable (firm’s employment in log)
the other variables are the explanatory variables (the constant term is assumed)
L.n is n lagged once; L2.n is n lagged twice; w is firm’s wage level in log; L.w is w lagged once;
L(0/2) is short for lagged zero times, once and twice; the variables concerned follow in parentheses after a full stop. Consequently, L(0/2).(k ys) stands for k, L.k, L2.k, ys, L.ys and L2.ys.
yr* stands for all the variables in the dataset beginning with yr.
After the names of the dependent variable and the explanatory variables comes a comma, and the options follow the comma.
3. It is almost always useful to add the options robust and small.
xtabond2 n L.n L2.n w L.w L(0/2).(k ys) yr*, robust small
4. The default is the one-step GMM estimator. If one wants a two-step GMM estimator, add twostep.
5. The default is the Blundell-Bond system estimator, which adds moment conditions for the levels equation. If one wants only the difference estimator, add noleveleq.
6. Now comes the list of instruments, both IV style instruments (the same variable for all the observations) and GMM style instruments (different variables for different observations). Note that there must be at least as many instruments as there are explanatory variables. Exogenous variables are their own instruments and must be listed here, either under ivstyle (strictly exogenous) or under gmmstyle (predetermined or endogenous).
6a. gmmstyle
For example, gmmstyle(L.(n w k)) states that all lagged values of L.n, L.w and L.k are to be treated as GMM instruments. To be clear, this means for n that L2.n, L3.n, and so on are the GMM instruments. Similarly, for w and for k.
For example, gmmstyle(L2.n L.(w k)) states that all lagged values of L2.n, L.w and L.k are to be treated as GMM instruments. Thus, for n, L3.n, L4.n, and so on are the GMM instruments, and for w and k, from the second lag on.
6b. ivstyle
For example, ivstyle(L(0/2).ys yr*) states that ys, L.ys, L2.ys and all the yr variables are to be considered as exogenous instruments.

Putting everything together we get:
xtabond2 n L(1/2).n L(0/1).w L(0/2).(k ys) yr*, gmmstyle(L2.n L.(w k)) ivstyle(L(0/2).ys yr*) robust small

7. We can, and mostly should, impose limits on the lags used as instruments in order to reduce the number of instruments.
xtabond2 n L(1/2).n L(0/1).w L(0/2).(k ys) yr*, ///
gmmstyle(L2.n L.(w k), laglimits(1 3)) ivstyle(L(0/2).ys yr*) ///
robust small
The above command says to take as GMM instruments L2.n, L.w and L.k lagged once, twice and three times. (The /// line continuations work in a do-file; in the Command window, type the command on a single line.)
Note that gmmstyle(L2.n) is equivalent to gmmstyle(n, laglimits(3 .)), the dot indicating to go backwards to the start of the sample.
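
To see how much the lag limits in step 7 actually reduce the instrument count, one can compare the count that xtabond2 reports after each run. The following is only a sketch on the abdata example, meant to be run from a do-file. It assumes xtabond2 is installed and that the instrument count and number of groups are available as the stored results e(j) and e(N_g), as listed in the xtabond2 help file. The collapse suboption is an addition that is not part of the tutorial above, and the scalar names are illustrative:

```stata
* Requires: ssc install xtabond2
webuse abdata, clear
xtset id year

* Step 7 specification: GMM-style lags restricted with laglimits(1 3)
xtabond2 n L(1/2).n L(0/1).w L(0/2).(k ys) yr*, ///
    gmmstyle(L2.n L.(w k), laglimits(1 3)) ///
    ivstyle(L(0/2).ys yr*) robust small
scalar j_limited = e(j)

* Same specification, with the GMM-style instruments also collapsed
xtabond2 n L(1/2).n L(0/1).w L(0/2).(k ys) yr*, ///
    gmmstyle(L2.n L.(w k), laglimits(1 3) collapse) ///
    ivstyle(L(0/2).ys yr*) robust small
scalar j_collapsed = e(j)

* Compare the instrument counts with the number of groups
display "Instruments, laglimits(1 3):            " j_limited
display "Instruments, laglimits(1 3) + collapse: " j_collapsed
display "Number of groups:                       " e(N_g)
```

A common rule of thumb is to keep the number of instruments below the number of groups. Whether collapsing is appropriate in a given application is a modelling decision that this sketch does not settle.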

Dear all,

Please forgive me for jumping in on this thread, but I saw this post and had a few questions that I wanted to clarify. Reproducing the xtabond2 command below:

xtabond2 n L(1/2).n L(0/1).w L(0/2).(k ys) yr*, ///
gmmstyle(L2.n L.(w k), laglimits(1 3)) ivstyle(L(0/2).ys yr*) ///
robust small

I'm a complete novice at this, but according to my understanding, this command tells Stata to run the regression of 'n' on its first and second lags, the level and first lag of 'w', and the level, first and second lags of 'k' and 'ys', in addition to time dummies. However, the gmmstyle and ivstyle brackets that follow do not include any of the following terms:

L.n
w
k
L2.k

My question is: is this an inadvertent omission, or am I missing something here? Secondly, if I run the following regression:

xtabond2 n L.n w k y yr*, ///
gmmstyle(L.n w k, laglimits(1 3)) ivstyle(yr*) ///
robust small

but choose not to include one or more of the control variables in either the gmmstyle or ivstyle brackets after the comma (in the example above, y is not included in any of the brackets), would this be a mistake? Following from this, is it necessary to include all RHS variables in one of the two brackets? Even if I omit some of the variables, Stata still runs my regression and gives me (sometimes valid) results. Why is that so? I'm asking because I have been trying to run some GMM regressions, but my AR(2) diagnostic check is not satisfactory when all variables are added in either the gmmstyle or ivstyle brackets. The diagnostic check only becomes acceptable when one or two of the variables are omitted.

Your help would be highly appreciated.

Best regards,
Comment
Mahinda Sene

Join Date: Oct 2020

Posts: 9
#50

08 Dec 2021, 15:15

Originally posted by Celine Tran

Dear Eric, Sebastian and Roman,

First of all, thank you very much for your replies. You have given me a deeper understanding of xtabond2. I tried to write the code but, honestly, I am not confident in it. Back to my question, I need to code the following:

Lag 2 and lag 3 of the levels of the firm performance variable (lnTobin), the corporate governance variables (female, nonexe, dual, lnsize) and the control variables (fsize, lev) are employed as GMM-type instruments for the first-differenced equation. Meanwhile, first lagged differences of the firm performance, corporate governance and control variables are used as GMM-type instruments for the levels equation. Firm age (lnage) and the year dummies are treated as exogenous.

My code is: xtabond2 lnTobin L.lnTobin female nonexe dual lnsize fsize lev lnage i.year i.firmid, gmm(L.lnTobin,lag(1 2)) gmm(female nonexe dual lnsize fsize lev,lag(1 1)) iv(lnage i.year ) small two

I use i.year and i.firmid to control for year fixed effects and firm fixed effects. From your point of view, is my code reasonable or not?

Also, I have 1355 observations and the number of instruments is 247. Is that too many instruments?

Moreover, choosing lag 2 and lag 3 as I am doing depends on the particular study, right? I mean, can I choose any lags as long as system GMM gives me good results? I am sorry if this question is stupid, but the more I read, the more confused I feel.

In addition, do all of my explanatory variables have to be listed under gmm() or iv()? I ask because, apart from the variables in my code, I want to add two variables to my original equation, after (a dummy variable that takes the value 1 after the crisis) and after*female (an interaction variable), in order to examine the effect of the crisis as a robustness check.

I look forward to hearing from you.

Regards.
Celine.

You have a mistake in your Stata command.

Your code is:
xtabond2 lnTobin L.lnTobin female nonexe dual lnsize fsize lev lnage i.year i.firmid, gmm(L.lnTobin,lag(1 2)) gmm(female nonexe dual lnsize fsize lev,lag(1 1)) iv(lnage i.year ) small two

I would correct this as:

xtabond2 lnTobin L.lnTobin female nonexe dual lnsize fsize lev lnage year*, gmm(L.lnTobin) iv(female nonexe dual lnsize fsize lev lnage year*, equation(level)) nodiffsargan twostep robust orthogonal small

Please delete the second gmm() and include those exogenous variables under iv().
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#51

10 Dec 2021, 12:21

Originally posted by Mahinda Sene

You have a mistake in your Stata command.

Your code is:
xtabond2 lnTobin L.lnTobin female nonexe dual lnsize fsize lev lnage i.year i.firmid, gmm(L.lnTobin,lag(1 2)) gmm(female nonexe dual lnsize fsize lev,lag(1 1)) iv(lnage i.year ) small two

I would correct this as:

xtabond2 lnTobin L.lnTobin female nonexe dual lnsize fsize lev lnage year*, gmm(L.lnTobin) iv(female nonexe dual lnsize fsize lev lnage year*, equation(level)) nodiffsargan twostep robust orthogonal small

Please delete the second gmm() and include those exogenous variables under iv().

You are responding to a query that is 3 years old. Moreover, I disagree with your assessment. It was not stated in the original query that those variables are strictly exogenous. Even more problematic, with your correction you are imposing the condition that all of the variables in the iv() option are uncorrelated with the unobserved "fixed effects", which may not be justified.
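
To make the last point concrete, here is a hedged sketch of the contrast, using the abdata example from earlier in the thread rather than the original poster's data (which is not available here). Treating w and k as the variables in question is purely an illustrative assumption, and the sketch is meant to be run from a do-file:

```stata
* Requires: ssc install xtabond2
webuse abdata, clear
xtset id year

* (a) iv(..., equation(level)): w and k in levels serve as instruments
*     for the levels equation, which presumes that they are
*     uncorrelated with the unobserved firm effects
xtabond2 n L.n w k yr*, gmmstyle(L.n) ///
    ivstyle(w k, equation(level)) ivstyle(yr*) twostep robust small

* (b) iv(..., equation(diff)): w and k serve as instruments only for
*     the first-differenced equation, where the firm effects drop out
xtabond2 n L.n w k yr*, gmmstyle(L.n) ///
    ivstyle(w k, equation(diff)) ivstyle(yr*) twostep robust small
```

Variant (b) still treats w and k as strictly exogenous with respect to the idiosyncratic error. If that is not justified either, they would belong in a gmm() option with suitable lag limits instead.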

https://www.kripfganz.de/stata/
Comment
Mohammad Al-Tamimi

Join Date: Jun 2022

Posts: 3
#52

19 Jul 2022, 21:47

Originally posted by Sebastian Kripfganz

Why do you specify the if year1!=0 condition?

You need to specify the industry dummies in the list of instruments as well.

There is still perfect collinearity among the dummy variables. Please provide the Stata output of the following command after the estimation:

Code:

sum year if e(sample)

Thanks.
Comment
Nariman Sayed

Join Date: Oct 2022

Posts: 29
#53

18 Jan 2023, 01:27

Hi Sebastian, could you please help me with this thread:

https://www.statalist.org/forums/for...gmm-estimation

Much appreciated!
Comment
Mrisho Rajabu Mrisho

Join Date: Feb 2023

Posts: 4
#54

01 Aug 2023, 01:51

Hello there,

I am estimating the effects of Islamic banks and conventional banks on the underground economy within the OIC nations.

I then set up dummies for Islamic banks called islamicoic and dummies for non-Islamic banks called nonislamicoic.

Code:
xtabond2 se L.se ATM CBBRNCH10K DEP1KSWTCHCB BORRWERZ1KCB DOMCREDPRVTSECGDP CAPTOASSETRATIO fxreal Taxes gdppercapitagrowthannualnygdppca, robust nomata iv(L2.ATM L2.CBBRNCH10K L2.DEP1KSWTCHCB L2.BORRWERZ1KCB L2.DOMCREDPRVTSECGDP L2.CAPTOASSETRATIO L2.fxreal L2.Taxes L2.gdppercapitagrowthannualnygdppca ) gmm(L.se l.islamicoıc,collapse)

These are the results I got. Are they correct?

xtabond2 se L.se ATM CBBRNCH10K DEP1KSWTCHCB BORRWERZ1KCB DOMCREDPRVTSECGDP CAPTOASSETRATIO fxreal Taxes gdppercapit
> agrowthannualnygdppca, robust nomata iv(L2.ATM L2.CBBRNCH10K L2.DEP1KSWTCHCB L2.BORRWERZ1KCB L2.DOMCREDPRVTSECGDP L2
> .CAPTOASSETRATIO L2.fxreal L2.Taxes L2.gdppercapitagrowthannualnygdppca ) gmm(L.se l.islamicoıc,collapse)
Building GMM instruments...
1 instrument(s) dropped because of collinearity.
Estimating.
Warning: Two-step estimated covariance matrix of moment conditions is singular.
Number of instruments may be large relative to number of groups.
Using a generalized inverse to calculate robust weighting matrix for Hansen test.
Performing specification tests.

Dynamic panel-data estimation, one-step system GMM
--------------------------------------------------------------------------------------------------
Group variable: idc                             Number of obs      =       735
Time variable : year                            Number of groups   =        49
Number of instruments = 41                      Obs per group: min =        15
Wald chi2(9)  =   3697.57                                      avg =     15.00
Prob > chi2   =     0.000                                      max =        15
--------------------------------------------------------------------------------------------------
                                 |               Robust
                              se |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------------------+----------------------------------------------------------------
                              se |
                             L1. |   .9678087   .0332414    29.11   0.000     .9026567    1.032961
                                 |
                             ATM |  -.0002712   .0008431    -0.32   0.748    -.0019237    .0013814
                      CBBRNCH10K |   .0001454   .0006709     0.22   0.828    -.0011695    .0014603
                    DEP1KSWTCHCB |   .0007113   .0007096     1.00   0.316    -.0006795    .0021021
                    BORRWERZ1KCB |    .002055   .0007807     2.63   0.008     .0005248    .0035852
               DOMCREDPRVTSECGDP |   .0004649   .0005548     0.84   0.402    -.0006224    .0015523
                 CAPTOASSETRATIO |  -.0030898   .0013637    -2.27   0.023    -.0057626   -.0004171
                          fxreal |   .0030857   .0014562     2.12   0.034     .0002316    .0059397
                           Taxes |  -.0007782    .001002    -0.78   0.437    -.0027421    .0011857
gdppercapitagrowthannualnygdppca |  -.0890062   .0333268    -2.67   0.008    -.1543256   -.0236869
                           _cons |   .7507978   1.271021     0.59   0.555    -1.740357    3.241953
--------------------------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(L2.ATM L2.CBBRNCH10K L2.DEP1KSWTCHCB L2.BORRWERZ1KCB
    L2.DOMCREDPRVTSECGDP L2.CAPTOASSETRATIO L2.fxreal L2.Taxes
    L2.gdppercapitagrowthannualnygdppca)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/.).(L.se L.islamicoıc) collapsed
Instruments for levels equation
  Standard
    _cons
    L2.ATM L2.CBBRNCH10K L2.DEP1KSWTCHCB L2.BORRWERZ1KCB L2.DOMCREDPRVTSECGDP
    L2.CAPTOASSETRATIO L2.fxreal L2.Taxes L2.gdppercapitagrowthannualnygdppca
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.(L.se L.islamicoıc) collapsed
--------------------------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -3.68  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =  -0.90  Pr > z =  0.366
--------------------------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(30)   =  40.85  Prob > chi2 =  0.089
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(30)   =  31.85  Prob > chi2 =  0.375
  (Robust, but weakened by many instruments.)
Comment
Sougata Mondal

Join Date: Dec 2024

Posts: 5
#55

23 Dec 2024, 23:54

Experts, kindly check whether my command is written correctly. I'm getting a favorable result with the following command:

xtabond2 roce_winsor l.roce_winsor l.esg_winsor debttoequity_winsor logofage_winsor logofta_winsor ind y*, gmm(l.roce_winsor l.esg_winsor debttoequity_winsor logofage_winsor logofta_winsor ind y*, lag(1 4)) iv(l.esg_winsor debttoequity_winsor logofage_winsor logofta_winsor ind y*) small robust twostep artest(3)

Here, my dependent variable is roce_winsor and my independent variable of interest is l.esg_winsor.
Please help.
Comment
