
  • Alright, Prof. Kripfganz. Thanks for this clarification. I think I should rely on theoretical arguments to decide how to categorise the variables. Nonetheless, it would be helpful if you could mention the slide numbers I should look at in your 2019 London Stata Conference presentation.

    Comment


    • Hi,

      Using xtdpdgmm, I am examining the influence of macroeconomic variables on the speed of adjustment of corporate cash holdings for an unbalanced panel of 1,696 firms over the period 2001-16. To capture the effect of a macroeconomic variable (say, GDP) on the speed of adjustment of corporate cash holdings, I use an interaction term between GDP and lagged cash holdings. The model is given below.

      Code:
      xtdpdgmm CashHoldings1 L.CashHoldings1 Size1 Leverage1 Liquidity1 GrowthPotential2 c.GDPGrowthRatein#c.L.CashHoldings1, teffects twostep vce(cluster CompanyID) gmmiv(L.CashHoldings1, lag(0 6) model(fodev)) gmmiv(c.GDPGrowthRatein#c.L.CashHoldings1, lag(1 1) model(fodev)) gmmiv(Leverage1, lag(1 1) model(fodev)) gmmiv(Liquidity1, lag(1 1) model(fodev)) gmmiv(GrowthPotential2, lag(1 1) model(fodev)) gmmiv(Size1, lag(1 1) coll model(fodev)) nofootnote
      As per my understanding, when we introduce an interaction term between X1 and X2 in a regression model, it is important to include X1 and X2 individually as explanatory variables in order to avoid model misspecification and the resulting bias in the estimates. However, in my model, one of the variables involved in the interaction is GDP, i.e. a cross-sectionally invariant variable. Since I include time dummies in my model to control for the effect of such macroeconomic variables, I am unsure whether I should explicitly include GDP as an explanatory variable. If I do include GDP, omitted-variable bias may affect my estimates because GDP is potentially correlated with other macroeconomic variables (which I do not include in my model). On the other hand, if I do not include GDP, I am sceptical about the validity of the coefficient of the interaction term.

      Thanks!

      Comment


      • Originally posted by Prateek Bedi View Post
        I think I should rely on theoretical arguments to decide how to categorise the variables. Nonetheless, it would be helpful if you could mention the slide numbers I should look at in your 2019 London Stata Conference presentation.
        Slides 90 and following.

        Originally posted by Prateek Bedi View Post
        As per my understanding, when we introduce an interaction term between X1 and X2 in a regression model, it is important to include X1 and X2 individually as explanatory variables in order to avoid model misspecification and the resulting bias in the estimates. However, in my model, one of the variables involved in the interaction is GDP, i.e. a cross-sectionally invariant variable. Since I include time dummies in my model to control for the effect of such macroeconomic variables, I am unsure whether I should explicitly include GDP as an explanatory variable. If I do include GDP, omitted-variable bias may affect my estimates because GDP is potentially correlated with other macroeconomic variables (which I do not include in my model). On the other hand, if I do not include GDP, I am sceptical about the validity of the coefficient of the interaction term.
        As you correctly observed, you cannot include GDP as a regressor because it would be perfectly collinear with the time dummies. The effect of GDP is implicitly included in the time effects, so there is no concern about the validity of the interaction term. However, you cannot identify the marginal effect of GDP in this model. You can only identify the effect that GDP has on the effect of the other variable in the interaction term.
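        For illustration only (this sketch is not part of the original reply): after the estimation above, the coefficient on lagged cash holdings implied at a particular GDP growth rate can be computed with lincom from the stored results. The value 0.02 and the coefficient names below are placeholders; the exact names should be checked with matrix list e(b).

        Code:
        * Sketch: coefficient on L.CashHoldings1 implied when GDPGrowthRatein = 0.02.
        * Verify the interaction term's coefficient name with -matrix list e(b)- first.
        lincom _b[L.CashHoldings1] + 0.02*_b[c.GDPGrowthRatein#cL.CashHoldings1]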
        https://twitter.com/Kripfganz

        Comment


        • Thanks a lot, Prof. Kripfganz, for your response. I really appreciate it.

          Comment


          • Thank you so much, Prof. Kripfganz, for helping us understand GMM and more. I take the liberty of asking some questions that will hopefully help me mitigate the autocorrelation problem in my models.
            Let y be the dependent variable; I am interested in the relation between y and the interaction between x1 and x2 (inter_x1_x2). All the other variables (x3 onwards) are standard control variables used in the literature. The data cover 1992-2015. The variable y is a firm's corporate social performance, and I expect it to be correlated with prior social performance. _y* denotes all the year dummies.

            1. xtabond2 L(0/2).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8 x9) _y*, gmmstyle(y, lag(3 .) collapse) gmmstyle(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8 x9, lag(2 .) collapse) ivstyle(_y*, equation(level)) nodif twostep
            Dynamic panel-data estimation, two-step system GMM
            ------------------------------------------------------------------------------
            Group variable: idDummy Number of obs = 18682
            Time variable : year Number of groups = 2751
            Number of instruments = 250 Obs per group: min = 1
            Wald chi2(37) = 27722.08 avg = 6.79
            Prob > chi2 = 0.000 max = 20
            ------------------------------------------------------------------------------
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .5661621 .0122728 46.13 0.000 .5421078 .5902164
            L2. | .1574914 .0131169 12.01 0.000 .1317828 .1831999
            |
            x1 |
            L1. | -.0194 .0085448 -2.27 0.023 -.0361475 -.0026526
            |
            x2 |
            L1. | .0040796 .0159145 0.26 0.798 -.0271123 .0352715
            |
            cL.x1#cL.x2 | .0053554 .0037109 1.44 0.149 -.0019179 .0126286

            Instruments for first differences equation
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            L(2/21).(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8 x9) collapsed
            L(3/21).y collapsed
            Instruments for levels equation
            Standard
            _year1 _year2 _year3 _year4 _year5 _year6 _year7 _year8 _year9 _year10
            _year11 _year12 _year13 _year14 _year15 _year16 _year17 _year18 _year19
            _year20 _year21 _year22 _year23 _year24 _year25
            _cons
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            DL.(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8 x9) collapsed
            DL2.y collapsed
            ------------------------------------------------------------------------------
            Arellano-Bond test for AR(1) in first differences: z = -15.98 Pr > z = 0.000
            Arellano-Bond test for AR(2) in first differences: z = -11.22 Pr > z = 0.000
            ------------------------------------------------------------------------------
            Sargan test of overid. restrictions: chi2(212) =2036.97 Prob > chi2 = 0.000
            (Not robust, but not weakened by many instruments.)
            Hansen test of overid. restrictions: chi2(212) = 410.63 Prob > chi2 = 0.000
            (Robust, but weakened by many instruments.)


            The AR(2) test rejects the null hypothesis of no second-order correlation, and the Hansen test of the overidentifying restrictions is rejected as well.


            2. xtdpdgmm L(0/2).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8 x9) _y*, gmm(y, lag(3 .) collapse) gmm(x1 x2 x3 x4 x5 x6 x7 x8 x9, lag(2 .) collapse) iv(_y*) twostep vce(robust)

            Group variable: idDummy Number of obs = 18682
            Time variable: year Number of groups = 2751

            Moment conditions: linear = 219 Obs per group: min = 1
            nonlinear = 0 avg = 6.790985
            total = 219 max = 20

            (Std. Err. adjusted for 2,751 clusters in idDummy)
            ------------------------------------------------------------------------------
            | WC-Robust
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .7921867 .039236 20.19 0.000 .7152856 .8690879
            L2. | .1314568 .0368142 3.57 0.000 .0593024 .2036112
            |
            x1 |
            L1. | -.0079019 .0565286 -0.14 0.889 -.118696 .1028922
            |
            x2 |
            L1. | .0968249 .0173995 5.56 0.000 .0627226 .1309272
            |
            cL.x1#cL.x2 | .0423953 .0448359 0.95 0.344 -.0454814 .130272


            estat serial, ar(1/3)

            Arellano-Bond test for autocorrelation of the first-differenced residuals
            H0: no autocorrelation of order 1: z = -11.3669 Prob > |z| = 0.0000
            H0: no autocorrelation of order 2: z = -8.6879 Prob > |z| = 0.0000
            H0: no autocorrelation of order 3: z = 7.8412 Prob > |z| = 0.0000

            . estat overid

            Sargan-Hansen test of the overidentifying restrictions
            H0: overidentifying restrictions are valid

            2-step moment functions, 2-step weighting matrix chi2(187) = 330.4328
            Prob > chi2 = 0.0000

            2-step moment functions, 3-step weighting matrix chi2(187) = 321.9168
            Prob > chi2 = 0.0000


            3. When I omit x9, which is the firm’s age, I obtain the following:


            3. xtabond2 L(0/3).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8) _y*, gmmstyle(y, lag(4 .) collapse) gmmstyle(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8, lag(2 .) collapse) ivstyle(_y*, equation(level)) nodif twostep

            Dynamic panel-data estimation, two-step system GMM
            ------------------------------------------------------------------------------
            Group variable: idDummy Number of obs = 15839
            Time variable : year Number of groups = 2411
            Number of instruments = 227 Obs per group: min = 1
            Wald chi2(37) = 10639.70 avg = 6.57
            Prob > chi2 = 0.000 max = 19
            ------------------------------------------------------------------------------
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .6727764 .0169075 39.79 0.000 .6396383 .7059144
            L2. | -.4911847 .0380864 -12.90 0.000 -.5658326 -.4165368
            L3. | .687692 .0382205 17.99 0.000 .6127812 .7626028
            |
            x1 |
            L1. | -.0163344 .0102129 -1.60 0.110 -.0363513 .0036826
            |
            x2 |
            L1. | .0207565 .0182278 1.14 0.255 -.0149693 .0564822
            |
            cL.x1#cL.x2 | .004833 .0042823 1.13 0.259 -.0035602 .0132262


            Instruments for first differences equation
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            L(2/21).(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8) collapsed
            L(4/21).y collapsed
            Instruments for levels equation
            Standard
            _year1 _year2 _year3 _year4 _year5 _year6 _year7 _year8 _year9 _year10
            _year11 _year12 _year13 _year14 _year15 _year16 _year17 _year18 _year19
            _year20 _year21 _year22 _year23 _year24 _year25
            _cons
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            DL.(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8) collapsed
            DL3.y collapsed
            ------------------------------------------------------------------------------
            Arellano-Bond test for AR(1) in first differences: z = -16.76 Pr > z = 0.000
            Arellano-Bond test for AR(2) in first differences: z = 6.11 Pr > z = 0.000
            ------------------------------------------------------------------------------
            Sargan test of overid. restrictions: chi2(189) = 916.54 Prob > chi2 = 0.000
            (Not robust, but not weakened by many instruments.)
            Hansen test of overid. restrictions: chi2(189) = 300.72 Prob > chi2 = 0.000
            (Robust, but weakened by many instruments.)


            4. xtdpdgmm L(0/3).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8 ) _y* , gmm(y, lag(4 .) collapse) gmm(x1 x2 x3 x4 x5 x6 x7 x8 , lag(2 .) collapse) iv(_y*) twostep vce(robust)

            Generalized method of moments estimation

            Fitting full model:
            Step 1 f(b) = .54287407
            Step 2 f(b) = .11860463

            Group variable: idDummy Number of obs = 15839
            Time variable: year Number of groups = 2411

            Moment conditions: linear = 197 Obs per group: min = 1
            nonlinear = 0 avg = 6.569473
            total = 197 max = 19

            (Std. Err. adjusted for 2,411 clusters in idDummy)
            ------------------------------------------------------------------------------
            | WC-Robust
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .8571209 .051009 16.80 0.000 .757145 .9570967
            L2. | -.4125077 .0882957 -4.67 0.000 -.5855641 -.2394513
            L3. | .4672897 .082242 5.68 0.000 .3060983 .6284811
            |
            x1 |
            L1. | -.0745491 .0457311 -1.63 0.103 -.1641805 .0150823
            |
            x2 |
            L1. | .117317 .0207835 5.64 0.000 .0765821 .158052
            |
            cL.x1#cL.x2 | .0749956 .0477307 1.57 0.116 -.0185548 .168546


            estat serial, ar(1/3)

            Arellano-Bond test for autocorrelation of the first-differenced residuals
            H0: no autocorrelation of order 1: z = -10.3522 Prob > |z| = 0.0000
            H0: no autocorrelation of order 2: z = 1.4967 Prob > |z| = 0.1345
            H0: no autocorrelation of order 3: z = -0.7142 Prob > |z| = 0.4751

            estat overid

            Sargan-Hansen test of the overidentifying restrictions
            H0: overidentifying restrictions are valid

            2-step moment functions, 2-step weighting matrix chi2(166) = 285.9558
            Prob > chi2 = 0.0000

            2-step moment functions, 3-step weighting matrix chi2(166) = 291.3354
            Prob > chi2 = 0.0000

            The Hansen test still rejects the overidentifying restrictions.


            I tried lags from 2 to 10 for the dependent variable, with no success. It would be very helpful if you could guide me on how to identify the conditions that would help me build acceptable models. I am unable to work out what I am missing here.

            Comment


            • You have a large number of observations, which is usually a good thing. However, with so many observations, the specification tests can already detect relatively small deviations from the null hypotheses.

              One kind of model misspecification could be that all of your regressors enter the equation lagged. If there are contemporaneous effects, omitting them could lead to serial correlation in the error term and invalidity of the instruments. Alternatively, adding more lags of the X-regressors (or further interaction terms) might help as well, not just adding more lags of the dependent variable.
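              Purely for illustration (this sketch is not part of the original reply; variable names are taken from the earlier posts, and the lag choices are placeholders that depend on the exogeneity assumptions), allowing contemporaneous effects of the controls could look roughly like this:

              Code:
              * Sketch only: add contemporaneous values of the controls via L(0/1).
              * The gmm() lag ranges are placeholders, not a recommendation.
              xtdpdgmm L(0/2).y c.L.x1##c.L.x2 L(0/1).(x3 x4 x5 x6 x7 x8 x9) _y*, ///
                  gmm(y, lag(3 .) collapse) gmm(x1 x2 x3 x4 x5 x6 x7 x8 x9, lag(2 .) collapse) ///
                  iv(_y*) twostep vce(robust)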

              There is no simple solution that always works. For a general approach to model selection, please have a look at my 2019 London Stata Conference presentation, slides 90 onwards, and the paper by Jan Kiviet referenced therein.
              https://twitter.com/Kripfganz

              Comment


              • Thank you so much for replying. I included all the variables as lagged because I am looking at the effects of variables in one year on firms' decision making in the following year.
                I will look into testing more models and may need your advice further. Thank you again.

                Comment


                • Dear all,

                  Is it possible to use xtdpdgmm if my data consist of a large N (around 2,000) and a large T (around 100)? And is it possible to include teffects or time dummies in the syntax?
                  I tried to run this:
                  Code:
                  xtdpdgmm L(0/1).newinv L.(sz lev q return cash) age, model(fodev) collapse gmm(newinv, lag(1 1) collapse) gmm(cash sz lev q return, lag(2 2) collapse diff m(l)) iv(age) teffects two nl(noserial) vce(r)
                  but the result is as follows:

                  Code:
                  . xtdpdgmm L(0/1).newinv L.(sz lev q return cash) age, model(fodev) collapse gmm(newinv, lag(1 1) collapse) gmm(cash sz lev q return, l
                  > ag(2 2) collapse diff m(l)) iv(age) teffects two nl(noserial) vce(r)
                  
                  Generalized method of moments estimation
                  
                  Fitting full model:
                  
                  Step 1:
                  initial:       f(b) =  .03516697
                  alternative:   f(b) =  12.859461
                  rescale:       f(b) =  .00848645
                  Iteration 0:   f(b) =  .00848645  
                  Iteration 1:   f(b) =  1.285e-08  
                  Iteration 2:   f(b) =  1.284e-08  
                       xtdpdgmm_opt::Xnl():  3900  unable to allocate real <tmp>[114099,113]
                        xtdpdgmm_opt::GS():     -  function returned error
                      xtdpdgmm_opt::Hinv():     -  function returned error
                         xtdpdgmm_opt::V():     -  function returned error
                                xtdpdgmm():     -  function returned error
                                   <istmt>:     -  function returned error
                  r(3900);
                  Please help.

                  Comment


                  • There appears to be a problem with insufficient memory space on your computer. This is not surprising given the dimensions of your data set. The xtdpdgmm command internally creates some matrices that will become very large with that many time periods. It is not optimized for such data sets as it is designed for estimations with relatively few time periods. Adding time effects is not a good idea in your case because this will create a huge number of extra coefficients to be estimated.
                    https://twitter.com/Kripfganz

                    Comment


                    • Thank you for replying, Prof. Kripfganz.

                      Originally posted by Sebastian Kripfganz View Post
                      There appears to be a problem with insufficient memory space on your computer. This is not surprising given the dimensions of your data set. The xtdpdgmm command internally creates some matrices that will become very large with that many time periods. It is not optimized for such data sets as it is designed for estimations with relatively few time periods. Adding time effects is not a good idea in your case because this will create a huge number of extra coefficients to be estimated.
                      In this case, how much memory space would be sufficient?

                      Also, Roodman (2009) suggests in his conclusion to include time dummies. I am afraid that my results will be biased if I do not include them. What do you think?
                      Include time dummies. The autocorrelation test and the robust estimates of the coefficient standard errors assume no correlation across individuals in the idiosyncratic disturbances. Time dummies make this assumption more likely to hold.
                      Another point: Roodman (2009) also mentions the following:
                      Apply the estimators to “small T, large N” panels. If T is large, dynamic panel bias becomes insignificant, and a more straightforward fixed-effects estimator works. Meanwhile, the number of instruments in difference and system GMM tends to explode with T. If N is small, the cluster–robust standard errors and the Arellano–Bond autocorrelation test may be unreliable.
                      In my case, is it okay to estimate with xtreg, since my data have around 100 time periods? Could you give me some suggestions? Thank you so much.

                      Comment


                      • I do not have an answer to the memory question. As you quote from Roodman's paper, T should be relatively small. With small T, adding time dummies is indeed recommended and not an issue because the number of extra coefficients remains small. With such a huge T, the dynamic panel bias arising from the inclusion of the lagged dependent variable is indeed not a concern. You could thus simply use xtreg if all other variables are exogenous! If you need to treat other variables as endogenous, you would still need instrumental variables; you could possibly use xtivreg. You would not need to use a system-GMM estimator; instruments for the first-differenced equation would be sufficient. If you still want to include a large number of time dummies, the community-contributed ivreghdfe command might be helpful, as it can absorb these dummies without actually adding a coefficient for each of them.
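                        For illustration only (this sketch is not part of the original advice; variable names come from the post above, id stands for the panel identifier, and the instruments for L.cash are purely hypothetical), the two routes could look roughly like this:

                        Code:
                        * Sketch 1: large-T fixed-effects regression if all regressors are exogenous
                        * (age is left out here since, if it increases by one each year, it is collinear
                        * with the firm and year effects).
                        xtreg newinv L.newinv L.(sz lev q return cash) i.year, fe vce(cluster id)

                        * Sketch 2: if some regressors are endogenous, instrument them and absorb the
                        * many time dummies, e.g. with the community-contributed ivreghdfe.
                        ivreghdfe newinv L.newinv L.(sz lev q return) (L.cash = L2.cash L3.cash), ///
                            absorb(id year) cluster(id)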
                        Last edited by Sebastian Kripfganz; 04 Jun 2020, 04:49.
                        https://twitter.com/Kripfganz

                        Comment


                        • Thank you, Prof. Kripfganz, for your comments and suggestions. I will try them; I really appreciate your help.

                          Comment


                          • Hi Prof. Kripfganz,

                            I am writing to ask very basic questions that may seem too primitive to you and the audience:

                             1. If I use a dependent variable with 2 lags, for instance, xtdpdgmm L(0/2).y x1 x2 x3, model(diff) gmm(y, lag(a1 .)) gmm(x1 x2 x3, lag(a2 .)), do I need to start a1 at 3 and a2 at 1? My understanding is that if I use a1=1 or 2, then the y variable (because of L(0/2)) in the equation will be endogenous with the controls. Please let me know if I am correct.

                             2. How do I decide whether to use model(diff), model(fod), or model(level)?

                             3. In a study with a large number of firm-year observations, does model(mdev) control for industry effects automatically, or do I need to add industry_effects* to the equation? When I use both the industry effects and model(mdev), very few industry effects appear and most of the industries are omitted.

                             4. I am using a four-way interaction, with one of the variables being a time indicator t (t=0 or t=1), to test difference-in-differences.
                            Can I use xtdpdgmm L(0/2).y c.x1##c.x2##c.x3##c.t, model(diff) gmm(y, lag(3 . )) gmm(c.x1##c.x2##c.x3##c.t, lag(1 . )) teffects instead of
                            xtdpdgmm L(0/2).y c.x1##c.x2##c.x3##c.t, model(diff) gmm(y, lag(3 . )) gmm(c.x1#c.x2#c.x3#c.t c.x1#c.x2#c.x3 c.x1#c.x2#c.t c.x2#c.x3#c.t c.x1#c.x3#c.t c.x1#c.x2 c.x2#c.x3 c.x1#c.x3 c.x1#c.t c.x2#c.t c.x3#c.t x1 x2 x3 , lag(1 . )) teffects ?

                             I look forward to your reply. Thank you so much for your continued support.

                            Nishant

                            Comment


                              1. No, adding more lags of the dependent variable as regressors does not mean that you also need to start with higher lags (a1) for the instruments. The reason for starting with the second lag is that the first lag is correlated with the first-differenced error term. The second lag is uncorrelated with the first-differenced error term if the errors are serially uncorrelated. This does not depend on the number of lags of the dependent variable used as regressors. In fact, the more lags of the dependent variable you use as regressors, the more likely it is that the errors are indeed serially uncorrelated.
                              2. model(fod) has the advantage that the transformed errors are still serially uncorrelated if the untransformed errors were serially uncorrelated, while model(diff) produces first-order serial correlation in the transformed error term. As long as you make sure that your instruments are uncorrelated with the transformed error term, it should not really matter which of the two model transformations you use. However, there is one additional benefit of model(fod): if your panel data set is unbalanced with gaps, model(diff) loses more observations than model(fod) does. (A short illustrative sketch follows after this list.) Regarding model(level), this model still contains the unobserved time-invariant "fixed effects" (which are removed by the other model transformations), so you need to take extra care to ensure that your instruments are uncorrelated with them. This can often be hard to justify. Please see my 2019 London Stata Conference presentation and the references therein for details: Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.
                              3. If all firms stay in the same industry throughout the entire sample, i.e. if the industry classification remains constant over time, then model(diff), model(fod), and model(mdev) all account for these effects. In fact, they account for all time-invariant effects by removing them from the transformed model. If you still obtain estimates for some industry effects, this would mean that there must be some variation over time in the industry classification or that you have combined the model(mdev) instruments with further instruments for model(level).
                              4. It looks like the two specifications should be the same, shouldn't they?
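                              As a purely illustrative sketch of the two transformations discussed in point 2 above (generic variable names; the lag choices are placeholders that depend on the exogeneity assumptions and are not part of the original reply), otherwise identical specifications would differ only in the model() option:

                              Code:
                              * Sketch only: the same specification under the first-difference transformation
                              * and under forward-orthogonal deviations (model(fodev), referred to as
                              * model(fod) above). Lag choices are placeholders.
                              xtdpdgmm L(0/2).y x1 x2 x3, model(diff) gmm(y, lag(2 .) collapse) ///
                                  gmm(x1 x2 x3, lag(1 .) collapse) teffects twostep vce(robust)
                              xtdpdgmm L(0/2).y x1 x2 x3, model(fodev) gmm(y, lag(2 .) collapse) ///
                                  gmm(x1 x2 x3, lag(1 .) collapse) teffects twostep vce(robust)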
                              https://twitter.com/Kripfganz

                              Comment


                               • Thank you so much for the explanation; it is indeed very helpful. Regarding #4 above, my apologies: c.x1##c.x2##c.x3##c.t gives the same results as the separate interaction terms. I realize that I was missing an interaction term, which is why the results were different.

                                Comment
