  • Dear Professor Sebastian,

    Thank you so much for your valuable response. What a great supervisor you are! I sincerely appreciate your assistance, Professor! I have the following questions, please.

    1) If I have specified ‘model(diff)’ as a separate option in your xtdpdgmm command line, are the following iv() options equivalent or different?

    1.1) iv(x, lag( ) diff model(level))

    1.2) iv(x, lag( ) diff)

    1.3) iv(x, lag( ) model(diff) diff)

    1.4) iv(x, lag( ) model(level))

    1.5) iv(x, lag( ))

    1.6) iv(x, lag( ) model(diff))

    1.7) iv(x, lag( ) level)

    2) Regarding the meaning of the above iv() options, given I have specified ‘model(diff)’ as a separate option in your xtdpdgmm command line, is the following meaning of iv() options correct?

    2.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

    2.2) iv(x, lag( ) diff): produces differenced instruments for the differenced model?

    2.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

    2.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

    2.5) iv(x, lag( )): produces level instruments for the differenced model?

    2.6) iv(x, lag( ) model(diff)): I do not know the meaning of this iv() option, especially since 'model(diff)' has already been specified as a separate option in the command line. I think this iv() option means: produce level instruments for the differenced model? But I do not think I can include 'model(diff)' twice in the command.

    2.7) iv(x, lag( ) level): I do not know the meaning of this iv() option.

    3) If I have not specified ‘model(diff)’ as a separate option in the xtdpdgmm command line, will the answers to the above questions (questions 1 and 2) be different? If so, how? Please!

    4) If I have specified ‘model(FOD)’ as a separate option in the xtdpdgmm command line, will the answers to the questions above (questions 1 and 2) be different?

    Thank you again for your support and effort, Professor! That made a real difference in my understanding.



    • 1) With the global option model(diff), 1.2 and 1.3 are equivalent, and 1.5 and 1.6 are equivalent; 1.7 does not exist. (An illustrative sketch follows after point 4 below.)

      2) Yes for 2.1-2.5.
      2.6) If model(diff) was already specified, then specifying it again within iv() is redundant. It does not change anything.
      2.7) This option does not exist.

      3) Without separate option model(diff), the default is model(level). Then, 1.1 and 1.2 would be equivalent, 1.4 and 1.5 would be equivalent.

      4) With separate option model(fod), none of the specifications are equivalent.
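
      For illustration only, here is a minimal sketch of the equivalences in point 1 (not from the original posts): the panel data, the placeholder variables y and x, and the lag ranges are hypothetical, and nocons is added because no level-model instruments are specified.

      * Hedged sketch: hypothetical panel data already declared with xtset; y, x and the
      * lag ranges are placeholders. With the global option model(diff), an iv() term
      * without its own model() suboption inherits the differenced model.
      xtdpdgmm L(0/1).y x, model(diff) nocons collapse two vce(robust) ///
          gmm(y, lag(2 4)) iv(x, lag(1 3) diff)              // as in 1.2
      xtdpdgmm L(0/1).y x, model(diff) nocons collapse two vce(robust) ///
          gmm(y, lag(2 4)) iv(x, lag(1 3) model(diff) diff)  // as in 1.3: same instruments as 1.2
      xtdpdgmm L(0/1).y x, model(diff) nocons collapse two vce(robust) ///
          gmm(y, lag(2 4)) iv(x, lag(1 3))                   // as in 1.5
      xtdpdgmm L(0/1).y x, model(diff) nocons collapse two vce(robust) ///
          gmm(y, lag(2 4)) iv(x, lag(1 3) model(diff))       // as in 1.6: same instruments as 1.5

      Within each pair, the two commands should report identical instrument counts and estimates on the same data.
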
      https://twitter.com/Kripfganz



      • Dear Professor Sebastian,

        Many many thanks for your swift and useful reply. I want to express my deep gratitude for the dedicated work you do day after day. Your input is so valuable, Professor!

        1) Using your xtdpdgmm command, suppose the following code is typed to apply the Difference GMM estimator:

        xtdpdgmm L(0/2).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10, model(fod) collapse gmm(y, lag(1 3)) gmm(L.x1, lag(1 3)) gmm(L.x2, lag(0 2)) gmm(L.x3, lag(0 2)) gmm(L.x4, lag(0 2)) gmm(L.x5, lag(0 2)) gmm(L.x6, lag(0 2)) gmm(L.x7, lag(0 2)) gmm(L.x8, lag(0 2)) gmm(L.x9, lag(0 2)) gmm(x10, lag(0 2)) gmm(x10, lag(0 0) model(md)) teffects two small vce(r) nocons overid

        Then, the serial correlation and overidentification tests are applied. After that, the difference-in-Hansen test is performed. Thus, I have the following questions, please!

        1.1) What is the meaning of the following findings of the row/line labelled “model(level)” in the difference-in-Hansen test?

        Table 1
                                        Excluding                  Difference
        Moment conditions          chi2     df       p        chi2     df       p
        12, model(level)        12.0861      7  0.0978      4.1102      2  0.0881
        model(diff)              0.0000      0       .     16.1962      9  0.0629
        model(level)             8.0920      6  0.0428      8.1042      3  0.0315


        Table 2
                                        Excluding                  Difference
        Moment conditions          chi2     df       p        chi2     df       p
        12, model(level)        12.0861      7  0.0397      4.1102      2  0.1364
        model(diff)              0.0000      0       .     16.1962      9  0.0629
        model(level)             8.0920      6  0.0327      8.1042      3  0.1011


        Table 3
                                        Excluding                  Difference
        Moment conditions          chi2     df       p        chi2     df       p
        12, model(level)        12.0861      7  0.1231      4.1102      2  0.0272
        model(diff)              0.0000      0       .     16.1962      9  0.0629
        model(level)             8.0920      6  0.1114      8.1042      3  0.0439


        Table 4
                                        Excluding                  Difference
        Moment conditions          chi2     df       p        chi2     df       p
        12, model(level)        12.0861      7  0.0779      4.1102      2  0.1281
        model(diff)              0.0000      0       .     16.1962      9  0.0629
        model(level)             8.0920      6  0.1015      8.1042      3  0.1026


        1.2) Do the above findings of the row/line labelled “model(level)” in the difference-in-Hansen test indicate that the variables satisfy or violate the additional Blundell-Bond assumption (sufficient: mean stationarity)?

        1.3) Do the above findings of the row/line labelled “model(level)” in the difference-in-Hansen test indicate that I can instrument the variables in the level model? Do the above findings of the row/line labelled “model(level)” in the difference-in-Hansen test indicate that I can apply the System GMM estimator?

        1.4) What do the above findings of the difference-in-Hansen test indicate regarding the classification of the variables as exogenous, predetermined, or endogenous?

        Thank you in advance for your time and hard work. I am so thankful for everything you bring to my understanding, Professor!



        • 1.1) Please see my earlier post #506 for some general information. Here, it seems that the only instruments specified for the level model are the time dummies. In this case, the difference-in-Hansen test for them is not meaningful. We cannot leave out the instruments for the time dummies.

          1.2) Because the time dummies are the only instruments for the level model, the Blundell-Bond assumption does not apply here.

          1.3) Again, this row in this case has no meaningful interpretation.
          https://twitter.com/Kripfganz



          • Dear Professor Sebastian,

            Even though I may not say it all the time, I do appreciate all that you do, Professor! I do not know what to say. Much obliged!

            1) Regarding post #506 point 6) “With the difference GMM estimator, the difference-in-Hansen test can still be useful to evaluate the validity of specific instrument sets. This could for example help to decide whether variables should be classified as endogenous, predetermined, or exogenous; see the model selection section of my presentation.”.

            Thus, I have the following questions, please!

            1.1) Does it mean the difference-in-Hansen test cannot help to decide whether variables should be classified as endogenous, predetermined, or exogenous if we apply the difference-in-Hansen test with the System GMM estimator?

            1.2) Does it mean the difference-in-Hansen test cannot help to check if the variables satisfy the additional Blundell-Bond assumption (sufficient: mean stationarity) if we apply the difference-in-Hansen test with the Difference GMM estimator? Does it mean the difference-in-Hansen test cannot help to check if I can instrument the variables in the level model when we apply the difference-in-Hansen test with the Difference GMM estimator? Does it mean the difference-in-Hansen test cannot help to check if I can apply the System GMM estimator when we apply the difference-in-Hansen test with the Difference GMM estimator?

            2) Regarding post #508 point 1) “In principle, the MMSC can be used for selecting between the difference and system GMM estimator, yes. If different criteria give you different answers, I am afraid then the decision is still up to you. You will then need to weigh the benefits and shortcomings of the two estimators. As mentioned earlier, a good compromise might be the difference GMM estimator plus nonlinear moment conditions (Ahn-Schmidt).”. And regarding post #504 point 3) “… Alternatively, you could use the nonlinear Ahn and Schmidt (1995, Journal of Econometrics) estimator, which also mitigates the weak-instruments problem but does not require the additional system GMM assumptions.”.

            Thus, I have the following questions, please!

            2.1) Does it mean it is better to apply the nonlinear Ahn and Schmidt estimator? If so, are the following codes correct?

            2.2) In this code, I specified ‘model(fod)’ as a separate option in the xtdpdgmm command line and put ‘model(md)’ in the iv() option for the dummies.

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

            2.3) In this code, I put ‘model(level)’ in the iv() option for the dummies.

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(level)) iv(mn, model(level)) iv(cf, model(level)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

            2.4) In this code, I put ‘diff’ in the iv() option for the dummies.

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff) iv(mn, diff) iv(cf, diff) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

            2.5) In this code, I put ‘diff model(diff)’ in the iv() option for the dummies.

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff model(diff)) iv(mn, diff model(diff)) iv(cf, diff model(diff)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

            2.6) In this code, I put ‘model(diff)’ in the iv() option for the dummies.

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(diff)) iv(mn, model(diff)) iv(cf, model(diff)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

            2.7) In this code, I put ‘diff model(level)’ in the iv() option for the dummies.

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind, diff model(level)) iv(mn, diff model(level)) iv(cf, diff model(level)) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

            2.8) In this code, I did not put anything in the iv() option for the dummies.

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 4)) gmm(L.x1, lag(1 4)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(0 3)) gmm(x10, lag(0 2)) gmm(x10, model(md) lag(0 0)) iv(i.ind) iv(mn) iv(cf) gmm(cf*L.x1, lag(1 3)) nl(noserial) teffects two small vce(robust, dc) overid

            2.9) In this code, I specified ‘model(mdev)’ as a separate option in the xtdpdgmm command line, I put ‘model(diff)’ in the gmm() for the endogenous variables (y, L.x1), I put ‘norescale’ in the iv() for the exogenous variable (x10), I put ‘model(md)’ in the iv() option for the dummies, and I did not put any option in gmm() for the predetermined variables (L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9).

            xtdpdgmm L(0/1).y L.(x1 x2 x3 x4 x5 x6 x7 x8 x9) x10 i.ind mn cf cf*L.x1, model(mdev) collapse gmm(y, lag(2 4) model(diff)) gmm(L.x1, lag(2 4) model(diff)) gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(1 3)) iv(x10, norescale) gmm(x10, model(md) lag(0 0)) iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) gmm(cf*L.x1, lag(2 4) model(diff)) nl(noserial) teffects two small vce(robust, dc) overid

            2.10) If none of the previous codes is correct, what is the correct code I have to use in order to implement the nonlinear Ahn and Schmidt estimator using your xtdpdgmm command? Where: y is the dependent variable; L.y is the lagged dependent variable as a regressor (L.y is predetermined); L.x1 is the independent variable (L.x1 is endogenous); The control variables L.x2, L.x3, L.x4, L.x5, L.x6, L.x7, L.x8, L.x9 are predetermined; The control variable x10 (firm age) is exogenous; ind is industry dummies; mn is country dummies; cf is a dummy variable that takes the value of 1 for the 3 years 2008, 2009, and 2010; cf*L.x1 is an interaction between the dummy variable cf and the independent variable L.x1.

            2.11) To apply the nonlinear Ahn and Schmidt estimator, is it better to specify ‘model(fod)’ or ‘model(diff)’ or ‘model(mdev)’ as a separate option in the xtdpdgmm command line?

            3) What if the Difference-in-Hansen test output does not include a “model(level)” row as its last line? What does that indicate?

            4) Is it normal for all the industry dummies to be omitted if I put ‘md’ in the iv() option for the industry dummies along with not typing ‘teffects’ in the xtdpdgmm command line?

            Also, is it normal for more than one industry dummy to be omitted if I put ‘md’ in the iv() option for the industry dummies even with typing ‘teffects’ in the xtdpdgmm command line? What are the iv() options that lead the dummies to be omitted?

            Sorry to keep asking you my questions, but I would not have understood this without your assistance. Please accept my deepest gratitude. Your patience, help and effort are greatly appreciated, Professor! Thank you very much for all you do.



            • 1.1) The approach can also be used with the system GMM estimator if you are confident that the additional assumption for validity of the instruments in the level model is satisfied.

              1.2) With the difference GMM estimator alone, you cannot test the additional assumption for the system GMM estimator.

              2.1) You do not lose much with the nonlinear estimator compared to the difference GMM estimator. So, yes, it is often preferable to use the nonlinear estimator.

              2.2-2.8) All of these specifications are valid. Some of them are unusual/unconventional, e.g. 2.4).

              2.9) Here, the option gmm(L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9, lag(1 3)) would only be valid if all of these variables were strictly exogenous.

              2.10) Somewhere earlier in this thread I gave examples for different estimators, including the Ahn-Schmidt estimator.

              2.11) model(mdev) should only be specified for strictly exogenous variables. Otherwise, both model(fod) and model(diff) are fine. (Just remember that in general the admissible lags differ for the two models; e.g. for an endogenous variable, the first admissible lag is 1 with model(fod) but 2 with model(diff). An illustrative sketch follows after point 4 below.)

              3) If you do not have multiple instrument sets for model(level), then the difference-in-Hansen test does not perform a separate test for it.

              4) This might be because the industry dummies are time-invariant. Such variables can only be specified for model(level).
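
              For illustration only, a minimal sketch of points 2.11 and 4 (not from the original posts); the dependent variable y, the endogenous regressor w, the time-invariant dummies i.ind, and the lag ranges are hypothetical.

              * Ahn-Schmidt-type estimation: nonlinear moment conditions added via nl(noserial).
              * Under model(fod) the first admissible lag for the endogenous regressor w is 1 ...
              xtdpdgmm L(0/1).y w i.ind, model(fod) collapse two vce(robust) nl(noserial) ///
                  gmm(y, lag(1 4)) gmm(w, lag(1 4)) iv(i.ind, model(level))
              * ... whereas under model(diff) it is 2 (and lag 2 for the lagged dependent variable):
              xtdpdgmm L(0/1).y w i.ind, model(diff) collapse two vce(robust) nl(noserial) ///
                  gmm(y, lag(2 5)) gmm(w, lag(2 5)) iv(i.ind, model(level))
              * Time-invariant regressors such as i.ind can only be instrumented in the level model (point 4).
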
              https://twitter.com/Kripfganz



              • Dear Professor Sebastian,

                I am so thankful for what you did. You are so helpful. I do appreciate the way you are teaching and supporting me. Your assistance means a lot to me, Professor!

                1) Regarding post #514 “Here, it seems that the only instruments specified for the level model are the time dummies. In this case, the difference-in-Hansen test for them is not meaningful. We cannot leave out the instruments for the time dummies. Because the time dummies are the only instruments for the level model, the Blundell-Bond assumption does not apply here.”.

                Thus, I have the following questions, please!

                1.1) Does that mean the difference-in-Hansen test on pages 96, 109, 113, and 123 is not meaningful? Does that mean the Blundell-Bond assumption does not apply there in those tables on pages 96, 109, 113, and 123?

                1.2) At least how many variables do I have to instrument for the level model in order for the difference-in-Hansen test to be meaningful and for the Blundell-Bond assumption to be applied?

                2) Regarding the meaning of the following iv() options, given I have not specified ‘model(diff)’ as a separate option in your xtdpdgmm command line, is the following meaning of iv() options correct?

                2.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

                2.2) iv(x, lag( ) diff): produces differenced instruments for the level model?

                2.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

                2.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

                2.5) iv(x, lag( )): produces level instruments for the level model?

                2.6) iv(x, lag( ) model(diff)): produces level instruments for the differenced model?

                3) Regarding the meaning of the following iv() options, given I have specified ‘model(FOD)’ as a separate option in the xtdpdgmm command line, is the following meaning of iv() options correct?

                3.1) iv(x, lag( ) diff model(level)): produces differenced instruments for the level model?

                3.2) iv(x, lag( ) diff): produces differenced instruments for the FOD model?

                3.3) iv(x, lag( ) model(diff) diff): produces differenced instruments for the differenced model?

                3.4) iv(x, lag( ) model(level)): produces level instruments for the level model?

                3.5) iv(x, lag( )): produces level instruments for the FOD model?

                3.6) iv(x, lag( ) model(diff)): produces level instruments for the differenced model?

                4) Regarding post #506 point 6) “With the difference GMM estimator, the difference-in-Hansen test can still be useful to evaluate the validity of specific instrument sets.…”.

                Thus, which instrument sets specifically can the difference-in-Hansen test evaluate for validity when it is used with the Difference GMM estimator?

                5) To check my understanding, please, correct me if I am wrong!

                5.1) The coefficient of L.y (the lagged dependent variable) obtained from the Difference GMM estimator indicates whether the dependent variable (y) is persistent, i.e. close to a random walk, and hence whether the Difference GMM estimator behaves poorly? In particular, if the coefficient of L.y is close to 1, does that indicate that y is persistent and that the Difference GMM estimator performs poorly because of weak instruments?

                5.2) Difference-in-Hansen test with the Difference GMM checks for variables classification?

                5.3) Difference-in-Hansen test with the Difference GMM cannot check for the additional Blundell-Bond assumption (sufficient: mean stationarity)?

                5.4) Difference-in-Hansen test with the System GMM cannot check for variables classification?

                5.5) Difference-in-Hansen test with the System GMM checks for the additional Blundell-Bond assumption (sufficient: mean stationarity)?

                Many thanks for doing what you do! Your patience, help and effort are greatly appreciated, Professor!



                • Dear Prof. @Kripfganz,

                  I specify my model in a static way (i.e., without including the lagged dependent variable as a regressor).

                  1. Can we still use the sys-GMM to estimate this static regression?
                  2. How should I justify the use of the sys-GMM to estimate this static regression? (i.e., is it more efficient or robust than the 2SLS regression?)
                  3. Do I still need to report the Arellano-Bond statistics?




                  • Originally posted by Sarah Magd:
                    I specify my model in a static way (i.e., without including the lagged dependent variable as a regressor).

                    1. Can we still use the sys-GMM to estimate this static regression?
                    2. How should I justify the use of the sys-GMM to estimate this static regression? (i.e., is it more efficient or robust than the 2SLS regression?)
                    3. Do I still need to report the Arellano-Bond statistics?
                    1. Yes.
                    2. 2SLS is generally inefficient when using panel data. In any case, "2SLS" is not very informative; you would need to be clear about the instruments you are using. It then becomes a question of whether the instruments used in your sys-GMM estimator are beneficial compared to the instruments used in your 2SLS approach. In the first place, you would need to check whether they might require different assumptions for validity.
                    3. Reporting serial correlation tests is generally still useful even in static models. For one, they might tell you whether a dynamic model could be reasonable (to account for any serial correlation detected in the static model). If you have predetermined/endogenous variables, serial correlation can still invalidate the instruments in a static model.
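
                    For illustration only, a minimal sketch of a static specification with internal instruments (not the poster's actual model); y, an endogenous x1, an exogenous x2, and the lag ranges are hypothetical, and estat serial / estat overid are assumed to be the available xtdpdgmm postestimation tests.

                    * Static system GMM with internal instruments only; xtset id year assumed.
                    xtdpdgmm y x1 x2, model(diff) collapse two vce(robust) teffects overid ///
                        gmm(x1, lag(2 4)) iv(x2) ///
                        gmm(x1, lag(1 1) diff model(level)) iv(x2, diff model(level))
                    estat serial    // Arellano-Bond serial correlation tests: still informative in a static model
                    estat overid    // Hansen and difference-in-Hansen tests (the overid option stores the details)
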
                    https://twitter.com/Kripfganz



                    • Zainab Mariam

                      1.1) You would need to be more specific about which of these difference-in-Hansen tests you are referring to. Each row in the table is a separate test. For example, on page 96, the test in the row labelled "5, model(level)" is not very useful, because removing the instruments for the time dummies (while keeping the time dummies as regressors) does not make much sense and may result in serious weak-instruments problems or even underidentification.

                      1.2) I am not sure I understand the question. There is no such minimum number of variables.

                      2) Yes to all of them.

                      3) Yes to all of them.

                      4) The respective labels for the rows of the test output tell you which instruments are being tested; compare with the list of instruments below the regression output.

                      5.1) Not necessarily. A small estimate of the lagged dependent variable's coefficient might potentially be a consequence of a large bias of the difference GMM estimator if the true coefficient is close to 1. Thus, it is difficult to learn about potential problems of the estimator from actual coefficient estimates. You could again use the difference GMM estimator with additional nonlinear moment conditions, which is less prone to problems under high persistence. If that estimator yields a large estimate of the autoregressive coefficient, this might indicate potential problems of the difference GMM estimator (without such nonlinear moment conditions). And then you can also compare the two estimates; if they are very different, the estimator without the nonlinear moment conditions probably has problematic properties. (An illustrative sketch of such a comparison follows after point 5.5 below.)

                      5.2) That's the typical application, yes.

                      5.3) Correct.

                      5.4) You can still check for correct variable classification with the system GMM estimator. (Essentially, everything you can do with the difference GMM estimator, you can also do with the system GMM estimator). However, this should generally be done first with the difference GMM estimator (ideally also using nonlinear moment conditions), to avoid contamination of the tests with a potential invalidity of the mean stationarity assumption.

                      5.5) That's the typical application, yes.
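
                      For illustration only, a minimal sketch of the comparison described in 5.1 (not from the original posts); y, a predetermined regressor x, and the lag ranges are hypothetical.

                      * Difference/FOD GMM without and with the Ahn-Schmidt nonlinear moment conditions.
                      xtdpdgmm L(0/1).y x, model(fod) nocons collapse two vce(robust) ///
                          gmm(y, lag(1 3)) gmm(x, lag(0 2))
                      estimates store dgmm
                      xtdpdgmm L(0/1).y x, model(fod) nocons collapse two vce(robust) nl(noserial) ///
                          gmm(y, lag(1 3)) gmm(x, lag(0 2))
                      estimates store dgmm_nl
                      estimates table dgmm dgmm_nl, b se keep(L.y)
                      * A much larger L.y estimate under nl(noserial) would hint at weak-instrument
                      * problems of the estimator without the nonlinear moment conditions.
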
                      https://twitter.com/Kripfganz



                      • Dear Professor Sebastian,

                        Many thanks for your time and hard work. Thank you very much for helping out when I needed it. The support that you show is extremely appreciated, Professor!

                        Question 1) Suppose the following outcomes of the difference-in-Hansen test.

                        Table 1

                                                        Excluding                  Difference
                        Moment conditions          chi2     df       p        chi2     df       p
                        1, model(fodev)         38.4506     17  0.0021      5.6365      3  0.0580
                        2, model(fodev)         32.2911     17  0.0138     11.7960      3  0.0081
                        3, model(fodev)         38.2406     17  0.0023      5.8465      3  0.1193
                        4, model(fodev)         43.0624     17  0.0005      1.0247      3  0.7953
                        5, model(fodev)         42.8693     17  0.0005      1.2178      3  0.7487
                        6, model(fodev)         42.9069     17  0.0005      1.1803      3  0.7577
                        7, model(fodev)         38.2817     17  0.0022      5.8054      3  0.0775
                        8, model(fodev)         39.2517     17  0.0016      4.8354      3  0.0689
                        9, model(fodev)         39.2432     17  0.0017      4.8439      3  0.1836
                        10, model(fodev)        34.1913     17  0.0079      9.8958      3  0.0195
                        11, model(fodev)        41.1743     19  0.0023      2.9128      1  0.0879
                        12, model(mdev)         34.6609     19  0.0153      9.4262      1  0.0021
                        13, model(level)        29.6163     19  0.1070      3.0659      1  0.1087
                        14, model(level)         0.3539      1  0.1517     43.7332     19  0.0510
                        model(fodev)                  .    -11       .           .      .       .

                        Table 2 (since only the row labelled “model(level)” in the difference-in-Hansen test is needed, I kept its values and deleted the rest).
                                                        Excluding                  Difference
                        Moment conditions          chi2     df       p        chi2     df       p
                        (rows 1-13 as labelled in Table 1; values deleted)
                        14, model(level)        12.0861      7  0.0978      4.1102      2  0.0881
                        model(diff)              0.0000      0       .     16.1962      9  0.0629
                        model(level)             8.0920      6  0.0448      8.1042      3  0.0447

                        Table 3
                                                        Excluding                  Difference
                        Moment conditions          chi2     df       p        chi2     df       p
                        (rows 1-13 as labelled in Table 1; values deleted)
                        14, model(level)        12.0861      7  0.0397      4.1102      2  0.1364
                        model(diff)              0.0000      0       .     16.1962      9  0.0629
                        model(level)             8.0920      6  0.0398      8.1042      3  0.1298

                        Table 4
                                                        Excluding                  Difference
                        Moment conditions          chi2     df       p        chi2     df       p
                        (rows 1-13 as labelled in Table 1; values deleted)
                        14, model(level)        12.0861      7  0.1231      4.1102      2  0.0786
                        model(diff)              0.0000      0       .     16.1962      9  0.0629
                        model(level)             8.0920      6  0.1214      8.1042      3  0.0489

                        Table 5
                                                        Excluding                  Difference
                        Moment conditions          chi2     df       p        chi2     df       p
                        (rows 1-13 as labelled in Table 1; values deleted)
                        14, model(level)        12.0861      7  0.0779      4.1102      2  0.1281
                        model(diff)              0.0000      0       .     16.1962      9  0.0629
                        model(level)             8.0920      6  0.1095      8.1042      3  0.1087

                        I kindly ask you the following questions, please!

                        1.1) Which table(s) of the above findings of the difference-in-Hansen test indicate that the Difference GMM estimator is fine? Which table(s) of the above findings of the difference-in-Hansen test indicate that the instruments for the Difference GMM estimator are valid?

                        1.2) Which table(s) of the above findings of the difference-in-Hansen test indicate that the variables satisfy/violate the additional Blundell-Bond assumption (sufficient: mean stationarity)?

                        1.3) Which table(s) of the above findings of the difference-in-Hansen test indicate that I can instrument the variables in the level model?

                        1.4) Which table(s) of the above findings of the difference-in-Hansen test indicate that I can apply the System GMM estimator?

                        1.5) Which table(s) of the above outcomes of the difference-in-Hansen test indicate that the Difference GMM estimator is superior to the System GMM estimator?

                        1.6) Which table(s) of the above outcomes of the difference-in-Hansen test indicate that the System GMM estimator is superior to the Difference GMM estimator? Which of the above findings of the difference-in-Hansen test indicate that the additional instruments for the level model are valid?

                        1.7) What do the above findings of the difference-in-Hansen test indicate regarding the classification of the variables as exogenous, predetermined, or endogenous?

                        1.8) In case I can apply the System GMM estimator, are the following codes correct to apply the System GMM estimator using your xtdpdgmm command?

                        A) In this code, I specified ‘model(fod)’ as a separate option in the xtdpdgmm command line, I instrument all the variables (except the dummies) for the differenced model and all the variables (including the dummies) for the level model, and I put ‘model(level)’ in the iv() option for the dummies, as follows.

                        xtdpdgmm L(0/1).y L(0/1).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 3)) gmm(L.x1, lag(1 3)) gmm(L.x2, lag(0 2)) gmm(L.x3, lag(0 2)) gmm(L.x4, lag(0 2)) gmm(L.x5, lag(0 2)) gmm(L.x6, lag(0 2)) gmm(L.x7, lag(0 2)) gmm(L.x8, lag(0 2)) gmm(L.x9, lag(0 2)) gmm(x10, lag(0 2)) gmm(x10, lag(0 0) model(md)) gmm(cf*L.x1, lag(1 3)) gmm(y, lag(1 1) diff model(level)) gmm(L.x1, lag(1 1) diff model(level)) gmm(L.x2, lag(0 0) diff model(level)) gmm(L.x3, lag(0 0) diff model(level)) gmm(L.x4, lag(0 0) diff model(level)) gmm(L.x5, lag(0 0) diff model(level)) gmm(L.x6, lag(0 0) diff model(level)) gmm(L.x7, lag(0 0) diff model(level)) gmm(L.x8, lag(0 0) diff model(level)) gmm(L.x9, lag(0 0) diff model(level)) gmm(x10, lag(0 0) diff model(level)) gmm(cf*L.x1, lag(1 1) diff model(level)) iv(i.ind, model(level)) iv(mn, model(level)) iv(cf, model(level)) two small vce(r) overid

                        B) In this code, I specified ‘model(fod)’ as a separate option in the xtdpdgmm command line, I instrument all the variables (except the dummies) for the differenced model and all the variables (including the dummies) for the level model, and I put ‘model(md)’ in the iv() option for the dummies, as follows.

                        xtdpdgmm L(0/1).y L(0/1).x1 L.x2 L.x3 L.x4 L.x5 L.x6 L.x7 L.x8 L.x9 x10 i.ind mn cf cf*L.x1, model(fod) collapse gmm(y, lag(1 3)) gmm(L.x1, lag(1 3)) gmm(L.x2, lag(0 2)) gmm(L.x3, lag(0 2)) gmm(L.x4, lag(0 2)) gmm(L.x5, lag(0 2)) gmm(L.x6, lag(0 2)) gmm(L.x7, lag(0 2)) gmm(L.x8, lag(0 2)) gmm(L.x9, lag(0 2)) gmm(x10, lag(0 2)) gmm(x10, lag(0 0) model(md)) gmm(cf*L.x1, lag(1 3)) gmm(y, lag(1 1) diff model(level)) gmm(L.x1, lag(1 1) diff model(level)) gmm(L.x2, lag(0 0) diff model(level)) gmm(L.x3, lag(0 0) diff model(level)) gmm(L.x4, lag(0 0) diff model(level)) gmm(L.x5, lag(0 0) diff model(level)) gmm(L.x6, lag(0 0) diff model(level)) gmm(L.x7, lag(0 0) diff model(level)) gmm(L.x8, lag(0 0) diff model(level)) gmm(L.x9, lag(0 0) diff model(level)) gmm(x10, lag(0 0) diff model(level)) gmm(cf*L.x1, lag(1 1) diff model(level)) iv(i.ind, model(md)) iv(mn, model(md)) iv(cf, model(md)) two small vce(r) overid

                        Where:
                        y is the dependent variable;
                        L.y is the lagged dependent variable as a regressor (L.y is predetermined);
                        L.x1 is the independent variable (L.x1 is endogenous);
                        The control variables L.x2, L.x3, L.x4, L.x5, L.x6, L.x7, L.x8, L.x9 are predetermined;
                        The control variable x10 (firm age) is exogenous;
                        ind is industry dummies;
                        mn is country dummies;
                        cf is a dummy variable that takes the value of 1 for the 3 years 2008, 2009, and 2010;
                        cf*L.x1 is an interaction between the dummy variable cf and the independent variable L.x1.

                        1.9) If the previous codes are incorrect for applying the System GMM estimator, what do I have to add, delete, or amend in them to apply the System GMM estimator using your xtdpdgmm command?

                        Are there other codes which are more appropriate to apply the System GMM estimator using your xtdpdgmm command?

                        Question 2) Regarding post #514 “Here, it seems that the only instruments specified for the level model are the time dummies. In this case, the difference-in-Hansen test for them is not meaningful. We cannot leave out the instruments for the time dummies. Because the time dummies are the only instruments for the level model, the Blundell-Bond assumption does not apply here.”.

                        Sorry, I did not get what you mean by that. I kindly ask you please to explain what you mean.

                        Question 3) Regarding post #520 point 1.1) “You would need to be more specific about which of these difference-in-Hansen tests you are referring to. Each row in the table is a separate test. For example, on page 96, the test in the row labelled "5, model(level)" is not very useful, because removing the instruments for the time dummies (while keeping the time dummies as regressors) does not make much sense and may result in serious weak-instruments problems or even underidentification.”.

                        Sorry for not being specific there. In the difference-in-Hansen tables on pages 96, 109, 113, and 123, the relevant rows are labelled “5, model(level)”, “8, model(level)”, “8, model(level)”, and “8, model(level)”, respectively. The labels of those rows refer to the time dummies (according to the list of instruments below the regression output). Thus, does that mean the difference-in-Hansen test on pages 96, 109, 113, and 123 is not meaningful? Does that mean the Blundell-Bond assumption does not apply in those tables?

                        Your input is so valuable. The work you do is very important and so appreciated. I am very grateful to you for all your patience, help and effort, Professor!



                          • 1.1-1.7) I cannot/should not answer those questions without seeing your command line, the instruments used, and the regression output. Ultimately, a holistic approach should be used.

                            1.8-1.9) Assuming that ind and mn are time-invariant, the respective instruments for model(md) will be dropped. In that regard, specification A is more reasonable. Other than that, the specifications can be valid.

                          2) The difference-in-Hansen test can be seen as a test comparing two estimators, one with the instruments under investigation and one without them. You would never consider an estimator without instruments for the time dummies, when such time dummies are included as regressors, because then necessary instruments are missing, which leads to weak identification/underidentification of the model. Therefore, the comparison estimator is rubbish and the test not meaningful. In other words, there is no point testing the validity of time dummy instruments. These are always valid.

                          3) The test in those rows for the time dummies should always be ignored. It is not meaningful; see 2).

                            Maybe I should clarify one aspect: when testing the Blundell-Bond assumption, we need to compare two estimators that both use instruments for the time dummies. We should therefore, for the purpose of this test, use the time dummy instruments in the transformed model, not the level model! Once we have done the test, we can then possibly change the specification by including the time dummy instruments for the level model instead.
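
                            For illustration only, a minimal sketch of this two-step procedure (not from the original posts); y, a predetermined regressor x, explicit year dummies i.year, and the lag ranges are hypothetical.

                            * Step 1: for the difference-in-Hansen test of the extra level-model instruments,
                            * keep the time dummies' instruments in the transformed (FOD) model.
                            xtdpdgmm L(0/1).y x i.year, model(fod) collapse two vce(robust) overid ///
                                gmm(y, lag(1 3)) gmm(x, lag(0 2)) iv(i.year, model(fod)) ///
                                gmm(y, lag(1 1) diff model(level)) gmm(x, lag(0 0) diff model(level))
                            * Step 2: if the test supports the level-model instruments, the time dummies'
                            * instruments can then be moved to the level model in the final specification.
                            xtdpdgmm L(0/1).y x i.year, model(fod) collapse two vce(robust) overid ///
                                gmm(y, lag(1 3)) gmm(x, lag(0 2)) iv(i.year, model(level)) ///
                                gmm(y, lag(1 1) diff model(level)) gmm(x, lag(0 0) diff model(level))
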


                            Please understand that I won't be able to continue my detailed answers at this frequency here on Statalist, due to other obligations. Such detailed help would normally require a (paid) consultancy agreement.
                          https://twitter.com/Kripfganz



                            • Originally posted by Sebastian Kripfganz:
                            1. Yes.
                            2. 2SLS is generally inefficient when using panel data. In any case, "2SLS" is not very informative; you would need to be clear about the instruments you are using. It then becomes a question of whether the instruments used in your sys-GMM estimator are beneficial compared to the instruments used in your 2SLS approach. In the first place, you would need to check whether they might require different assumptions for validity.
                              3. Reporting serial correlation tests is generally still useful even in static models. For one, they might tell you whether a dynamic model could be reasonable (to account for any serial correlation detected in the static model). If you have predetermined/endogenous variables, serial correlation can still invalidate the instruments in a static model.
                            Dear Prof. Sebastian Kripfganz
                            When we use sys-GMM with a static model, how should I justify the selection of this estimator? Do I need to do any further steps or compare it with other estimators?
                            Or should I use it as a robustness check?
                              I am just confused about how I should justify the use of sys-GMM to estimate a static model.



                            • The first justification would be that you have endogenous regressors but no suitable external instruments; therefore, you are using internal instruments (lagged transformed regressors).

                              Compared to the difference-GMM estimator, the justification for the system-GMM estimator would still be that it is more efficient because of the extra instruments it uses. The validity of these extra instruments of course needs to be justified, typically with a difference-in-Hansen test comparing the system-GMM to the difference-GMM estimator. In a nutshell, the arguments are very similar to those for a dynamic model.

                              The static model is a special case of the dynamic model without the lagged dependent variable. What is good for the more general dynamic model cannot be bad for the restricted static model.
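
                                For illustration only, a minimal sketch of such a comparison for a static model (not from the original posts); y, an endogenous x1, an exogenous x2, and the lag ranges are hypothetical.

                                * Difference GMM only (no level-model instruments, hence nocons):
                                xtdpdgmm y x1 x2 i.year, model(diff) nocons collapse two vce(robust) overid ///
                                    gmm(x1, lag(2 4)) iv(x2) iv(i.year, model(diff))
                                estimates store dgmm
                                * System GMM: adds lagged differences as level-model instruments.
                                xtdpdgmm y x1 x2 i.year, model(diff) collapse two vce(robust) overid ///
                                    gmm(x1, lag(2 4)) iv(x2) iv(i.year, model(diff)) ///
                                    gmm(x1, lag(1 1) diff model(level)) iv(x2, diff model(level))
                                estimates store sgmm
                                * The model(level) rows of the second estimation's difference-in-Hansen output
                                * test the validity of the extra system-GMM instruments.
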
                              https://twitter.com/Kripfganz



                              • Dear Sebastian Kripfganz,

                                  I want to take cross-sectional dependence and endogeneity problems into account simultaneously. Is it possible to incorporate Driscoll-Kraay standard errors into xtdpdgmm or xtabond2?

                                Thanks in advance.

