I am unable to replicate this error message. I do not see how the option teffects can cause such a problem. Would it be possible for you to send me your data set by e-mail, so that I can investigate further?
Dear Prof. Sebastian Kripfganz,
I have grouped my variables into endogenous and predetermined variables. However, I have been asked to test for the exogeneity and predeterminedness of these variables (in other words, to provide evidence that my variables are correctly specified and grouped). Could you please explain how we can test with xtdpdgmm whether the grouping of variables is correct? We would highly appreciate it if you could provide an example with code.
Dear Prof. Sebastian Kripfganz,
I am estimating a two-step system GMM model with l.Y X2 X3 X4 as my endogenous variables. The results of this model are in line with the literature. However, when I estimate a two-step difference GMM model, the results are no longer in line with the literature, especially for the lagged dependent variable. Could you please help me figure out the problem?
My sample ranges from 2005 to 2017.
Arellano-Bond test for autocorrelation of the first-differenced residuals
H0: no autocorrelation of order 1: z = -3.7914 Prob > |z| = 0.0001
H0: no autocorrelation of order 2: z = 1.6378 Prob > |z| = 0.1015
H0: no autocorrelation of order 3: z = 0.9998 Prob > |z| = 0.3174
Difference GMM model:
Arellano-Bond test for autocorrelation of the first-differenced residuals
H0: no autocorrelation of order 1: z = -2.2317 Prob > |z| = 0.0256
H0: no autocorrelation of order 2: z = -0.8144 Prob > |z| = 0.4154
H0: no autocorrelation of order 3: z = 1.2586 Prob > |z| = 0.2082
In my 2019 London Stata Conference presentation, I demonstrate starting on slide 90 how to empirically classify the variables as endogenous, predetermined, or strictly exogenous. You would begin with classifying all variables as endogenous. Then you can add further instruments which are valid if the variables are predetermined. Incremental overidentification tests can be used to check whether those additional instruments are valid, and therefore if the variables are indeed predetermined. Later, you can do the same for strict exogeneity, by adding the respective additional instruments and checking the incremental overidentification tests again.
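A minimal sketch of this procedure, not taken from the poster's model: it assumes Stata's abdata example data set, and the variables n, w, k and the lag ranges are purely illustrative. The lines are meant to be run from a do-file, because of the /// line continuations.

```stata
* Illustrative example data (an assumption; replace with your own panel)
webuse abdata, clear

* Baseline: w and k are treated as endogenous, so lags 2-4 serve as
* instruments for the first-differenced model. The lag-1 instruments are
* added as separate gmm() sets; they are valid only if w and k are
* predetermined.
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w, lag(2 4)) gmm(k, lag(2 4))                        ///
    gmm(w, lag(1 1)) gmm(k, lag(1 1))                        ///
    two vce(r) overid

* Incremental overidentification tests for each separately specified
* instrument set
estat overid, difference
```

If the incremental tests do not reject the lag-1 instruments, the same step can be repeated for strict exogeneity by adding, for example, gmm(w, lag(0 0)) and gmm(k, lag(0 0)) as further separate instrument sets.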
In your example with the system and difference GMM estimators, I would recommend using the same lag orders for both. Using lag(1 3) for the system GMM estimator but lag(1 5) for the difference GMM estimator gives a reader of your analysis the impression that you are cherry-picking results. Furthermore, it is highly unusual not to include a regression constant when using the system GMM estimator. This might explain the observed differences. Aside from that, the difference GMM estimator might suffer from an identification problem if the true value (not the estimated value) of the lagged dependent variable's coefficient is close to 1. You could try adding the nl(noserial) option to the difference GMM estimator. This gives you the Ahn-Schmidt estimator, which could help to avoid this identification problem of the difference GMM estimator.
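As a hedged illustration of the nl(noserial) suggestion, the following sketch again assumes the abdata example data set with illustrative variables and lag ranges; it is not the poster's specification.

```stata
* Illustrative example data (an assumption; replace with your own panel)
webuse abdata, clear

* Conventional two-step difference GMM
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) two vce(r)

* Same instruments plus the nonlinear moment conditions that are valid
* under the absence of serial correlation (Ahn-Schmidt estimator)
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) nl(noserial) two vce(r)
```

Comparing the coefficient of the lagged dependent variable across the two runs gives a first indication of whether the additional moment conditions matter for identification.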
Dear Prof. Sebastian Kripfganz,
Thank you very much for your reply. I have two other questions.
1. Could you please clarify the difference between the following two commands?
1.1. xtdpdgmm L(0/1).ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr ln_fd_1 hc ln_Total_2 , model(diff) collapse gmm(l.ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 3)) gmm( ln_fd_1 hc ln_Total_2 , lag(1 3)) gmm(ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 1) diff model(level)) gmm( ln_fd_1 hc ln_Total_2 , lag(0 0) diff model (level)) two vce(r) overid noconstant
1.2. xtdpdgmm L(0/1).ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr ln_fd_1 hc ln_Total_2, collapse gmm (l.ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 3) model(diff)) gmm(ln_fd_1 hc ln_Total_2, lag(1 3) model(diff)) gmm(l.ln_energy_total_cap_mwh ln_gdpc ln_industry_gdp ln_real_oilpr, lag(1 1) diff) gmm(ln_fd_1 hc ln_Total_2, lag(0 0) diff) two vce(r)
2. My sample size is 332 observations, covering the period from 2005 to 2017. When I estimate the two-step GMM model, it passes the Hansen test. However, I get the following result for the Hansen test from the one-step GMM model (i.e., the system GMM). I do not understand why I get this result with the one-step model. Would you please guide me on this?
Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid
1. As far as I can tell from glancing over the two command lines, the only difference seems to be that the first command has the additional options overid noconstant. Thus, it calculates additional incremental overidentification statistics that can be displayed with the postestimation command estat overid, difference. It also constrains the constant to be zero.
2. The Sargan-Hansen tests after the one-step system GMM estimation are based on a weighting matrix which is not optimal. Those tests are asymptotically invalid. With your very small number of groups, I would not expect too much from them anyway.
Normally, you would start with the difference GMM estimator and then use the system GMM estimator as a robustness check, or to improve on the estimates if you suspect identification problems with the difference GMM estimator.
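The workflow described above could be sketched as follows. This assumes the abdata example data set with illustrative variables and lag ranges. Both estimators use the same lag orders for the first-differenced model, and the system GMM run keeps the regression constant.

```stata
* Illustrative example data (an assumption; replace with your own panel)
webuse abdata, clear

* Step 1: two-step difference GMM
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3)) two vce(r) overid
estat serial, ar(1/3)
estat overid

* Step 2: two-step system GMM as a robustness check, with the same
* instruments for the first-differenced model plus differenced
* instruments for the level model
xtdpdgmm L(0/1).n w k, model(diff) collapse gmm(n, lag(2 4)) ///
    gmm(w k, lag(1 3))                                       ///
    gmm(n, lag(1 1) diff model(level))                       ///
    gmm(w k, lag(0 0) diff model(level))                     ///
    two vce(r) overid
estat serial, ar(1/3)
estat overid, difference
```

The incremental tests reported by estat overid, difference after the second command help to judge whether the additional level-model instruments are valid.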
Dear Prof. Sebastian Kripfganz,
1- I have included the first difference of a variable in the main equation of my regression. Can I include this variable in levels as an instrument in the first-differenced equation?
2- If we include the logarithmic transformation of a variable, can we include its level form as an instrument?
So both questions come down to whether we can use a different variable as an instrument, or whether we have to use the same variable as an instrument in the GMM estimation.
You can use as an instrument whatever variable you have. The only requirements are that they are sufficiently strongly correlated with the instrumented variable and that they are uncorrelated with the error term.
Dear Prof. Sebastian Kripfganz,
Thank you very much for your prompt reply.
This means that I can use any instrument as long as the AR(2) and Hansen test statistics are statistically insignificant. Am I right?
For many situations, xtdpdgmm and xtabond2 are equivalent. Both can compute the conventional difference-GMM and system-GMM estimators. For many people, the choice between the commands may be a matter of taste. The syntax of the commands can become quite complicated given all the available options. xtdpdgmm follows the approach that you get what you type, which requires the user to have a good understanding of the econometric methods. (It is encouraged in any case that you understand the econometric methods you are using and do not just rely on some default software settings.) With xtabond2, you might have to be a bit more careful - at least in my opinion - that you really get what you want when specifying the options.
xtdpdgmm has a few extensions which you do not find in xtabond2, such as nonlinear moment conditions, the flexibility to use the within-groups transformation (analogously to the traditional fixed-effects estimator), the iterated GMM estimator, and a few additional postestimation statistics.
I am obviously not impartial. You might just give both commands a try and see how comfortable you are with them. (Feel free to feed back your experience here.) An advantage of having both commands is that you can double check your results. If implemented correctly, both commands should give you the same results.
While I was doing some code improvements (which led to some speed improvement, especially when the command is called for the first time during a Stata session), I discovered a bug in the prediction of scores - postestimation command predict with option score - when xtdpdgmm computed robust standard errors with the Windmeijer correction. The WC-robust standard errors displayed in the regression output were correct, but the postestimation commands relying on the predicted scores - such as estat serial, estat hausman, suest, etc. - would produce slightly incorrect results. (The differences are not expected to be large.)
This bug has been corrected in the latest version 2.3.11, which is now available on my website:
Code:
net install xtdpdgmm, from(http://www.kripfganz.de/stata) replace
Note for replication purposes: This update might lead to slightly different results when trying to replicate previous results for the Arellano-Bond test and the generalized Hausman test.
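To check which version is installed before and after the update, Stata's built-in which command can be used; it displays the location of the ado-file together with the version information stored in its header comments:

```stata
* Show the path and version header of the installed xtdpdgmm ado-file
which xtdpdgmm
```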
I am regressing the log of GDP per capita (ln_GDPc) on renewable energy consumption (REN), using the two-step system GMM estimator for a sample of 26 countries over the period 2005 to 2017. The panel unit root analysis shows that ln_GDPc is stationary in first differences, whereas REN is stationary in levels. This leads to two cases:
- Case 1: If I regress ln_GDPc in level on REN: the coefficient of ln_GDPc is 0.3 and statistically significant.
- Case 2: If I regress ln_GDPc in the first difference (Δln_GDPc) on REN: the coefficient of Δln_GDPc becomes 0.9 and statistically significant.
In this context, are the two cases equivalent to each other? Should I be concerned about stationarity given the short time period I have?