Hi everyone,
I'm trying to estimate the effect of a particular kind of public spending on the growth rate of real GDP per capita. I'm using the xtdpdgmm command to run two-step Diff-GMM and two-step Sys-GMM, but I'm not sure I fully understand the syntax.
Let me first explain how I've arranged the dataset; otherwise the syntax below won't be clear. I have a balanced panel of 35 countries (unfortunately, N is not particularly large) over 25 years, and I use five-year non-overlapping averages, which gives 5 observations per country for the periods 1990-1994, 1995-1999, ..., 2010-2014.
The equation to estimate is:
y_{i,t} - y_{i,t-x} = (β_1 - 1) y_{i,t-x} + β_2 h_{i,t-x} + β_3 x_{i,t} + α_i + δ_t + u_{i,t}
where t = 1994, 1999, 2004, 2009, 2014 and x = 5, except for the first period, where x = 4. Further, h_{i,t} is assumed to be predetermined, and x_{i,t} is the mean of x_i from year t-x+1 to year t. For instance, at t = 1999, x_{i,1999} denotes the average of x_i over the years 1995 to 1999.
For each country i, my dataset in Stata has 5 rows (with no missing values), and the first row for country i has the following columns:
i) the dependent variable, y_{i,1994} - y_{i,1990}, is named gdp_growth;
ii) the AR part, y_{i,1990}, is named gdp_lag;
iii) the predetermined variable, h_{i,1990}, is named school_lag;
iv) the control, x_{i,1994}, is named fiscal.
Finally, I create year dummies (years*).
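For reference, a minimal sketch of one way such a panel could be declared and the dummies generated. This is only an illustration under stated assumptions, not necessarily how the dataset above was built: the identifiers country and year (with year holding 1994, 1999, ..., 2014) are assumed names, and period is a hypothetical helper variable.

```stata
* Assumed identifiers: country (numeric id) and year (1994, 1999, ..., 2014)
* Build a consecutive period index 1,...,5 so that lag operators
* move one five-year period at a time rather than one calendar year
egen period = group(year)
xtset country period

* One dummy per period: years1, ..., years5 (matched by the wildcard years*)
tabulate period, generate(years)
```

With a consecutive index, lag(1 2) in a gmm() option refers to the previous two five-year periods. Declaring the panel on the calendar year instead would need xtset's delta(5) option to get the same behaviour.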
For the two-step Diff-GMM (collapsing the instruments), I run:
xtdpdgmm gdp_growth gdp_lag school_lag fiscal years*, model(diff) collapse gmm(fiscal, lag(2 3)) gmm(gdp_lag, lag(1 2)) gmm(school_lag years*, lag(0 0)) nocons two vce(r) nolog
where:
i) the first difference of fiscal (Δx_{i,t}) is instrumented by lags 2 and 3 of the level of x_{i,t};
ii) gdp_lag (y_{i,t-1}) is endogenous, and in the first-differenced equation it is instrumented by its lag 1 and lag 2 levels (which correspond to lags 2 and 3 of the level of y_{i,t});
iii) school_lag is predetermined and, since it enters the equation at lag 1, it is effectively exogenous there; in the first-differenced equation, the first difference of school_lag is therefore instrumented by itself;
iv) for the time dummies, I guess this is how they should be added.
In this case, only one year dummy is dropped (I was expecting three year dummies to be dropped).
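To make the intended instrument choices concrete, the first-differenced equation and the moment conditions that the gmm() options are meant to encode can be sketched as follows. The period index s = 1, ..., 5 is notation introduced only for this illustration (it is not in the equation above): y_{i,s} is the value at the end of period s, y_{i,0} is the 1990 value, and g_{i,s} = y_{i,s} - y_{i,s-1} is gdp_growth. This is a reading of the specification as described, not a claim about what xtdpdgmm does internally.

```latex
% First-differenced equation (alpha_i drops out), available for s = 2,...,5
\Delta g_{i,s} = (\beta_1 - 1)\,\Delta y_{i,s-1} + \beta_2\,\Delta h_{i,s-1}
               + \beta_3\,\Delta x_{i,s} + \Delta\delta_s + \Delta u_{i,s}

% Intended moment conditions, with \Delta u_{i,s} = u_{i,s} - u_{i,s-1}
E\left[ x_{i,s-j}\,\Delta u_{i,s} \right] = 0, \quad j = 2, 3
    % gmm(fiscal, lag(2 3))
E\left[ y_{i,s-1-j}\,\Delta u_{i,s} \right] = 0, \quad j = 1, 2
    % gmm(gdp_lag, lag(1 2)), since gdp_lag in row s is y_{i,s-1}
E\left[ h_{i,s-1}\,\Delta u_{i,s} \right] = 0
    % gmm(school_lag, lag(0 0)), since school_lag in row s is h_{i,s-1}
```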
For the two-step Sys-GMM (again collapsing the instruments), I run:
xtdpdgmm gdp_growth gdp_lag school_lag fiscal years*, model(diff) collapse gmm(fiscal, lag(2 3)) gmm(gdp_lag, lag(1 2)) gmm(school_lag years*, lag(0 0)) gmm(fiscal, lag(1 1) diff model(level)) gmm(gdp_lag school_lag years*, lag(0 0) diff model(level)) nocons two vce(r)
where:
i) the once-lagged first difference of fiscal (Δx_{i,t-1}) is used as an instrument for x_{i,t} in the level equation;
ii) the first differences of gdp_lag and school_lag (that is, Δy_{i,t-1} and Δh_{i,t-1}) are used as instruments for y_{i,t-1} and h_{i,t-1}, respectively;
iii) for the time dummies, I guess this is how they should be added.
However, for the Sys-GMM estimator, none of the year dummies is dropped.
I suspect there is something wrong in my code, and perhaps the way I've arranged the dataset is also problematic.
Thanks so much for your help in advance!