Dear all,
I have been trying to estimate an employment equation with a system GMM estimator, using data from the World Input-Output Database, that is, data at the industry level.
The equation follows the common form:
Number of employees = constant + L1.Num of employees + L1.wage + L1.Capital + L1.ValueAdded + error term
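Written out more formally, this is a sketch of the specification, assuming the logged variables and the year dummies that appear in the command below. The symbols n, w, k, y and \tau_t are shorthand introduced here for ln_EMPE, ln_P_L_EMP (taken to be the wage measure), ln_K, ln_VA and the year effects; u_{it} is the error term.

```latex
n_{it} = \alpha + \rho\, n_{i,t-1} + \beta_1 w_{i,t-1} + \beta_2 k_{i,t-1} + \beta_3 y_{i,t-1} + \tau_t + u_{it}
```

Here i indexes the country-industry groups and t the years.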
I've tried to familiarize myself with GMM using the -webuse abdata- dataset, which allows for the estimation of an equation similar to the one above and is also used in Roodman (2009). Since I wanted to estimate an employment equation, I thought that this approach would also work with different data. However, after running the regression, the Hansen test of overidentifying restrictions indicates that the instruments are not valid (see code below). I've run different variations of the model (changes in lags or in the endogenous variables), yet the Hansen test always rejects the validity of my instruments. So I am confused why the typical example for GMM (employment data) does not seem to work with industry-level data.
Can someone offer an explanation?
Thanks! Thomas
Code:
xtabond2 ln_EMPE l.ln_EMPE l.ln_P_L_EMP l.ln_K l.ln_VA i.year , gmm( l.ln_EMPE l.ln_P_L_EMP l.ln_K l.ln_VA ,lag(3 5)) iv(i.year, equation(level)) twostep robust artest(3)
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
  Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: ctry_indus~y                    Number of obs      =     30700
Time variable : year                            Number of groups   =      2196
Number of instruments = 178                     Obs per group: min =         3
Wald chi2(19) = 215606.58                                      avg =     13.98
Prob > chi2   =     0.000                                      max =        14
------------------------------------------------------------------------------
             |              Corrected
     ln_EMPE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ln_EMPE |
         L1. |   .9823159   .0147167    66.75   0.000     .9534718     1.01116
             |
  ln_P_L_EMP |
         L1. |  -.0401773   .0115665    -3.47   0.001    -.0628471   -.0175074
             |
        ln_K |
         L1. |   .0000743   .0080639     0.01   0.993    -.0157306    .0158793
             |
       ln_VA |
         L1. |   .0170818   .0139796     1.22   0.222    -.0103178    .0444814
             |
        year |
       2000  |          0  (empty)
       2001  |          0  (omitted)
       2002  |   .0002993    .003692     0.08   0.935    -.0069369    .0075354
       2003  |  -.0012104   .0030508    -0.40   0.692    -.0071899    .0047691
       2004  |   .0031081   .0038363     0.81   0.418    -.0044109     .010627
       2005  |   .0085499   .0047215     1.81   0.070     -.000704    .0178037
       2006  |   .0140131   .0045831     3.06   0.002     .0050305    .0229958
       2007  |   .0170023   .0046812     3.63   0.000     .0078273    .0261773
       2008  |    .006865   .0055576     1.24   0.217    -.0040278    .0177577
       2009  |  -.0336384   .0058694    -5.73   0.000    -.0451422   -.0221346
       2010  |  -.0129387    .004947    -2.62   0.009    -.0226347   -.0032426
       2011  |   .0077797   .0045261     1.72   0.086    -.0010913    .0166506
       2012  |  -.0006394   .0052275    -0.12   0.903    -.0108851    .0096062
       2013  |   .0009348   .0046395     0.20   0.840    -.0081585    .0100281
       2014  |   .0124441   .0049524     2.51   0.012     .0027377    .0221506
             |
       _cons |   .0714095   .0232882     3.07   0.002     .0257656    .1170535
------------------------------------------------------------------------------
Instruments for first differences equation
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(3/5).(L.ln_EMPE L.ln_P_L_EMP L.ln_K L.ln_VA)
Instruments for levels equation
  Standard
    2000b.year 2001.year 2002.year 2003.year 2004.year 2005.year 2006.year
    2007.year 2008.year 2009.year 2010.year 2011.year 2012.year 2013.year
    2014.year
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    DL2.(L.ln_EMPE L.ln_P_L_EMP L.ln_K L.ln_VA)
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -10.11  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   0.37  Pr > z =  0.711
Arellano-Bond test for AR(3) in first differences: z =   0.69  Pr > z =  0.488
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(158)  = 833.81  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(158)  = 345.94  Prob > chi2 =  0.000
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(114)  = 228.93  Prob > chi2 =  0.000
    Difference (null H = exogenous): chi2(44)   = 117.01  Prob > chi2 =  0.000
  iv(2000b.year 2001.year 2002.year 2003.year 2004.year 2005.year 2006.year 2007.year 2008.year 2009.year 2010.year 2011.year 2012.year 2013.year 2014.year, eq(level))
    Hansen test excluding group:     chi2(145)  = 309.69  Prob > chi2 =  0.000
    Difference (null H = exogenous): chi2(13)   =  36.24  Prob > chi2 =  0.001