Running GMM on growth (already first-differenced) variables

Cati Sacerdote

04 Apr 2022, 01:50

Good morning all,

I am estimating a dynamic panel model of the relationship between terms-of-trade growth and real GDP growth (with additional controls and country and year fixed effects), using N=120 and T=40. The simple model I'm looking to run regresses real GDP growth on its own lag and on terms-of-trade growth.
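
Written out from the regression commands further down, the growth specification would be roughly the following. The notation is assumed for illustration: $\alpha_i$ and $\delta_t$ stand for the country and year fixed effects, and the additional controls are omitted.

```latex
\Delta \ln \mathrm{GDP}_{it}
  = \rho \, \Delta \ln \mathrm{GDP}_{i,t-1}
  + \beta \, \Delta \ln \mathrm{TOT}_{it}
  + \alpha_i + \delta_t + \varepsilon_{it}
```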

My main question is whether I should run difference GMM on the already-differenced variables or on the levels (which gives strange results). I would also welcome any other insights on the models.

The simplest approach would be OLS with fixed effects, since T is large. Running the specification in log levels

Code:

reg lncgdp L.lncgdp lntotx i.country i.year, r

gives a very high coefficient on the lagged dependent variable (0.98) and an R-squared of 0.9993, which suggests that the model is probably misspecified and that I should be using log differences instead, which is closer to what I am interested in anyway. See below (I have removed the fixed-effect coefficients from the output for ease of reading).

Code:

reg lncgdp L.lncgdp lntotx i.country i.year, r

Linear regression                               Number of obs     =      2,074
                                                F(92, 1981)       =   56427.43
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9993
                                                Root MSE          =     .04408

-------------------------------------------------------------------------------------------
                          |               Robust
                   lncgdp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------------------+----------------------------------------------------------------
                   lncgdp |
                      L1. |   .9829976   .0056776   173.14   0.000      .971863    .9941322
                          |
                   lntotx |  -.0001803   .0045674    -0.04   0.969    -.0091378    .0087772

If I use the first differences of GDP and TOT, this would become:

Code:

reg D.lncgdp L.D.lncgdp D.lntotx i.country i.year, r

However, I do not think I can rely on OLS with FE for the log-difference specification, as the lagged dependent variable will be correlated with the error term by construction. For reference, this is what it gives:

Code:

reg dcgdp L.dcgdp dtotx i.country i.year, r

Linear regression                               Number of obs     =      2,027
                                                F(91, 1935)       =       6.57
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1893
                                                Root MSE          =     .04268

-------------------------------------------------------------------------------------------
                          |               Robust
                    dcgdp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------------------+----------------------------------------------------------------
                    dcgdp |
                      L1. |   .2230587   .0397349     5.61   0.000     .1451309    .3009865
                          |
                     dtotx |   .0347277    .043905     0.79   0.429    -.0513784    .1208338
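
For reference, a minimal sketch of how dcgdp and dtotx could be built. It assumes that they are simply the log differences of lncgdp and lntotx, and that the panel is declared on country and year (as the xtabond2 output further down suggests); neither assumption is confirmed anywhere in the output.

```stata
* Sketch only: assumes dcgdp and dtotx are plain log differences.
* Skip the generate lines if the variables already exist.
xtset country year

generate dcgdp = D.lncgdp    // growth of real GDP (log difference)
generate dtotx = D.lntotx    // growth of terms of trade (log difference)

* Under that assumption, these two commands fit the same regression
reg D.lncgdp L.D.lncgdp D.lntotx i.country i.year, r
reg dcgdp L.dcgdp dtotx i.country i.year, r
```

Using the time-series operators directly avoids differencing across gaps in the panel by hand, since D. and L. return missing values wherever the previous year is absent.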

I would like to use Anderson-Hsiao IV (with L2.lncgdp as an instrument) or difference GMM in this case, but I am not sure what the right way to implement difference GMM in Stata is. Should I run it with D.lncgdp and D.lntotx, or with lncgdp and lntotx, given that the GMM estimator takes first differences itself? How would my coefficients be interpreted in each case? Running it with lncgdp and lntotx still gives a very high coefficient on the lagged dependent variable, which worries me, and what I am actually looking for is the relationship between GDP growth and TOT growth. Using D.lncgdp and D.lntotx gives more plausible-looking estimates.
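
The algebra behind the two options can be sketched as follows (notation assumed for illustration, additional controls omitted). Difference GMM with noleveleq first-differences whatever equation it is given, so the two choices correspond to two different underlying models:

```latex
\begin{align*}
\text{(A) levels supplied:}\quad
  & \ln \mathrm{GDP}_{it} = \phi \ln \mathrm{GDP}_{i,t-1} + \psi \ln \mathrm{TOT}_{it}
    + \eta_i + \tau_t + v_{it} \\
\Rightarrow\quad
  & \Delta \ln \mathrm{GDP}_{it} = \phi\,\Delta \ln \mathrm{GDP}_{i,t-1}
    + \psi\,\Delta \ln \mathrm{TOT}_{it} + \Delta\tau_t + \Delta v_{it} \\[4pt]
\text{(B) growth rates supplied:}\quad
  & \Delta \ln \mathrm{GDP}_{it} = \rho\,\Delta \ln \mathrm{GDP}_{i,t-1}
    + \beta\,\Delta \ln \mathrm{TOT}_{it} + \alpha_i + \delta_t + \varepsilon_{it} \\
\Rightarrow\quad
  & \Delta^2 \ln \mathrm{GDP}_{it} = \rho\,\Delta^2 \ln \mathrm{GDP}_{i,t-1}
    + \beta\,\Delta^2 \ln \mathrm{TOT}_{it} + \Delta\delta_t + \Delta\varepsilon_{it}
\end{align*}
```

In (A) the estimated $\phi$ is still the persistence of the log level, and the differenced equation contains no country effect. In (B) $\rho$ is the persistence of growth, the country effect $\alpha_i$ shifts average growth, and the estimator works with second differences of the logs.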

This is what difference GMM on the levels gives me:

Code:

xtabond2 lncgdp L.lncgdp lntotx i.country i.year, gmm(L.lncgdp) noleveleq r
Warning: Number of instruments may be large relative to number of observations.
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate robust weighting matrix for Hansen test.
  Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, one-step difference GMM
------------------------------------------------------------------------------
Group variable: country                         Number of obs      =      2027
Time variable : year                            Number of groups   =        47
Number of instruments = 990                     Obs per group: min =        32
Wald chi2(0)  =         .                                      avg =     43.13
Prob > chi2   =         .                                      max =        44
------------------------------------------------------------------------------
             |               Robust
      lncgdp | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      lncgdp |
         L1. |   .9499016   .0133067    71.39   0.000      .923821    .9759823
             |
      lntotx |  -.0150696   .0099702    -1.51   0.131    -.0346108    .0044716
------------------------------------------------------------------------------
Instruments for first differences equation
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/45).L.lncgdp
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -4.39  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =  -1.30  Pr > z =  0.194
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(944)  =1370.11  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(944)  =   2.61  Prob > chi2 =  1.000
  (Robust, but weakened by many instruments.)
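
Given the warning about the instrument count (990 instruments for 47 groups), one variant to consider is the same command with a restricted, collapsed instrument set. This is a sketch only: the lag range 1 to 3 is an arbitrary choice for illustration, not a recommendation, and the rest of the command is unchanged from the one above.

```stata
* Sketch only: same difference-GMM command, fewer instruments.
* laglimits(1 3) keeps lags 1-3 of L.lncgdp (arbitrary illustrative range);
* collapse uses one instrument per lag instead of one per lag and period.
* The /// line continuations require running this from a do-file.
xtabond2 lncgdp L.lncgdp lntotx i.country i.year, ///
    gmmstyle(L.lncgdp, laglimits(1 3) collapse)   ///
    noleveleq robust
```

With far fewer instruments, the Hansen test should be less weakened than the p-value of 1.000 reported above.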

Running GMM on the already-differenced variables, with lags of the level as instruments, gives a coefficient on the lagged variable that is much closer to the OLS one:

Code:

xtabond2 dcgdp L.dcgdp dtotx i.country i.year, gmm(L.lncgdp) noleveleq r

Dynamic panel-data estimation, one-step difference GMM
------------------------------------------------------------------------------
Group variable: country                         Number of obs      =      1980
Time variable : year                            Number of groups   =        47
Number of instruments = 989                     Obs per group: min =        31
Wald chi2(0)  =         .                                      avg =     42.13
Prob > chi2   =         .                                      max =        43
------------------------------------------------------------------------------
             |               Robust
       dcgdp | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       dcgdp |
         L1. |   .2182032   .0443686     4.92   0.000     .1312423    .3051641
             |
       dtotx |  -.0018409   .0144273    -0.13   0.898    -.0301179    .0264362
------------------------------------------------------------------------------
Instruments for first differences equation
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/45).L.lncgdp
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -4.98  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   0.75  Pr > z =  0.456
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(944)  =1028.39  Prob > chi2 =  0.029
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(944)  =   2.90  Prob > chi2 =  1.000
  (Robust, but weakened by many instruments.)

Using Anderson-Hsiao-style IV gives an implausibly high coefficient on the lagged variable (1.44, which would imply that a 1 percentage point increase in GDP growth last year is associated with a 1.44 percentage point increase in GDP growth this year), and the centred R2 is -1.13.

Code:

ivreg2 dcgdp dtotx i.country i.year (L.dcgdp = L2.lncgdp), r

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =     2027
                                                      F( 91,  1935) =     2.82
                                                      Prob > F      =   0.0000
Total (centered) SS     =   4.34847292                Centered R2   =  -1.1282
Total (uncentered) SS   =  6.941141002                Uncentered R2 =  -0.3333
Residual SS             =   9.25437924                Root MSE      =   .06757

-------------------------------------------------------------------------------------------
                          |               Robust
                    dcgdp | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
--------------------------+----------------------------------------------------------------
                    dcgdp |
                      L1. |   1.444518   .4410191     3.28   0.001     .5801359    2.308899
                          |
                    dtotx |   .0299543   .0171802     1.74   0.081    -.0037182    .0636268
                          |
Underidentification test (Kleibergen-Paap rk LM statistic):             10.688
                                                   Chi-sq(1) P-val =    0.0011
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               15.366
                         (Kleibergen-Paap rk Wald F statistic):         10.561
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.000
                                                 (equation exactly identified)
------------------------------------------------------------------------------
Instrumented:         L.dcgdp

Any thoughts?

Thank you!
