

  • 1. The condition T>=2 refers to a model where the initial observation is observed for period 0, i.e. effectively you need at least 3 time periods when the first-differenced lagged dependent variable is instrumented with the second lag of the dependent variable in levels.

    2. Almost everything you can do with xtabond2, you can also do with xtdpdgmm. Instrumenting for endogenous variables with the latter command works in a very similar way. Please see the help file or my 2019 London Stata Conference presentation:
    3. With a binary dependent variable, you can still estimate a linear regression model. This is then labelled a linear probability model. Again, no difference between xtabond2 and xtdpdgmm here.
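
    As a minimal sketch of points 1 to 3, assuming a hypothetical panel with identifier id, time variable t, dependent variable y, and an endogenous regressor x (none of these names come from this thread), a difference-GMM specification with xtdpdgmm might look like this:

    ```stata
    * Hypothetical variable names: id, t, y, x
    xtset id t
    * Difference GMM: lags 2 to 4 of y and x (in levels) instrument the first-differenced equation
    xtdpdgmm L(0/1).y x, model(diff) gmm(y, lag(2 4) collapse) gmm(x, lag(2 4) collapse) twostep vce(robust)
    ```

    If y is binary, the same command estimates a linear probability model.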


    • Thank you so much for your reply.


      • Dear Sebastian,

        When working with a new computer and a new Stata installation (16.1), I get an error when running this code:
        estat mmsc model1 model2
        The error is:
        ngroups1 not found
        I usually use Stata on my own laptop, where the code works just fine.
        What do you think the problem is here?

        Thank you.


        • Thanks for flagging this bug. Unfortunately, I made a silly mistake in my last update.

          The updated version 2.3.3 that fixes this problem is now available on my website:
          adoupdate xtdpdgmm, update
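
          To confirm which version is installed after the update, a minimal check with Stata's standard which command (not specific to xtdpdgmm) could be:

          ```stata
          * Shows the path of the installed ado-file and its version header lines
          which xtdpdgmm
          ```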


          • Dear Sebastian,

            Thank you for the update.

            Now my problem is that xtabond2 and xtdpdgmm give different results. I compare the following commands:
            global var1 ="hf hfhf lgdp lgdp2 L(0/1).(hflgdp)"
            xtabond2 L(0/1).ls_fdi $var1 yr2004-yr2015, gmm(ls_fdi, lag(2 3) coll) gmm(hf, lag(2 5) coll) gmm(hfhf, lag(2 5) coll) gmm(hflgdp, lag(2 5) coll) gmm(lgdp, lag(2 5) coll) gmm(lgdp2, lag(2 5) coll) iv(yr2004-yr2015) artest(10) noleveleq twostep svmat robust
            xtdpdgmm L(0/1).ls_fdi $var1 yr2004-yr2015, gmm(ls_fdi, lag(2 3) coll) gmm(hf, lag(2 5) coll) gmm(hfhf, lag(2 5) coll) gmm(hflgdp, lag(2 5) coll) gmm(lgdp, lag(2 5) coll) gmm(lgdp2, lag(2 5) coll) iv(yr2004-yr2015) model(diff) twostep overid vce(robust)
            The estimates for the endogenous variables are the same (both coefficients and standard errors), but there are differences in (i) the estimates for the year dummies and (ii) the AR(1), AR(2), and higher-order tests. I think I can ignore the difference in (i), but the difference in (ii) is substantial.
            xtabond2 produces
            Arellano-Bond test for AR(1) in first differences: z =  -4.09  Pr > z =  0.000
            Arellano-Bond test for AR(2) in first differences: z =  -0.16  Pr > z =  0.871
            Arellano-Bond test for AR(3) in first differences: z =   0.49  Pr > z =  0.625
            Arellano-Bond test for AR(4) in first differences: z =   0.17  Pr > z =  0.868
            Arellano-Bond test for AR(5) in first differences: z =   0.09  Pr > z =  0.927
            Arellano-Bond test for AR(6) in first differences: z =   0.50  Pr > z =  0.616
            Arellano-Bond test for AR(7) in first differences: z =  -0.17  Pr > z =  0.862
            Arellano-Bond test for AR(8) in first differences: z =   0.09  Pr > z =  0.925
            Arellano-Bond test for AR(9) in first differences: z =  -0.31  Pr > z =  0.755
            Arellano-Bond test for AR(10) in first differences: z =   0.28  Pr > z =  0.776
            xtdpdgmm produces
            Arellano-Bond test for    autocorrelation    of the first-differenced residuals
            H0: no autocorrelation    of order 1:    z =   -0.0179   Prob > z  =    0.9857
            H0: no autocorrelation    of order 2:    z =   -0.0023   Prob > z  =    0.9982
            H0: no autocorrelation    of order 3:    z =    0.0046   Prob > z  =    0.9964
            H0: no autocorrelation    of order 4:    z =    0.0064   Prob > z  =    0.9949
            H0: no autocorrelation    of order 5:    z =    0.0011   Prob > z  =    0.9991
            H0: no autocorrelation    of order 6:    z =    0.0090   Prob > z  =    0.9928
            H0: no autocorrelation    of order 7:    z =         .   Prob > z  =         .
            H0: no autocorrelation    of order 8:    z =         .   Prob > z  =         .
            H0: no autocorrelation    of order 9:    z =         .   Prob > z  =         .
            H0: no autocorrelation    of order 10:    z =    0.0029   Prob > z  =    0.9977
            What causes the difference, and how can I get the same result using both xtabond2 and xtdpdgmm? The Hansen test results are the same in both commands, though.
            Last edited by Tiyo Ardiyono; 17 Mar 2021, 22:04.


            • (i) The differences in the coefficients of the year dummies are possibly due to the different instrumentation. Note that the iv() option with xtabond2 automatically first-differences the year dummy instruments, while the same option with xtdpdgmm would require the suboption diff to do the first-differencing of the instruments. The coefficients might also differ if xtabond2 drops a different year dummy due to collinearity than xtdpdgmm does.

              (ii) The differences in the year dummy coefficients could possibly contribute to the differences in the test results. More importantly, it is actually a feature of xtdpdgmm that it produces different AR test results after the two-step estimator with the Windmeijer correction. Let me quote myself from the opening post of this thread:
              Originally posted by Sebastian Kripfganz
              The results of the Arellano-Bond test differ slightly from xtdpd and xtabond2 for two-step robust estimators because I account for the finite-sample Windmeijer (2005) correction when computing the test statistic, while the existing commands do not.
              You should get the same test results with the one-step estimator, or with the two-step estimator but without the robust/vce(robust) options.
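
              A sketch of how points (i) and (ii) could be applied to the xtdpdgmm command from the previous post: the diff suboption in iv() first-differences the year-dummy instruments, and dropping vce(robust) removes the Windmeijer correction. The ar(1/10) option of estat serial is an assumption about how the ten test orders were requested.

              ```stata
              * Same model as in the previous post, but with first-differenced year-dummy instruments
              * and without vce(robust), so that no Windmeijer correction enters the AR tests
              xtdpdgmm L(0/1).ls_fdi $var1 yr2004-yr2015, ///
                  gmm(ls_fdi, lag(2 3) coll) gmm(hf, lag(2 5) coll) gmm(hfhf, lag(2 5) coll) ///
                  gmm(hflgdp, lag(2 5) coll) gmm(lgdp, lag(2 5) coll) gmm(lgdp2, lag(2 5) coll) ///
                  iv(yr2004-yr2015, diff) model(diff) twostep overid
              * Arellano-Bond tests for serial correlation of orders 1 to 10
              estat serial, ar(1/10)
              ```

              For a like-for-like comparison, the robust option would also have to be dropped from the xtabond2 command.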


              • Dear Sebastian,

                What is the best way to judge the p-value of the Hansen J-test of the overidentifying restrictions? I found a p-value of 0.15. How should I interpret this?

                Best regards,


                • If all overidentifying restrictions are indeed valid, in repeated random samples you would expect to see a more extreme value of the Hansen test in 15% of the cases, provided that the Hansen test is correctly sized.
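
                  For reference, the p-value is the upper-tail probability of a chi-squared distribution with degrees of freedom equal to the number of overidentifying restrictions. A minimal sketch, using the chi2(10) = 4.6870 statistic from the estat overid output posted further down in this thread (where Stata reports Prob > chi2 = 0.9111):

                  ```stata
                  * Test statistic and degrees of freedom, copied from the estat overid output in this thread
                  scalar J  = 4.6870
                  scalar df = 10
                  * Upper-tail chi-squared probability, i.e. the p-value of the overidentification test
                  display chi2tail(df, J)
                  ```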

                  The latter qualification can be an issue with dynamic panel data models, in particular if you have a relatively small sample size and many instruments, or if some of the instruments are weak. There is no consensus on what constitutes a good range of p-values that provides us with sufficient confidence in the correct model specification. See for example the following article:


                  • I see. This paper says that the perceived power of the test matters when judging a p-value of 0.15 for the Hansen test. If I revised my model to get a higher p-value, which range would provide at least a safe zone?


                    • I cannot say much beyond what Jan Kiviet writes in his paper, and there is no safe zone that I feel confident laying out here. Eventually, it remains a matter of judgment depending on the particular data and application.


                      • Dear sir,

                        I want to apply two-step system GMM to investigate the impact of ownership concentration on the CEO pay-performance relationship, using a balanced panel of 201 firms over 5 years. I used the command given below. The dependent variable is TC; the independent and control variables are ROEP T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P; ROET3 is the interaction variable; ID* are 5 industry dummy variables and YD* are 4 year dummy variables.

                        The results are not up to the mark: the p-values of the Hansen and Sargan tests are very high, AR(1) and AR(2) are both insignificant, and none of the coefficients is significant. I made some changes to the command, such as adding collapse to reduce the number of instruments and reclassifying variables from endogenous to exogenous, but none of them worked.

                        Please suggest what can be done to satisfy all the assumptions while retaining the significance of the coefficients.

                        xtdpdgmm TC L.TC ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P ID* YD*,twostep vce(cluster cid) gmmiv (L.TC, lag(0 0) collapse model (fodev)) gmmiv (ROEP, lag(0 1) collapse model (fodev)) gmmiv (ROET3, lag(0 1) collapse model (fodev)) gmmiv (T3, lag(0 1) collapse model (fodev)) gmmiv (LFSIZE, lag(0 1) collapse model (fodev)) gmmiv (LFAGE, lag(0 1) collapse model (fodev)) gmmiv (LEV, lag(0 1) collapse model (fodev)) gmmiv (RISK, lag(0 1) collapse model (fodev)) gmmiv (BSZE, lag(0 1) collapse model (fodev)) gmmiv (IND_P, lag(0 1) collapse model (fodev)) gmmiv (CEOD, lag(0 1) collapse model (fodev)) gmmiv (ID*, lag(0 0) collapse model (level)) gmmiv (YD*, lag(0 0) collapse model (level)) nofootnote
                        Generalized method of moments estimation
                        Fitting full model:
                        Step 1         f(b) =   380.3873
                        Step 2         f(b) =  .02331842
                        Group variable: cid                          Number of obs         =       804
                        Time variable: YEAR                          Number of groups      =       201
                        Moment conditions:     linear =      30      Obs per group:    min =         4
                                            nonlinear =       0                        avg =         4
                                                total =      30                        max =         4
                                                          (Std. Err. adjusted for 201 clusters in cid)
                                     |              WC-Robust
                                  TC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                                  TC |
                                 L1. |   .0704489   .2947275     0.24   0.811    -.5072064    .6481041
                                ROEP |  -2.161121   7.234564    -0.30   0.765    -16.34061    12.01836
                               ROET3 |   .0486878   .1103199     0.44   0.659    -.1675352    .2649108
                                  T3 |  -1.268114   3.028676    -0.42   0.675    -7.204211    4.667982
                              LFSIZE |   -17.6046   79.81283    -0.22   0.825    -174.0349    138.8257
                               LFAGE |   121.1339   126.2982     0.96   0.338     -126.406    368.6738
                                 LEV |  -10.47428   151.6317    -0.07   0.945    -307.6669    286.7183
                                RISK |  -25.74973   86.25241    -0.30   0.765    -194.8013    143.3019
                                CEOD |  -70.64974   109.8793    -0.64   0.520    -286.0091    144.7096
                                BSZE |  -1.578545   3.629034    -0.43   0.664    -8.691321    5.534232
                               IND_P |  -1.513525    1.22047    -1.24   0.215    -3.905602    .8785514
                                 ID1 |   18.11663   97.57612     0.19   0.853     -173.129    209.3623
                                 ID2 |   7.156462   65.52682     0.11   0.913    -121.2737    135.5867
                                 ID3 |   16.69424   106.4915     0.16   0.875    -192.0252    225.4136
                                 ID4 |   -28.0001   83.33812    -0.34   0.737    -191.3398    135.3396
                                 ID5 |   42.28515   99.73981     0.42   0.672    -153.2013    237.7716
                                 YD1 |  -14.67081   27.28439    -0.54   0.591    -68.14724    38.80561
                                 YD2 |  -10.77153   18.99932    -0.57   0.571    -48.00952    26.46647
                                 YD3 |  -7.408057    9.57917    -0.77   0.439    -26.18288    11.36677
                                 YD4 |          0  (omitted)
                               _cons |   11.46227   760.6348     0.02   0.988    -1479.355    1502.279
                        . estat overid
                        Sargan-Hansen test of the overidentifying restrictions
                        H0: overidentifying restrictions are valid
                        2-step moment functions, 2-step weighting matrix       chi2(10)    =    4.6870
                                                                               Prob > chi2 =    0.9111
                        2-step moment functions, 3-step weighting matrix       chi2(10)    =    6.3643
                                                                               Prob > chi2 =    0.7838
                        . estat serial
                        Arellano-Bond test for autocorrelation of the first-differenced residuals
                        H0: no autocorrelation of order 1:     z =   -0.3265   Prob > |z|  =    0.7440
                        H0: no autocorrelation of order 2:     z =   -0.3837   Prob > |z|  =    0.7012

                        Thanks in advance!


                        • I am afraid I do not have a good answer. I notice that your standard errors are all very large, which might be a consequence of weak instruments. You could try adding nonlinear moment conditions, e.g. with the option nl(noserial), although I am not sure whether that will improve the situation.


                          • Dear Sir,

                            The nl(noserial) option is not working:

                            xtdpdgmm TC L.TC ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P ID* YD*,twostep vce(cluster cid) nl(noserial) gmmiv (L.TC, lag(0 0) collapse model (fodev)) gmmiv (ROEP, lag(0 1) collapse model (fodev)) gmmiv (ROET3, lag(0 1) collapse model (fodev)) gmmiv (T3, lag(0 1) collapse model (fodev)) gmmiv (LFSIZE, lag(0 1) collapse model (fodev)) gmmiv (LFAGE, lag(0 1) collapse model (fodev)) gmmiv (LEV, lag(0 1) collapse model (fodev)) gmmiv (RISK, lag(0 1) collapse model (fodev)) gmmiv (BSZE, lag(0 1) collapse model (fodev)) gmmiv (IND_P, lag(0 1) collapse model (fodev)) gmmiv (CEOD, lag(0 1) collapse model (fodev)) gmmiv (ID*, lag(0 0) collapse model (level)) gmmiv (YD*, lag(0 0) collapse model (level)) nofootnote
                            Generalized method of moments estimation
                            Fitting full model:
                            Step 1:
                            initial:       f(b) =   18948478
                            alternative:   f(b) =   18911537
                            rescale:       f(b) =  7130192.3
                            Iteration 0:   f(b) =  7130192.3  (not concave)
                            Iteration 1:   f(b) =  949992.37  (not concave)
                            Iteration 2:   f(b) =  278578.86  (not concave)
                            Iteration 3:   f(b) =  151619.85  (not concave)
                            Iteration 4:   f(b) =  120342.38  (not concave)
                            Iteration 5:   f(b) =  97568.645  (not concave)
                            Iteration 6:   f(b) =  81329.831  (not concave)
                            Iteration 7:   f(b) =  70476.142  (not concave)
                            Iteration 8:   f(b) =  59107.138  (not concave)
                            Iteration 9:   f(b) =  52308.153  (not concave)
                            Iteration 10:  f(b) =  43053.096  (not concave)
                            Iteration 11:  f(b) =  33223.071  (not concave)
                            Iteration 12:  f(b) =  29835.377  (not concave)
                            Iteration 13:  f(b) =  16909.705  (not concave)
                            Iteration 14:  f(b) =  15002.974  (not concave)
                            Iteration 15:  f(b) =  14232.683  (not concave)
                            Iteration 16:  f(b) =  7718.8442  (not concave)
                            Iteration 17:  f(b) =  7532.1743  (not concave)
                            Iteration 18:  f(b) =  7366.5578  (not concave)
                            Iteration 19:  f(b) =  7217.5852  (not concave)
                            Iteration 20:  f(b) =  7078.5422  (not concave)
                            Iteration 21:  f(b) =  6947.9369  (not concave)
                            Iteration 22:  f(b) =  6826.4571  (not concave)
                            Iteration 23:  f(b) =  6711.9993  (not concave)
                            Iteration 24:  f(b) =  6604.6715  (not concave)
                            Iteration 25:  f(b) =  6503.2638  (not concave)
                            Iteration 26:  f(b) =  6408.0337  (not concave)
                            Iteration 27:  f(b) =  6317.8641  (not concave)
                            Iteration 28:  f(b) =  6232.9867  (not concave)
                            Iteration 29:  f(b) =  6152.4831  (not concave)
                            Iteration 30:  f(b) =  6076.5509  (not concave)
                            Iteration 31:  f(b) =  6004.4055  (not concave)
                            Iteration 32:  f(b) =  5936.2248  (not concave)
                            Iteration 33:  f(b) =  5871.3394  (not concave)
                            Iteration 34:  f(b) =  5809.9042  (not concave)
                            Iteration 35:  f(b) =  5751.3458  (not concave)
                            Iteration 36:  f(b) =  5695.8011  (not concave)
                            Iteration 37:  f(b) =  5642.7766  (not concave)
                            Iteration 38:  f(b) =   5592.393  (not concave)
                            Iteration 39:  f(b) =  5544.2241  (not concave)
                            Iteration 40:  f(b) =  5498.3769  (not concave)
                            Iteration 41:  f(b) =  5454.4817  (not concave)
                            Iteration 42:  f(b) =  5412.6337  (not concave)
                            Iteration 43:  f(b) =  5372.5111  (not concave)
                            Iteration 44:  f(b) =  5334.1987  (not concave)
                            Iteration 45:  f(b) =  5297.4154  (not concave)
                            Iteration 46:  f(b) =  5262.2372  (not concave)
                            Iteration 47:  f(b) =  5228.4177  (not concave)
                            Iteration 48:  f(b) =  5196.0252  (not concave)
                            Iteration 49:  f(b) =  5164.8429  (not concave)
                            Iteration 50:  f(b) =  5134.9323  (not concave)
                            Iteration 51:  f(b) =  5106.1022  (not concave)
                            Iteration 52:  f(b) =  5078.4083  (not concave)
                            Iteration 53:  f(b) =  5051.6812  (not concave)
                            Iteration 54:  f(b) =  5025.9715  (not concave)
                            Iteration 55:  f(b) =  5001.1288  (not concave)
                            Iteration 56:  f(b) =  4977.1991  (not concave)
                            Iteration 57:  f(b) =  4954.0487  (not concave)
                            Iteration 58:  f(b) =  4931.7194  (not concave)
                            Iteration 59:  f(b) =  4910.0918  (not concave)
                            Iteration 60:  f(b) =  4889.2043  (not concave)
                            Iteration 61:  f(b) =  4868.9499  (not concave)
                            Iteration 62:  f(b) =  4849.3638  (not concave)
                            Iteration 63:  f(b) =  4830.3502  (not concave)
                            Iteration 64:  f(b) =  4811.9413  (not concave)
                            Iteration 65:  f(b) =  4794.0509  (not concave)
                            Iteration 66:  f(b) =  4776.7087  (not concave)
                            Iteration 67:  f(b) =   4759.837  (not concave)
                            Iteration 68:  f(b) =  4743.4633  (not concave)
                            Iteration 69:  f(b) =  4727.5172  (not concave)
                            Iteration 70:  f(b) =  4712.0243  (not concave)
                            Iteration 71:  f(b) =  4696.9209  (not concave)
                            Iteration 72:  f(b) =  4682.2305  (not concave)
                            These are the results from a random-effects model, in which most of the variables are significant. Only after applying system GMM do I get insignificant results. Could you please suggest a solution based on these results?

                            xtreg TC ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P ID1 ID2 ID3 ID4 ID5 YD1 YD2 YD3 YD4, re vce (cluster cid)
                            Random-effects GLS regression                   Number of obs     =      1,005
                            Group variable: cid                             Number of groups  =        201
                            R-sq:                                           Obs per group:
                                 within  = 0.2511                                         min =          5
                                 between = 0.4991                                         avg =        5.0
                                 overall = 0.4145                                         max =          5
                                                                            Wald chi2(19)     =     110.21
                            corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                                                              (Std. Err. adjusted for 201 clusters in cid)
                                         |               Robust
                                      TC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                                    ROEP |   11.81289   3.695961     3.20   0.001     4.568942    19.05684
                                   ROET3 |  -.1636423   .0533337    -3.07   0.002    -.2681745   -.0591101
                                      T3 |   .9268638   .5153631     1.80   0.072    -.0832294    1.936957
                                  LFSIZE |   21.30412   3.564797     5.98   0.000     14.31725    28.29099
                                   LFAGE |  -11.82151   7.810607    -1.51   0.130    -27.13002    3.486997
                                     LEV |  -20.22973   20.30682    -1.00   0.319    -60.03036     19.5709
                                    RISK |    1.05389   13.74748     0.08   0.939    -25.89068    27.99846
                                    CEOD |   23.02698   11.60022     1.99   0.047     .2909595      45.763
                                    BSZE |   3.099498   1.382514     2.24   0.025     .3898211    5.809175
                                   IND_P |   .9007423   .5228959     1.72   0.085    -.1241148    1.925599
                                     ID1 |  -47.37584   32.43792    -1.46   0.144     -110.953    16.20132
                                     ID2 |   3.698717   13.37388     0.28   0.782    -22.51361    29.91104
                                     ID3 |  -12.53249   14.97791    -0.84   0.403    -41.88865    16.82367
                                     ID4 |   2.917077   18.43619     0.16   0.874    -33.21719    39.05134
                                     ID5 |   30.07731   19.60296     1.53   0.125    -8.343787     68.4984
                                     YD1 |   1.671931   4.416635     0.38   0.705    -6.984515    10.32838
                                     YD2 |     1.6644   6.062894     0.27   0.784    -10.21865    13.54745
                                     YD3 |   4.077982    5.29307     0.77   0.441    -6.296245    14.45221
                                     YD4 |   15.10113   6.991724     2.16   0.031     1.397606    28.80466
                                   _cons |   -273.697   74.47474    -3.68   0.000    -419.6648   -127.7292
                                 sigma_u |  53.650479
                                 sigma_e |  56.558824
                                     rho |  .47362907   (fraction of variance due to u_i)


                            • The reason for the non-convergence with the nl(noserial) option is the perfect collinearity among your time dummies. You need to manually drop one of those time dummies.

                              A random-effects (or fixed-effects) regression makes much stronger assumptions that effectively lead to much stronger instruments. In particular, all variables are assumed to be strictly exogenous.

                              You could start with a dynamic model that assumes all variables (other than the lagged dependent variable) to be strictly exogenous, and then relax this assumption for one variable after the other to see whether a particular variable is causing the trouble. That is, start with the following specification:
                              xtdpdgmm TC L.TC ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P ID* YD2 YD3 YD4, twostep vce(cluster cid) collapse gmmiv(L.TC, lag(0 0) model(fodev)) gmmiv(ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P, lag(0 1) model (fodev)) gmmiv(ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P, lag(0 0) model(mdev)) iv(ID* YD2 YD3 YD4, model (level)) nofooter
                              The part in red, i.e. the gmmiv() term with model(mdev), contains the extra instruments that are valid only under strict exogeneity.
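
                              As a sketch of the next step, assuming ROEP is the first variable for which strict exogeneity is relaxed (the choice of ROEP is only an illustration; since ROET3 is its interaction term, it might need the same treatment), ROEP is removed from the model(mdev) instruments and everything else stays unchanged:

                              ```stata
                              * ROEP is no longer in the model(mdev) list; it keeps only its model(fodev) instruments
                              xtdpdgmm TC L.TC ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P ID* YD2 YD3 YD4, ///
                                  twostep vce(cluster cid) collapse ///
                                  gmmiv(L.TC, lag(0 0) model(fodev)) ///
                                  gmmiv(ROEP ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P, lag(0 1) model(fodev)) ///
                                  gmmiv(ROET3 T3 LFSIZE LFAGE LEV RISK CEOD BSZE IND_P, lag(0 0) model(mdev)) ///
                                  iv(ID* YD2 YD3 YD4, model(level))
                              * Re-check the specification tests after each change
                              estat overid
                              estat serial
                              ```

                              Comparing the coefficients and test results with those from the fully strictly exogenous specification may indicate whether ROEP is the source of the trouble.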


                              • First of all, thank you for your prompt response.

                                Sir, I have a study period of five years, so I created only four year dummies (with the first year as the base year), following the n-1 rule. Do I still need to drop one of the year dummies?

                                I am a beginner and am facing a lot of problems applying system GMM for my thesis, so I might ask some silly questions. Apologies in advance. I will follow your advice on the dynamic model and hope it helps.

