  • Originally posted by Sebastian Kripfganz
    I cannot replicate your problem. With the following example, predict gives me exactly the same predicted values as when I calculate them manually:
    Code:
    . webuse abdata
    . xtdpdgmm L(0/1).n w k, gmm(L.n w k, l(0 3) c m(fod)) gmm(L.n w k, l(0 0) d c m(level)) two vce(r) teffects
    . predict yhat if e(sample)
    . gen yhat_manual = _b[L1.n] * L1.n + _b[w] * w + _b[k] * k + _b[1978.year] * 1978.year + _b[1979.year] * 1979.year + _b[1980.year] * 1980.year + _b[1981.year] * 1981.year + _b[1982.year] * 1982.year + _b[1983.year] * 1983.year + _b[1984.year] * 1984.year + _b[_cons] if e(sample)
    Code:
    . sum yhat yhat_manual
    
        Variable |        Obs        Mean    Std. Dev.        Min        Max
    -------------+---------------------------------------------------------
            yhat |        891    1.043574    1.426621  -2.375833   5.083724
     yhat_manual |        891    1.043574    1.426621  -2.375833   5.083724
    Did you make sure that the coefficients of the year dummies are only added to the predictions of the respective year?
    Regarding the command you mention, I have a few questions.

    1. The lag range starts at 0 in both instances. Is there a particular reason for this? What difference would it make if it started at 1 instead? When using 'fodev', where should the lag range ideally start?
    2. I notice that you use gmm() twice, whereas gmmiv() is not used at all. As gmmiv() is used to specify endogenous variables, should I assume there were no endogenous variables in your model?
    3. In the second gmm(), the lag range starts and ends at 0. What does it mean when the lag range starts and ends at the same point?

    Thanks!

    Comment


    • Originally posted by Prateek Bedi

      Regarding the command you mention, I have a few questions.

      1. The lag range starts at 0 in both instances. Is there a particular reason for this? What difference would it make if it started at 1 instead? When using 'fodev', where should the lag range ideally start?
      2. I notice that you use gmm() twice, whereas gmmiv() is not used at all. As gmmiv() is used to specify endogenous variables, should I assume there were no endogenous variables in your model?
      3. In the second gmm(), the lag range starts and ends at 0. What does it mean when the lag range starts and ends at the same point?

      Thanks!

      Please ignore point #2. It is irrelevant. Instead, I just wanted to know why gmm() is used twice in the same command.

      Comment


      • The first gmm() option (which is an abbreviation for gmmiv()) refers to the GMM-style instruments for the model in forward-orthogonal deviations, m(fod), while the second gmm() option refers to the instruments for the model in levels, m(level).

        Please have a look at my 2019 London Stata Conference presentation. Slide 67 tells you the admissible lags under forward-orthogonal deviations. For strictly exogenous regressors (which I implicitly assumed here for w and k), any lag can be used. For predetermined regressors (L.n), lag 0 is the first admissible lag. In my specification, I assumed indeed that there are no endogenous variables (with respect to the idiosyncratic error term) in the model.

        For the level model, slide 31 tells you the additionally available instruments. For strictly exogenous and predetermined regressors, lag 0 of the first-differenced instruments is usually used (because it has the strongest correlation with the instrumented variables compared to any other lag). It is common practice not to use multiple lags as instruments for the level model (based on the idea that further lags would be redundant if all available lags were used for the transformed model).
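
        As a minimal sketch of how the admissible lags would change, suppose (purely for illustration) that w were endogenous rather than strictly exogenous. My understanding is that the first admissible lag under forward-orthogonal deviations would then be 1 instead of 0, and that the level model would use lag 1 of the differenced instrument. The split into separate gmm() options is an arrangement made for this sketch, not part of the specification above:

        ```stata
        webuse abdata, clear
        * Assumption for illustration only: w endogenous, L.n predetermined, k strictly exogenous
        xtdpdgmm L(0/1).n w k, gmm(L.n k, l(0 3) c m(fod)) gmm(w, l(1 3) c m(fod)) gmm(L.n k, l(0 0) d c m(level)) gmm(w, l(1 1) d c m(level)) two vce(r) teffects
        ```

        Comparing these estimates with the original specification gives a rough sense of how sensitive the results are to the exogeneity assumption for w.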
        https://twitter.com/Kripfganz

        Comment


        • Thanks a lot, Prof. Sebastian, for your helpful guidance once again! I really appreciate it!

          Comment


          • Hello, I have a question about how to treat unit-root data. I am currently estimating a GMM model in differences like the following one:
            Code:
            xtdpdgmm L(0/1).n w k ys*, model(difference) gmm(L.n w) iv(k ys*, d) nocons vce(r)
            My issue is that the key independent variable of interest (w) has a unit root, while d.w is stationary.
            My question is: am I correct to run the model with w in levels, since the estimator differences the data automatically, or should I run the following model instead?

            Code:
            xtdpdgmm L(0/1).n d.w k ys*, model(difference) gmm(L.n d.w) iv(k ys*, d) nocons vce(r)

            Thanks a lot for your help!

            Comment


            • If your underlying theory requires w to enter in levels, then you should not transform this variable. Instead, it might be a good idea to also add L.w as another regressor. The data can then speak for itself, i.e. if the estimated coefficients of w and L.w are about the same with opposite signs, then this would be equivalent to directly estimating the model with D.w.

              In any case, lagged differences of w might be weak instruments.
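
              A minimal sketch of this check, using the abdata example data rather than your data (the instrument choices below are placeholders for illustration, not a recommendation):

              ```stata
              webuse abdata, clear
              * Keep w in levels and add L.w as a further regressor
              xtdpdgmm L(0/1).n L(0/1).w k, model(difference) gmm(L.n w, lag(1 3) collapse) iv(k, d) nocons vce(r)
              * If only D.w matters, the two coefficients should sum to roughly zero
              test _b[w] + _b[L1.w] = 0
              ```

              A failure to reject this restriction would be in line with a specification in D.w, although a test based on weak instruments should be read with caution.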

              Comment


              • Hello, I have a question about my model, and I was wondering whether you could help me check whether it is specified correctly. I aim to run equivalent models using the one-step and two-step difference-GMM estimators and the iterated GMM estimator, to make sure that my estimates do not depend on the estimation method. I am using xtabond2 for the one-step and two-step estimation and xtdpdgmm for the iterated model.

                The models I am estimating are the following:
                Code:
                * the one step GMM
                xi: xtabond2 y l.y l.x $controls_lag yeardum*, gmm(y x $controls, lag(2 5) collapse) iv(yeardum*) noleveleq small noconstant robust 
                
                * the two step GMM
                xi: xtabond2 y l.y l.x $controls_lag yeardum*, gmm(y x $controls, lag(2 5) collapse) iv(yeardum*) noleveleq small two noconstant robust 
                
                * Iterated
                xi: xtdpdgmm L(0/1).y l.x $controls, model(diff) collapse gmm(y x  $controls, lag(2 5) collapse) igmm vce(r) small noconstant teffects igmmiterate(100)
                I am not sure that these model specifications are equivalent, because the one-step and two-step GMM estimations report 1627 observations while the iterated model reports 1723. Is this normal, or am I misspecifying the model without realizing it?

                Thank you very much in advance for your help. I would be more than grateful for any guidance you could provide.

                Best

                Comment


                • Correction to my previous post: the command I am using for the iterated model is the following:
                  Code:
                  xi: xtdpdgmm L(0/1).y l.x $controls_lag, model(diff) collapse gmm(y x $controls, lag(2 5) collapse) igmm vce(r) small noconstant teffects igmmiterate(100)

                  Comment


                  • The model specifications appear to be equivalent. xtabond2 reports the number of observations for the first-differenced model while xtdpdgmm always reports the number of observations for the untransformed levels model. There is 1 observation less per group in the first-differenced model.
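
                    As a rough check on the numbers you report, assuming every group loses exactly one observation to first-differencing (no gaps within the panels), the difference between the two counts should equal the number of groups:

                    ```stata
                    * 1723 levels observations (xtdpdgmm) vs. 1627 first-differenced observations (xtabond2)
                    display 1723 - 1627
                    * Run after the xtdpdgmm estimation; assumes e(N) and e(N_g) hold the numbers of observations and groups
                    display e(N) - e(N_g)
                    ```

                    The first line gives 96, so the two outputs would be consistent if both report 96 groups.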

                    Comment


                    • Great, thank you very much for this super quick reply and for the explanation!

                      Comment


                      • Hello,
                        I have a question about the use of time fixed effects in the xtdpdgmm command. I am running the following one-step difference-GMM estimation:
                        Code:
                         xtdpdgmm L(0/1).y l.x $controls_lag yeardum*, model(difference) gmm(y x $controls, lag(2 5)) iv(yeardum*) collapse nocons vce(r) one
                        
                        * which provides identical estimates to the xtabond2 estimation below, even though the standard errors are slightly different.
                        
                        xi: xtabond2 y l.y l.x $controls_lag yeardum*, gmm(y x $controls, lag(2 5) collapse) iv(yeardum*) noleveleq small noconstant robust
                        However, if I remove the year dummies from the model and use the teffects option instead, my estimates change significantly.
                        Why is this the case, in your opinion? Is there a misspecification in my use of teffects?

                        Code:
                         xtdpdgmm L(0/1).y l.x $controls_lag, model(difference) gmm(y x $controls, lag(2 5)) collapse nocons vce(r) one teffects
                        I am attaching a photo of the results of the 3 models calculated in the specified order. (The 4th row is the autoregressive parameter.)


                        [Image: Schermata 2020-02-11 alle 15.15.49.jpg]


                        Thank you very much in advance for taking the time to answer these very applied questions. I ask them because I want to move from xtabond2 to xtdpdgmm in my research, and I want to make sure that I am using it correctly.

                        Comment


                        • Standard errors seem to be different because you did not specify the small option with xtdpdgmm.

                          The teffects option always creates instruments for the time dummies in the levels model even if you specify model(difference)! While there is nothing wrong with that given that time dummies are exogenous, it may sometimes be preferable to specify those instruments for the differenced model, in particular if it is your aim to estimate the model with the one-step difference-GMM estimator.
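
                          A minimal sketch of that alternative with the abdata example data, assuming its yr1978 to yr1984 dummies stand in for your yeardum* variables. The time dummies are entered by hand and instrumented in the differenced model, and small is added for comparability with xtabond2:

                          ```stata
                          webuse abdata, clear
                          * Time dummies specified manually instead of via teffects (illustrative choice of years)
                          local yrs yr1978 yr1979 yr1980 yr1981 yr1982 yr1983 yr1984
                          xtdpdgmm L(0/1).n w k `yrs', model(difference) gmm(n w k, lag(2 5)) iv(`yrs') collapse nocons vce(r) one small
                          ```

                          Because the local macro is defined in the same block, the lines should be run together, for example from a do-file.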

                          Comment


                          • Hello,

                            I am working on corporate cash holdings using data for 1696 firms over the period 2001-16. I ran two regressions with the following commands and outputs (only the lag ranges differ between the two regressions; the rest of the command is the same). In the first regression, the variable 'Dividend2' (a dummy variable that takes the value 1 when the firm pays a dividend in a particular year) is omitted, and in the second regression, the constant is omitted. In fact, one or the other of these two terms is omitted whenever I run these commands with different lag ranges. I am not able to figure out why only one of them (Dividend2 or the constant) gets reported while the other is omitted.

                            Code:
                            xtdpdgmm CashHoldings2 L.CashHoldings2 Size1 Leverage1 Liquidity2 Profitability4 GrowthPotential2 OperatingCashflow Dividend2 CapitalExpenditure1 CashFlowVol15years WPromoterSharesin1 c.SIR2#c.L.CashHoldings2 if ExcessCashDummy2==1, teffects twostep vce(cluster CompanyID) gmmiv(L.CashHoldings2, lag(1 8) coll  model(fodev)) gmmiv(Leverage1 Liquidity2 GrowthPotential2 Dividend2 CapitalExpenditure1 c.SIR2#c.L.CashHoldings2, lag(1 6) coll model(fodev)) iv( Year* Size1 Profitability4  WPromoterSharesin1 CashFlowVol15years OperatingCashflow, model(level)) nofootnote
                            Code:
                            Group variable: CompanyID                    Number of obs         =      3067
                            Time variable: Year                          Number of groups      =       630
                            
                            Moment conditions:     linear =      64      Obs per group:    min =         1
                                                nonlinear =       0                        avg =  4.868254
                                                    total =      64                        max =        14
                            
                                                                   (Std. Err. adjusted for 630 clusters in CompanyID)
                            -----------------------------------------------------------------------------------------
                                                    |              WC-Robust
                                      CashHoldings2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            ------------------------+----------------------------------------------------------------
                                      CashHoldings2 |
                                                L1. |   .9202039   .2110783     4.36   0.000      .506498     1.33391
                                                    |
                                              Size1 |   -.000045   .0000642    -0.70   0.483    -.0001709    .0000808
                                          Leverage1 |   .0051676   .0012412     4.16   0.000     .0027349    .0076002
                                         Liquidity2 |  -.0015128   .0012105    -1.25   0.211    -.0038853    .0008598
                                     Profitability4 |   .0116191   .0023033     5.04   0.000     .0071047    .0161335
                            GrowthPotential2TobinsQ |  -.0009329   .0003077    -3.03   0.002     -.001536   -.0003297
                                  OperatingCashflow |   .0048039   .0008251     5.82   0.000     .0031867    .0064211
                                          Dividend2 |          0  (omitted)
                                CapitalExpenditure1 |  -.0071506   .0028852    -2.48   0.013    -.0128055   -.0014958
                                 CashFlowVol15years |   .0178048   .0019959     8.92   0.000      .013893    .0217166
                                 WPromoterSharesin1 |  -.0013006   .0004333    -3.00   0.003    -.0021498   -.0004514
                                                    |
                            c.SIR2#cL.CashHoldings2 |  -5.289581   3.051242    -1.73   0.083    -11.26991    .6907433
                                                    |
                                               Year |
                                              2003  |   .0000116   .0002457     0.05   0.962      -.00047    .0004932
                                              2004  |   .0000302   .0002628     0.11   0.909    -.0004849    .0005453
                                              2005  |   .0006248   .0003621     1.73   0.084     -.000085    .0013345
                                              2006  |    .000956   .0003532     2.71   0.007     .0002637    .0016483
                                              2007  |   .0011733   .0003632     3.23   0.001     .0004615    .0018851
                                              2008  |   .0005805   .0002797     2.08   0.038     .0000323    .0011286
                                              2009  |   .0002752   .0002446     1.13   0.260    -.0002041    .0007545
                                              2010  |   .0007216   .0004453     1.62   0.105    -.0001511    .0015943
                                              2011  |   .0000444   .0003437     0.13   0.897    -.0006292    .0007179
                                              2012  |   .0001928   .0002958     0.65   0.515     -.000387    .0007727
                                              2013  |  -.0004563   .0002832    -1.61   0.107    -.0010113    .0000987
                                              2014  |   .0003354   .0003359     1.00   0.318     -.000323    .0009938
                                              2015  |   .0010732   .0004351     2.47   0.014     .0002205    .0019259
                                              2016  |   .0014033   .0004847     2.90   0.004     .0004533    .0023532
                                                    |
                                              _cons |   .0006629   .0008362     0.79   0.428    -.0009761    .0023018
                            -----------------------------------------------------------------------------------------
                            Code:
                            xtdpdgmm CashHoldings2 L.CashHoldings2 Size1 Leverage1 Liquidity2 Profitability4 GrowthPotential2 OperatingCashflow Dividend2 CapitalExpenditure1 CashFlowVol15years WPromoterSharesin1 c.SIR2#c.L.CashHoldings2 if ExcessCashDummy2==1, teffects twostep vce(cluster CompanyID) gmmiv(L.CashHoldings2, lag(1 1) coll  model(fodev)) gmmiv(Leverage1 Liquidity2 GrowthPotential2 Dividend2 CapitalExpenditure1 c.SIR2#c.L.CashHoldings2, lag(1 6) coll model(fodev)) iv( Year* Size1 Profitability4  WPromoterSharesin1 CashFlowVol15years OperatingCashflow, model(level)) nofootnote
                            Code:
                            Group variable: CompanyID                    Number of obs         =      3067
                            Time variable: Year                          Number of groups      =       630
                            
                            Moment conditions:     linear =      57      Obs per group:    min =         1
                                                nonlinear =       0                        avg =  4.868254
                                                    total =      57                        max =        14
                            
                                                                   (Std. Err. adjusted for 630 clusters in CompanyID)
                            -----------------------------------------------------------------------------------------
                                                    |              WC-Robust
                                      CashHoldings2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            ------------------------+----------------------------------------------------------------
                                      CashHoldings2 |
                                                L1. |   .9731027   .2012846     4.83   0.000     .5785921    1.367613
                                                    |
                                              Size1 |  -.0000497   .0000688    -0.72   0.470    -.0001845     .000085
                                          Leverage1 |    .004433   .0013046     3.40   0.001      .001876    .0069899
                                         Liquidity2 |  -.0010752    .001213    -0.89   0.375    -.0034527    .0013023
                                     Profitability4 |   .0097248   .0025618     3.80   0.000     .0047037     .014746
                            GrowthPotential2TobinsQ |  -.0007632   .0003483    -2.19   0.028    -.0014458   -.0000806
                                  OperatingCashflow |   .0045713   .0008541     5.35   0.000     .0028974    .0062453
                                          Dividend2 |   .0008311   .0008739     0.95   0.342    -.0008817    .0025439
                                CapitalExpenditure1 |  -.0056351   .0031341    -1.80   0.072    -.0117778    .0005076
                                 CashFlowVol15years |    .017407   .0020669     8.42   0.000     .0133559     .021458
                                 WPromoterSharesin1 |   -.001282   .0004315    -2.97   0.003    -.0021278   -.0004363
                                                    |
                            c.SIR2#cL.CashHoldings2 |  -5.954595   2.924771    -2.04   0.042    -11.68704   -.2221496
                                                    |
                                               Year |
                                              2003  |  -.0000296   .0002446    -0.12   0.904     -.000509    .0004497
                                              2004  |  -.0000637    .000269    -0.24   0.813    -.0005909    .0004635
                                              2005  |   .0004414   .0003817     1.16   0.248    -.0003069    .0011896
                                              2006  |   .0007825   .0003749     2.09   0.037     .0000476    .0015173
                                              2007  |   .0009998   .0003916     2.55   0.011     .0002323    .0017674
                                              2008  |   .0004919   .0002916     1.69   0.092    -.0000796    .0010633
                                              2009  |   .0001564   .0002715     0.58   0.565    -.0003757    .0006885
                                              2010  |   .0005933   .0004476     1.33   0.185    -.0002839    .0014706
                                              2011  |  -.0000429   .0003644    -0.12   0.906    -.0007572    .0006714
                                              2012  |   .0001835   .0003093     0.59   0.553    -.0004227    .0007898
                                              2013  |  -.0004763   .0002945    -1.62   0.106    -.0010536     .000101
                                              2014  |   .0002245   .0003674     0.61   0.541    -.0004956    .0009446
                                              2015  |   .0008169   .0004961     1.65   0.100    -.0001555    .0017892
                                              2016  |   .0011492   .0005454     2.11   0.035     .0000803    .0022181
                                                    |
                                              _cons |          0  (omitted)
                            -----------------------------------------------------------------------------------------
                            Thanks!


                            Comment


                            • My guess is that there is a problem of perfect collinearity (of Dividend2?) with the time dummies. This is not currently checked by the program. You might have to specify the time dummies manually instead of using the teffects option (with dummies from 2004 to 2016 only), although the Dividend2 coefficient in that case might still be difficult to interpret because it would essentially just capture the omitted time effect.
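
                              A minimal sketch of specifying the time dummies manually, illustrated with the abdata example data because your data are not available to me. The yd* variable names are hypothetical, and the first year dummy that teffects would create (1978 here, corresponding to 2003 in your output) is deliberately left out:

                              ```stata
                              webuse abdata, clear
                              * Hand-made year dummies; 1978 is left out on purpose
                              forvalues y = 1979/1984 {
                                  generate byte yd`y' = (year == `y')
                              }
                              * No teffects option: the dummies enter as regressors and as instruments for the level model
                              xtdpdgmm L(0/1).n w k yd1979-yd1984, gmm(L.n w k, lag(0 3) collapse model(fodev)) iv(yd1979-yd1984, model(level)) twostep vce(r)
                              ```

                              In your case the loop would run over 2004/2016 and use the Year variable.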

                              Comment


                              • Originally posted by Sebastian Kripfganz
                                My guess is that there is a problem of perfect collinearity (of Dividend2?) with the time dummies. This is not currently checked by the program. You might have to specify the time dummies manually instead of using the teffects option (with dummies from 2004 to 2016 only), although the Dividend2 coefficient in that case might still be difficult to interpret because it would essentially just capture the omitted time effect.

                                You are right, Prof. Kripfganz. Thanks a lot for your valuable response!

                                Comment
