
  • Alright, Prof. Kripfganz. Thanks for this clarification. I think I should rely on theoretical arguments to decide how to categorise the variables. Nonetheless, it would be helpful if you could mention the slide numbers I should look at in your 2019 London Stata Conference presentation.

    Comment


    • Hi,

      Using xtdpdgmm, I am examining the influence of macroeconomic variables on the speed of adjustment of corporate cash holdings for an unbalanced panel of 1,696 firms over the period 2001-16. To capture the effect of a macroeconomic variable (say, GDP) on the speed of adjustment of corporate cash holdings, I use an interaction term between GDP and lagged cash holdings. The model is given below.

      Code:
      xtdpdgmm CashHoldings1 L.CashHoldings1 Size1 Leverage1 Liquidity1 GrowthPotential2 c.GDPGrowthRatein#c.L.CashHoldings1, teffects twostep vce(cluster CompanyID) gmmiv(L.CashHoldings1, lag(0 6) model(fodev)) gmmiv(c.GDPGrowthRatein#c.L.CashHoldings1, lag(1 1) model(fodev)) gmmiv(Leverage1, lag(1 1) model(fodev)) gmmiv(Liquidity1, lag(1 1) model(fodev)) gmmiv(GrowthPotential2, lag(1 1) model(fodev)) gmmiv(Size1, lag(1 1) coll model(fodev)) nofootnote
      As per my understanding, when we introduce an interaction term between X1 and X2 in a regression model, it is important to include X1 and X2 individually as explanatory variables in order to avoid model misspecification and the resulting bias in the estimates. However, in my model, one of the variables involved in the interaction is GDP, i.e. a cross-sectionally invariant variable. Since I include time dummies in my model to control for the effect of such macroeconomic variables, I am unsure whether I should explicitly include GDP as an explanatory variable. If I do include GDP, omitted-variable bias may affect my estimates because GDP is potentially correlated with other macroeconomic variables (which I do not include in my model). On the other hand, if I do not include GDP, I am sceptical about the validity of the coefficient of the interaction term.

      Thanks!

      Comment


      • Originally posted by Prateek Bedi View Post
        I think I should rely on theoretical arguments to decide how to categorise the variables. Nonetheless, it would be helpful if you could mention the slide numbers I should look at in your 2019 London Stata Conference presentation.
        Slides 90 and following.

        Originally posted by Prateek Bedi View Post
        As per my understanding, when we introduce an interaction term between X1 and X2 in a regression model, it is important to include X1 and X2 individually as explanatory variables in order to avoid model misspecification and the resulting bias in the estimates. However, in my model, one of the variables involved in the interaction is GDP, i.e. a cross-sectionally invariant variable. Since I include time dummies in my model to control for the effect of such macroeconomic variables, I am unsure whether I should explicitly include GDP as an explanatory variable. If I do include GDP, omitted-variable bias may affect my estimates because GDP is potentially correlated with other macroeconomic variables (which I do not include in my model). On the other hand, if I do not include GDP, I am sceptical about the validity of the coefficient of the interaction term.
        As you correctly observed, you cannot include GDP as a regressor because it would be perfectly collinear with the time dummies. The effect of GDP is implicitly included in the time effects, so there is no concern about the validity of the interaction term. However, you cannot identify the marginal effect of GDP in this model. You can only identify the effect that GDP has on the effect of the other variable in the interaction term.
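        For illustration only (this sketch is not part of the original reply): after the estimation above, the coefficient on lagged cash holdings implied at a particular GDP growth rate can be computed with lincom from the stored results. The value 0.02 and the coefficient names below are placeholders; the exact names should be checked with matrix list e(b).

        Code:
        * Sketch: coefficient on L.CashHoldings1 implied when GDPGrowthRatein = 0.02.
        * Verify the interaction term's coefficient name with -matrix list e(b)- first.
        lincom _b[L.CashHoldings1] + 0.02*_b[c.GDPGrowthRatein#cL.CashHoldings1]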
        https://twitter.com/Kripfganz

        Comment


        • Thanks a lot, Prof. Kripfganz, for your response. I really appreciate it.

          Comment


          • Thank you so much, Prof. Kripfganz, for helping us understand GMM and more. I take the liberty of asking some questions that will hopefully help me mitigate the autocorrelation problem in my models.
            Let y be the dependent variable; I am interested in the relation between y and the interaction between x1 and x2 (inter_x1_x2). All the other variables (x3 onwards) are standard control variables used in the literature. The data cover 1992-2015. The variable y is a firm's corporate social performance, and I expect it to be correlated with prior social performance. _y* denotes all the year dummies.

            1. xtabond2 L(0/2).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8 x9) _y*, gmmstyle(y, lag(3 .) collapse) gmmstyle(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8 x9, lag(2 .) collapse) ivstyle(_y*, equation(level)) nodif twostep
            Dynamic panel-data estimation, two-step system GMM
            ------------------------------------------------------------------------------
            Group variable: idDummy Number of obs = 18682
            Time variable : year Number of groups = 2751
            Number of instruments = 250 Obs per group: min = 1
            Wald chi2(37) = 27722.08 avg = 6.79
            Prob > chi2 = 0.000 max = 20
            ------------------------------------------------------------------------------
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .5661621 .0122728 46.13 0.000 .5421078 .5902164
            L2. | .1574914 .0131169 12.01 0.000 .1317828 .1831999
            |
            x1 |
            L1. | -.0194 .0085448 -2.27 0.023 -.0361475 -.0026526
            |
            x2 |
            L1. | .0040796 .0159145 0.26 0.798 -.0271123 .0352715
            |
            cL.x1#cL.x2 | .0053554 .0037109 1.44 0.149 -.0019179 .0126286

            Instruments for first differences equation
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            L(2/21).(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8 x9) collapsed
            L(3/21).y collapsed
            Instruments for levels equation
            Standard
            _year1 _year2 _year3 _year4 _year5 _year6 _year7 _year8 _year9 _year10
            _year11 _year12 _year13 _year14 _year15 _year16 _year17 _year18 _year19
            _year20 _year21 _year22 _year23 _year24 _year25
            _cons
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            DL.(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8 x9) collapsed
            DL2.y collapsed
            ------------------------------------------------------------------------------
            Arellano-Bond test for AR(1) in first differences: z = -15.98 Pr > z = 0.000
            Arellano-Bond test for AR(2) in first differences: z = -11.22 Pr > z = 0.000
            ------------------------------------------------------------------------------
            Sargan test of overid. restrictions: chi2(212) =2036.97 Prob > chi2 = 0.000
            (Not robust, but not weakened by many instruments.)
            Hansen test of overid. restrictions: chi2(212) = 410.63 Prob > chi2 = 0.000
            (Robust, but weakened by many instruments.)


            The AR(2) test rejects the null hypothesis of no second-order correlation, and the Hansen test of the overidentifying restrictions is rejected as well.


            2. xtdpdgmm L(0/2).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8 x9) _y*, gmm(y, lag(3 .) collapse) gmm(x1 x2 x3 x4 x5 x6 x7 x8 x9, lag(2 .) collapse) iv(_y*) twostep vce(robust)

            Group variable: idDummy Number of obs = 18682
            Time variable: year Number of groups = 2751

            Moment conditions: linear = 219 Obs per group: min = 1
            nonlinear = 0 avg = 6.790985
            total = 219 max = 20

            (Std. Err. adjusted for 2,751 clusters in idDummy)
            ------------------------------------------------------------------------------
            | WC-Robust
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .7921867 .039236 20.19 0.000 .7152856 .8690879
            L2. | .1314568 .0368142 3.57 0.000 .0593024 .2036112
            |
            x1 |
            L1. | -.0079019 .0565286 -0.14 0.889 -.118696 .1028922
            |
            x2 |
            L1. | .0968249 .0173995 5.56 0.000 .0627226 .1309272
            |
            cL.x1#cL.x2 | .0423953 .0448359 0.95 0.344 -.0454814 .130272


            estat serial, ar(1/3)

            Arellano-Bond test for autocorrelation of the first-differenced residuals
            H0: no autocorrelation of order 1: z = -11.3669 Prob > |z| = 0.0000
            H0: no autocorrelation of order 2: z = -8.6879 Prob > |z| = 0.0000
            H0: no autocorrelation of order 3: z = 7.8412 Prob > |z| = 0.0000

            . estat overid

            Sargan-Hansen test of the overidentifying restrictions
            H0: overidentifying restrictions are valid

            2-step moment functions, 2-step weighting matrix chi2(187) = 330.4328
            Prob > chi2 = 0.0000

            2-step moment functions, 3-step weighting matrix chi2(187) = 321.9168
            Prob > chi2 = 0.0000


            3. When I omit x9, which is the firm’s age, I obtain the following:


            3. xtabond2 L(0/3).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8) _y*, gmmstyle(y, lag(4 .) collapse) gmmstyle(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8, lag(2 .) collapse) ivstyle(_y*, equation(level)) nodif twostep

            Dynamic panel-data estimation, two-step system GMM
            ------------------------------------------------------------------------------
            Group variable: idDummy Number of obs = 15839
            Time variable : year Number of groups = 2411
            Number of instruments = 227 Obs per group: min = 1
            Wald chi2(37) = 10639.70 avg = 6.57
            Prob > chi2 = 0.000 max = 19
            ------------------------------------------------------------------------------
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .6727764 .0169075 39.79 0.000 .6396383 .7059144
            L2. | -.4911847 .0380864 -12.90 0.000 -.5658326 -.4165368
            L3. | .687692 .0382205 17.99 0.000 .6127812 .7626028
            |
            x1 |
            L1. | -.0163344 .0102129 -1.60 0.110 -.0363513 .0036826
            |
            x2 |
            L1. | .0207565 .0182278 1.14 0.255 -.0149693 .0564822
            |
            cL.x1#cL.x2 | .004833 .0042823 1.13 0.259 -.0035602 .0132262


            Instruments for first differences equation
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            L(2/21).(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8) collapsed
            L(4/21).y collapsed
            Instruments for levels equation
            Standard
            _year1 _year2 _year3 _year4 _year5 _year6 _year7 _year8 _year9 _year10
            _year11 _year12 _year13 _year14 _year15 _year16 _year17 _year18 _year19
            _year20 _year21 _year22 _year23 _year24 _year25
            _cons
            GMM-type (missing=0, separate instruments for each period unless collapsed)
            DL.(x1 x2 inter_x1_x2 x3 x4 x5 x6 x7 x8) collapsed
            DL3.y collapsed
            ------------------------------------------------------------------------------
            Arellano-Bond test for AR(1) in first differences: z = -16.76 Pr > z = 0.000
            Arellano-Bond test for AR(2) in first differences: z = 6.11 Pr > z = 0.000
            ------------------------------------------------------------------------------
            Sargan test of overid. restrictions: chi2(189) = 916.54 Prob > chi2 = 0.000
            (Not robust, but not weakened by many instruments.)
            Hansen test of overid. restrictions: chi2(189) = 300.72 Prob > chi2 = 0.000
            (Robust, but weakened by many instruments.)


            4. xtdpdgmm L(0/3).y c.L.x1##c.L.x2 L.(x3 x4 x5 x6 x7 x8 ) _y* , gmm(y, lag(4 .) collapse) gmm(x1 x2 x3 x4 x5 x6 x7 x8 , lag(2 .) collapse) iv(_y*) twostep vce(robust)

            Generalized method of moments estimation

            Fitting full model:
            Step 1 f(b) = .54287407
            Step 2 f(b) = .11860463

            Group variable: idDummy Number of obs = 15839
            Time variable: year Number of groups = 2411

            Moment conditions: linear = 197 Obs per group: min = 1
            nonlinear = 0 avg = 6.569473
            total = 197 max = 19

            (Std. Err. adjusted for 2,411 clusters in idDummy)
            ------------------------------------------------------------------------------
            | WC-Robust
            y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            y |
            L1. | .8571209 .051009 16.80 0.000 .757145 .9570967
            L2. | -.4125077 .0882957 -4.67 0.000 -.5855641 -.2394513
            L3. | .4672897 .082242 5.68 0.000 .3060983 .6284811
            |
            x1 |
            L1. | -.0745491 .0457311 -1.63 0.103 -.1641805 .0150823
            |
            x2 |
            L1. | .117317 .0207835 5.64 0.000 .0765821 .158052
            |
            cL.x1#cL.x2 | .0749956 .0477307 1.57 0.116 -.0185548 .168546


            estat serial, ar(1/3)

            Arellano-Bond test for autocorrelation of the first-differenced residuals
            H0: no autocorrelation of order 1: z = -10.3522 Prob > |z| = 0.0000
            H0: no autocorrelation of order 2: z = 1.4967 Prob > |z| = 0.1345
            H0: no autocorrelation of order 3: z = -0.7142 Prob > |z| = 0.4751

            estat overid

            Sargan-Hansen test of the overidentifying restrictions
            H0: overidentifying restrictions are valid

            2-step moment functions, 2-step weighting matrix chi2(166) = 285.9558
            Prob > chi2 = 0.0000

            2-step moment functions, 3-step weighting matrix chi2(166) = 291.3354
            Prob > chi2 = 0.0000

            The Hansen test still rejects the overidentifying restrictions.


            I tried lags from 2 to 10 for the dependent variable, with no success. It would be very helpful if you could guide me on how to identify the conditions that would help me build acceptable models. I am unable to work out what I am missing here.

            Comment


            • You have a large number of observations, which is usually a good thing. However, with so many observations, the specification tests can already detect relatively small deviations from the null hypotheses.

              One kind of model misspecification could be that all of your regressors enter the equation lagged. If there are contemporaneous effects, omitting them could lead to serial correlation in the error term and invalidity of the instruments. Alternatively, adding more lags of the X-regressors (or further interaction terms) might help as well, not just adding more lags of the dependent variable.
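              Purely for illustration (this sketch is not part of the original reply; variable names are taken from the earlier posts, and the lag choices are placeholders that depend on the exogeneity assumptions), allowing contemporaneous effects of the controls could look roughly like this:

              Code:
              * Sketch only: add contemporaneous values of the controls via L(0/1).
              * The gmm() lag ranges are placeholders, not a recommendation.
              xtdpdgmm L(0/2).y c.L.x1##c.L.x2 L(0/1).(x3 x4 x5 x6 x7 x8 x9) _y*, ///
                  gmm(y, lag(3 .) collapse) gmm(x1 x2 x3 x4 x5 x6 x7 x8 x9, lag(2 .) collapse) ///
                  iv(_y*) twostep vce(robust)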

              There is no simple solution that always works. For a general approach to model selection, please have a look at my 2019 London Stata Conference presentation, slides 90 onwards, and the paper by Jan Kiviet referenced therein.
              https://twitter.com/Kripfganz

              Comment


              • Thank you so much for replying. I included all the variables as lagged because I am looking at the effects of variables in one year on firms' decision making in the following year.
                I will look into testing more models and may need your advice further. Thank you again.

                Comment


                • Dear all,

                  Is it possible to use xtdpdgmm if my data consist of a large N (around 2,000) and a large T (around 100)? And is it possible to include teffects or time dummies in the syntax?
                  I tried to run this:
                  Code:
                  xtdpdgmm L(0/1).newinv L.(sz lev q return cash) age, model(fodev) collapse gmm(newinv, lag(1 1) collapse) gmm(cash sz lev q return, lag(2 2) collapse diff m(l)) iv(age) teffects two nl(noserial) vce(r)
                  but the result is as follows:

                  Code:
                  . xtdpdgmm L(0/1).newinv L.(sz lev q return cash) age, model(fodev) collapse gmm(newinv, lag(1 1) collapse) gmm(cash sz lev q return, l
                  > ag(2 2) collapse diff m(l)) iv(age) teffects two nl(noserial) vce(r)
                  
                  Generalized method of moments estimation
                  
                  Fitting full model:
                  
                  Step 1:
                  initial:       f(b) =  .03516697
                  alternative:   f(b) =  12.859461
                  rescale:       f(b) =  .00848645
                  Iteration 0:   f(b) =  .00848645  
                  Iteration 1:   f(b) =  1.285e-08  
                  Iteration 2:   f(b) =  1.284e-08  
                       xtdpdgmm_opt::Xnl():  3900  unable to allocate real <tmp>[114099,113]
                        xtdpdgmm_opt::GS():     -  function returned error
                      xtdpdgmm_opt::Hinv():     -  function returned error
                         xtdpdgmm_opt::V():     -  function returned error
                                xtdpdgmm():     -  function returned error
                                   <istmt>:     -  function returned error
                  r(3900);
                  Please help.

                  Comment


                  • There appears to be a problem with insufficient memory space on your computer. This is not surprising given the dimensions of your data set. The xtdpdgmm command internally creates some matrices that will become very large with that many time periods. It is not optimized for such data sets as it is designed for estimations with relatively few time periods. Adding time effects is not a good idea in your case because this will create a huge number of extra coefficients to be estimated.
                    https://twitter.com/Kripfganz

                    Comment


                    • Thank you for replying, Prof. Kripfganz.

                      Originally posted by Sebastian Kripfganz View Post
                      There appears to be a problem with insufficient memory space on your computer. This is not surprising given the dimensions of your data set. The xtdpdgmm command internally creates some matrices that will become very large with that many time periods. It is not optimized for such data sets as it is designed for estimations with relatively few time periods. Adding time effects is not a good idea in your case because this will create a huge number of extra coefficients to be estimated.
                      In this case, how much memory space would be sufficient?

                      Also, Roodman (2009) suggests in his conclusion to include time dummies. I am afraid that my results will be biased if I do not include them. What do you think?
                      Include time dummies. The autocorrelation test and the robust estimates of the coefficient standard errors assume no correlation across individuals in the idiosyncratic disturbances. Time dummies make this assumption more likely to hold.
                      Another point: Roodman (2009) also mentions the following:
                      Apply the estimators to “small T, large N” panels. If T is large, dynamic panel bias becomes insignificant, and a more straightforward fixed-effects estimator works. Meanwhile, the number of instruments in difference and system GMM tends to explode with T. If N is small, the cluster–robust standard errors and the Arellano–Bond autocorrelation test may be unreliable.
                      In my case, is it okay to estimate with xtreg, since my data have around 100 time periods? Could you give me some suggestions? Thank you so much.

                      Comment


                      • I do not have an answer to the memory question. As you quote from Roodman's paper, T should be relatively small. With small T, adding time dummies is indeed recommended and not an issue because the number of extra coefficients remains small. With such a huge T, the dynamic panel bias arising from the inclusion of the lagged dependent variable is indeed not a concern. You could thus simply use xtreg if all other variables are exogenous! If you need to treat other variables as endogenous, you would still need instrumental variables; you could possibly use xtivreg. You would not need to use a system-GMM estimator; instruments for the first-differenced equation would be sufficient. If you still want to include a large number of time dummies, the community-contributed ivreghdfe command might be helpful, as it can absorb these dummies without actually adding a coefficient for each of them.
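                        For illustration only (this sketch is not part of the original advice; variable names come from the post above, id stands for the panel identifier, and the instruments for L.cash are purely hypothetical), the two routes could look roughly like this:

                        Code:
                        * Sketch 1: large-T fixed-effects regression if all regressors are exogenous
                        * (age is left out here since, if it increases by one each year, it is collinear
                        * with the firm and year effects).
                        xtreg newinv L.newinv L.(sz lev q return cash) i.year, fe vce(cluster id)

                        * Sketch 2: if some regressors are endogenous, instrument them and absorb the
                        * many time dummies, e.g. with the community-contributed ivreghdfe.
                        ivreghdfe newinv L.newinv L.(sz lev q return) (L.cash = L2.cash L3.cash), ///
                            absorb(id year) cluster(id)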
                        Last edited by Sebastian Kripfganz; 04 Jun 2020, 04:49.
                        https://twitter.com/Kripfganz

                        Comment


                        • Thank you, Prof. Kripfganz, for your comments and suggestions. I will try them; I really appreciate your help.

                          Comment


                          • Hi Prof. Kripfganz,

                            I am writing to ask very basic questions that may seem too primitive to you and the audience:

                             1. If I use a dependent variable with 2 lags, for instance, xtdpdgmm L(0/2).y x1 x2 x3, model(diff) gmm(y, lag(a1 .)) gmm(x1 x2 x3, lag(a2 .)), do I need to start a1 at 3 and a2 at 1? My understanding is that if I use a1=1 or 2, then the y variable (because of L(0/2)) in the equation will be endogenous with the controls. Please let me know if I am correct.

                             2. How do I decide whether to use model(diff), model(fod), or model(level)?

                             3. In a study with a large number of firm-year observations, does model(mdev) control for industry effects automatically, or do I need to add industry_effects* to the equation? When I use both the industry effects and model(mdev), very few industry effects appear and most of the industries are omitted.

                             4. I am using a four-way interaction, with one of the variables being a time indicator t (t=0 or t=1), to test difference-in-differences.
                            Can I use xtdpdgmm L(0/2).y c.x1##c.x2##c.x3##c.t, model(diff) gmm(y, lag(3 . )) gmm(c.x1##c.x2##c.x3##c.t, lag(1 . )) teffects instead of
                            xtdpdgmm L(0/2).y c.x1##c.x2##c.x3##c.t, model(diff) gmm(y, lag(3 . )) gmm(c.x1#c.x2#c.x3#c.t c.x1#c.x2#c.x3 c.x1#c.x2#c.t c.x2#c.x3#c.t c.x1#c.x3#c.t c.x1#c.x2 c.x2#c.x3 c.x1#c.x3 c.x1#c.t c.x2#c.t c.x3#c.t x1 x2 x3 , lag(1 . )) teffects ?

                             I look forward to your reply. Thank you so much for your continued support.

                            Nishant

                            Comment


                              1. No, adding more lags of the dependent variable as regressors does not mean that you also need to start with higher lags (a1) for the instruments. The reason for starting with the second lag is that the first lag is correlated with the first-differenced error term. The second lag is uncorrelated with the first-differenced error term if the errors are serially uncorrelated. This does not depend on the number of lags of the dependent variable used as regressors. In fact, the more lags of the dependent variable you use as regressors, the more likely it is that the errors are indeed serially uncorrelated.
                              2. model(fod) has the advantage that the transformed errors are still serially uncorrelated if the untransformed errors were serially uncorrelated, while model(diff) produces first-order serial correlation in the transformed error term. As long as you make sure that your instruments are uncorrelated with the transformed error term, it should not really matter which of the two model transformations you use. However, there is one additional benefit of model(fod): if your panel data set is unbalanced with gaps, model(diff) loses more observations than model(fod) does. (A short illustrative sketch follows after this list.) Regarding model(level), this model still contains the unobserved time-invariant "fixed effects" (which are removed by the other model transformations), so you need to take extra care to ensure that your instruments are uncorrelated with them. This can often be hard to justify. Please see my 2019 London Stata Conference presentation and the references therein for details: Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.
                              3. If all firms stay in the same industry throughout the entire sample, i.e. if the industry classification remains constant over time, then model(diff), model(fod), and model(mdev) all account for these effects. In fact, they account for all time-invariant effects by removing them from the transformed model. If you still obtain estimates for some industry effects, this would mean that there must be some variation over time in the industry classification or that you have combined the model(mdev) instruments with further instruments for model(level).
                              4. It looks like the two specifications should be the same, shouldn't they?
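                              As a purely illustrative sketch of the two transformations discussed in point 2 above (generic variable names; the lag choices are placeholders that depend on the exogeneity assumptions and are not part of the original reply), otherwise identical specifications would differ only in the model() option:

                              Code:
                              * Sketch only: the same specification under the first-difference transformation
                              * and under forward-orthogonal deviations (model(fodev), referred to as
                              * model(fod) above). Lag choices are placeholders.
                              xtdpdgmm L(0/2).y x1 x2 x3, model(diff) gmm(y, lag(2 .) collapse) ///
                                  gmm(x1 x2 x3, lag(1 .) collapse) teffects twostep vce(robust)
                              xtdpdgmm L(0/2).y x1 x2 x3, model(fodev) gmm(y, lag(2 .) collapse) ///
                                  gmm(x1 x2 x3, lag(1 .) collapse) teffects twostep vce(robust)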
                              https://twitter.com/Kripfganz

                              Comment


                               • Thank you so much for the explanation; it is indeed very helpful. Regarding #4 above, my apologies: c.x1##c.x2##c.x3##c.t gives the same results as the separate interaction terms. I realize that I was missing an interaction term, which is why the results were different.

                                Comment
