
  • #46
    Thanks Prof. Sebastian for this update. I have the following two queries:

    1. I have a dataset spanning 2001-2016 for 10 variables (1 dependent and 9 independent). In the first case, I use xtdpdgmm to regress the dependent variable on the independent variables with time dummies (i.e. the teffects option), and the output shows coefficients for the years 2003-16 (along with the intercept). This means that the coefficients for two years (2001 and 2002) are not reported. In the second case, when I include an independent variable whose values are available only for the period 2007-2016, the output shows coefficients for 2008-16 (along with the intercept). This means that the coefficient for only one year (2007) is not reported. Why is there a difference in the number of unreported time coefficients in the two cases?

    2. I need to multiply (or interact) a variable called "FC Score" with 14 other variables (which are proxies of ownership structure) in 14 different regressions, one by one. Since mathematical operations are not allowed in the xtdpdgmm command, I would like to know if there's any other option than to construct these interaction variables separately in the dataset and then import the same for regressions using xtdpdgmm? I ask this because I have to create many more such interaction variables.

    Thanks and Regards
    Prateek


    • #47
      1. In the full sample, the year 2001 is excluded from the estimation sample because the lagged dependent variable is not observed (i.e. the dependent variable is not observed for 2000). The dummy for the year 2002 is additionally excluded to avoid the dummy trap. In the reduced sample with the additional independent variable, the lagged dependent variable for 2007 is still observed (i.e. the dependent variable is observed for 2006) and thus 2007 is kept in the estimation sample. Still, one dummy needs to be excluded to avoid the dummy trap.

      2. As of today, there is unfortunately no easy way around this problem because xtdpdgmm does not yet allow factor variables.
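      The manual construction can at least be scripted rather than done by hand. A minimal sketch, assuming the score is stored as fc_score and the 14 ownership proxies as own1 to own14 (hypothetical names, as are y and the illustrative instrument choices, which treat the regressors as predetermined):
      Code:
      * create the product of fc_score with each of the 14 ownership proxies
      forvalues j = 1/14 {
          generate double fc_own`j' = fc_score * own`j'
      }
      * then run one regression per interaction term, here for the first proxy
      xtdpdgmm L(0/1).y fc_score own1 fc_own1, model(diff) collapse gmm(y, lag(2 4)) gmm(fc_score own1 fc_own1, lag(1 3)) teffects twostep vce(robust)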
      https://www.kripfganz.de/stata/


      • #48
        Thank you so much for your crystal clear responses. Below are some further queries.

        1. My dataset has firms as cross-sectional units. I noticed that xtdpdgmm provides coefficients for time-invariant variables as well if they are introduced in the model. I wonder how it is possible (or even appropriate) to obtain these coefficients because, when using the fodev option, these variables should get wiped out from the model. In fact, this is the reason why it is claimed that firm fixed effects have been controlled for in the model. How valid or acceptable is it to include time-invariant variables in our dynamic panel models? Further, even if we don't include any time-invariant variable in our model, is it correct to claim that the effect of such variables has already been controlled for in dynamic panel estimation?

        2. How do we manually calculate the number of instruments which are shown on the top left side of the output?

        3. Could you please throw some light on the meaning of m-statistic in the context of panel data analysis?

        Thanks and Regards
        Prateek


        • #49
          1. If you specify the estimator with instruments for the diff or fodev model only, then you should not obtain coefficient estimates for the time-invariant variables. If you also specify instruments for the level model, i.e. you are using a system GMM estimator, then the command would also compute coefficient estimates for the time-invariant variables. Those variables are not wiped out automatically, only indirectly if all instruments are orthogonal to them (which holds for model(diff), model(fodev), and model(mdev) instruments). That does not imply that those coefficients are meaningfully estimated when you include differenced variables as instruments for the level model. These instruments are typically very weak and assumed to be (asymptotically) uncorrelated with any time-invariant variable. To meaningfully estimate such coefficients, you would need to specify level instruments that are sufficiently strongly correlated with the time-invariant regressors. This could be the time-invariant regressors themselves, effectively assuming that they are uncorrelated with the unobserved effects (which then are strictly speaking no longer "fixed effects"). If you are not interested in those time-invariant variables, then you can safely argue that they are accounted for by the unobserved time-invariant effects. For further discussion, see Kripfganz and Schwarz (2019). Estimation of linear dynamic panel data models with time-invariant regressors. Journal of Applied Econometrics 34 (4), 526-546.

          2. That can become difficult, in particular if there are many instruments. Some of those specified instruments might be dropped internally due to perfect collinearity among the instruments. This is reflected in the shown instrument count. After the estimation, the new version of the command displays the utilized instruments below the regression table which allows you to manually count them and to check which instruments are actually included.

          3. What is the m-statistic?

          • #50
            Thanks again Sir for such detailed responses!! Here are some further questions:

            1. This is a bit too technical for me as of now. I shall read and learn more on this. Anyway, for now, I hope it is safe to argue that time-invariant variables (none of them is my variable of interest) have been taken care of in case of system GMM. To be more specific, I use model(fodev) option for gmmiv() and model(level) for iv().

            2. I would like to know your opinion on reporting system GMM results when the p-value of the Sargan-Hansen statistic exceeds levels such as 0.6/0.7/0.8. Should these results be reported?

            3. Regarding the m-statistic, it is a statistic used to determine serial correlation in the dependent variable in the case of dynamic panel models (analogous to the Arellano-Bond test). It is reported in EViews.

            4. The data for my variable of interest (which is an exogenous variable) span 2007-2016, whereas the data for all other variables (including the dependent variable) run from 2001 to 2016. I run system GMM using xtdpdgmm and specify the lags for the endogenous variables in gmmiv() as (1 14), i.e. I use the maximum lag length possible considering the period 2001-2016. Is there any issue with this practice, considering that my variable of interest only runs from 2007 to 2016? Should I restrict the lag length for the endogenous variables to 10, keeping in view the period 2007-2016?

            Thanks and Regards
            Prateek


            • #51
              1. If you do not specify them explicitly as regressors, all time-invariant variables are part of the error term. This error term is not explicitly modelled. They are taken care of by choosing the instruments in such a way that they are uncorrelated with any time-invariant variables. For the instruments specified with the model(fodev) option, this is automatically the case and you do not need to worry about them. For the instruments specified with model(level), it is only an assumption that these instruments are uncorrelated with everything that is in the error term. You need to justify that these variables are indeed uncorrelated. (In that sense, GMM does not necessarily take care of these omitted time-invariant variables.)

              2. You should watch out for a potential problem of too many instruments. If you feel confident that such a problem does not exist and there is no obvious model misspecification, then I do not see a reason for not trusting these p-values.

              3. I do not know about EViews but I believe that the m-statistic actually is the Arellano-Bond statistic.

              4. There should not be a problem with using these additional lags other than that such deep lags might become weak instruments (which is independent from your question about the data availability).
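              A minimal sketch of how the instrument count could be kept in check and the overidentifying restrictions tested afterwards. The variable names y, x1 (endogenous), and x2 (the exogenous variable of interest) are placeholders, and the lag limit of 4 is an illustrative assumption rather than a recommendation:
              Code:
              * collapse the GMM-type instruments and cap the lag depth at 4 instead of 14
              xtdpdgmm L(0/1).y x1 x2, model(fodev) collapse gmm(y, lag(1 4)) gmm(x1, lag(1 4)) iv(x2, model(level)) twostep vce(robust) teffects
              * Sargan-Hansen test of the overidentifying restrictions
              estat overid
              The list of utilized instruments printed below the regression table shows how many instruments remain after these restrictions.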

              • #52
                Thanks a lot Prof. Sebastian for these brilliant answers and such superlative guidance and support!! Really appreciate!!


                • #53
                  Another update is available on my website:
                  Code:
                  net install xtdpdgmm, from(http://www.kripfganz.de/stata/) replace
                  The new version 2.2.0 now allows factor variables and interaction terms in the list of regressors and instruments. This is still a "beta feature". If you experience anything odd, please let me know.
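                  A minimal sketch of what this could look like, assuming hypothetical variables y, fc_score, and own1 and an illustrative instrument set; the factor-variable notation takes the place of a manually generated product term:
                  Code:
                  * c.fc_score##c.own1 expands to fc_score, own1, and their product
                  xtdpdgmm L(0/1).y c.fc_score##c.own1, model(diff) collapse gmm(y, lag(2 4)) gmm(c.fc_score##c.own1, lag(1 3)) teffects twostep vce(robust)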

                  • #54
                    Hi Prof. Sebastian. At the outset, thanks a lot for this much-awaited update. I used it to run a few regressions and compared the output with that obtained by manual inclusion of interaction terms. The results were the same. So this new feature seems to work fine. Thanks once again!!


                    • #55
                      Another update with a small bug fix is available now. Thanks to Kit Baum, this latest version 2.2.1 can now be installed from SSC as well, in case you cannot access my personal website due to network restrictions in your location:
                      Code:
                      ssc install xtdpdgmm, replace
                      (The SSC version is less frequently updated than the version on my own website.)

                      • #56
                        Dear All,

                        I tried to use the xtdpdgmm routine. I estimate two models, with and without nonlinear moment conditions. First I run:

                        Code:
                        xtdpdgmm wlnyw $regressors, gmm(wlnyw pc pc2 lnsnda, l(2 4) m(d)) gmm(wlnyw n g_ef nda, l(2 2) m(l)) two vce(r) iv(i.time lnwi)
                        
                        
                        . xtdpdgmm wlnyw $regressors, gmm(wlnyw pc pc2 lnsnda, l(2 4) m(d)) gmm(wlnyw n g_ef nda, l(2 2) m(l)) two vc
                        > e(r) iv(i.time lnwi)
                        
                        Generalized method of moments estimation
                        
                        Fitting full model:
                        Step 1         f(b) =  .02272711
                        Step 2         f(b) =  .65419341
                        
                        Group variable: id                           Number of obs         =       970
                        Time variable: time                          Number of groups      =       152
                        
                        Moment conditions:     linear =     109      Obs per group:    min =         1
                                            nonlinear =       0                        avg =  6.381579
                                                total =     109                        max =         8
                        
                                                           (Std. Err. adjusted for 152 clusters in id)
                        ------------------------------------------------------------------------------
                                     |              WC-Robust
                               wlnyw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                               wlnyw |
                                 L1. |   .9283051   .0099569    93.23   0.000     .9087898    .9478204
                                     |
                                  pc |  -.0687962   .0267801    -2.57   0.010    -.1212843   -.0163081
                                 pc2 |   .0073601   .0027237     2.70   0.007     .0020218    .0126985
                                lnwi |   .1815688   .0258987     7.01   0.000     .1308083    .2323293
                               lnnda |   -.140108   .0516486    -2.71   0.007    -.2413374   -.0388787
                                     |
                                time |
                                  2  |          0  (empty)
                                  3  |    .490499   .1412412     3.47   0.001     .2136713    .7673267
                                  4  |   .4182786   .1374508     3.04   0.002       .14888    .6876771
                                  5  |   .4505429   .1349003     3.34   0.001     .1861432    .7149425
                                  6  |   .5039114   .1445831     3.49   0.000     .2205336    .7872891
                                  7  |   .5227873   .1408505     3.71   0.000     .2467253    .7988493
                                  8  |   .5295376   .1369318     3.87   0.000     .2611562    .7979191
                                  9  |   .5189791   .1388805     3.74   0.000     .2467783    .7911799
                                 10  |   .4920312   .1372646     3.58   0.000     .2229975    .7610649
                                     |
                               _cons |          0  (omitted)
                        ------------------------------------------------------------------------------
                        Instruments corresponding to the linear moment conditions:
                         1, model(diff):
                           4:L2.wlnyw 5:L2.wlnyw 6:L2.wlnyw 7:L2.wlnyw 8:L2.wlnyw 9:L2.wlnyw
                           10:L2.wlnyw 4:L3.wlnyw 5:L3.wlnyw 6:L3.wlnyw 7:L3.wlnyw 8:L3.wlnyw
                           9:L3.wlnyw 4:L4.wlnyw 5:L4.wlnyw 6:L4.wlnyw 7:L4.wlnyw 8:L4.wlnyw
                           10:L4.wlnyw 11:L4.wlnyw 3:L2.pc 4:L2.pc 5:L2.pc 6:L2.pc 7:L2.pc 10:L2.pc
                           11:L2.pc 3:L3.pc 4:L3.pc 5:L3.pc 6:L3.pc 10:L3.pc 11:L3.pc 3:L4.pc 4:L4.pc
                           5:L4.pc 7:L4.pc 8:L4.pc 9:L4.pc 10:L4.pc 11:L4.pc 3:L2.pc2 4:L2.pc2
                           7:L2.pc2 8:L2.pc2 9:L2.pc2 10:L2.pc2 11:L2.pc2 3:L3.pc2 7:L3.pc2 8:L3.pc2
                           9:L3.pc2 10:L3.pc2 11:L3.pc2 4:L4.pc2 5:L4.pc2 6:L4.pc2 7:L4.pc2 8:L4.pc2
                           9:L4.pc2 10:L4.pc2 4:L2.lnsnda 5:L2.lnsnda 6:L2.lnsnda 7:L2.lnsnda
                           8:L2.lnsnda 9:L2.lnsnda 4:L3.lnsnda 5:L3.lnsnda 6:L3.lnsnda 7:L3.lnsnda
                           8:L3.lnsnda
                         2, model(level):
                           4:L2.wlnyw 5:L2.wlnyw 6:L2.wlnyw 7:L2.wlnyw 8:L2.wlnyw 9:L2.wlnyw
                           10:L2.wlnyw 3:L2.n 4:L2.n 5:L2.n 6:L2.n 7:L2.n 8:L2.n 9:L2.n 11:L2.n
                           3:L2.g_ef 4:L2.g_ef 5:L2.g_ef 6:L2.g_ef 7:L2.g_ef 8:L2.g_ef 10:L2.g_ef
                           11:L2.g_ef 3:L2.nda 4:L2.nda 5:L2.nda 6:L2.nda 7:L2.nda
                         3, model(level):
                           3bn.time 4.time 5.time 6.time 7.time 8.time 9.time lnwi
                         4, model(level):
                           _cons
                        Then I try:

                        Code:
                        xtdpdgmm wlnyw $regressors, gmm(wlnyw pc pc2 lnsnda, l(2 4) m(d) c) gmm(wlnyw n g_ef nda, l(2 2) m(l) c) two vce(r) iv(i.time lnwi) nl(iid)
                        In this case I get a never-ending sequence of iterations without achieving convergence. Is there any problem with my command line? Should I interpret the lack of convergence as evidence that the nonlinear moment conditions are not necessary?

                        Thanks in advance.

                        Dario


                        • #57
                          The convergence problem probably arises because of the collinear dummy variables. I recommend using the teffects option instead of specifying the time dummies manually.

                          A few more comments on your specification:
                          • When using the system GMM estimator, the instruments for the level model are usually first differenced, i.e. specify gmm(wlnyw n g_ef nda, l(2 2) diff m(l)). This is not done by default!
                          • The additional nonlinear moment conditions are usually not very informative any more after adding the level moment conditions. They are primarily used when all (or most) other moment conditions refer to the first-differenced (or forward-orthogonally deviated) model.
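                          A sketch of the command from #56 with these suggestions applied. It assumes that the global macro $regressors is redefined without i.time (its contents are not shown in the thread) and that lnwi remains a standard instrument for the level model; nl(iid) is left out in line with the second comment:
                          Code:
                          * teffects replaces the manual time dummies; diff first-differences the level-model instruments
                          xtdpdgmm wlnyw $regressors, gmm(wlnyw pc pc2 lnsnda, l(2 4) m(d) c) gmm(wlnyw n g_ef nda, l(2 2) diff m(l) c) iv(lnwi) teffects two vce(r)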

                          • #58
                            Sebastian Kripfganz Thanks a lot for your clarifications.


                            • #59
                              Hi,

                              I am also applying xtdpdgmm as a substitute (and hopefully an improvement) for xtabond2.
                              The results for some coefficients are quite unstable, so I would like to be as precise as possible in my application.

                              I am especially interested in forward orthogonal deviations (instead of differencing). Did I understand correctly, Sebastian, that forward deviations are incorrectly implemented in xtabond2 no matter how we specify the model? Or is it simply the lag structure which might cause mistakes?
                              I would like to simply apply SysGMM in the sense of Blundell and Bond (2002) but with forward deviations instead of first differences. x4 and x5 are strictly exogenous.
                              So can you tell me what is "wrong" with this command (yes, I do not want to include l.y as a regressor but I could):
                              Code:
                              xtabond2 y x1 x2 x3 x4 x5 i.time, gmm(l.x1 l.x2 l.x3, lag(1 3)) iv(x4 x5 i.time, eq(level)) iv(x4 x5 i.time, eq(diff)) twostep robust ortho
                              And should I specify two gmm options with eq(diff) and eq(level) instead of the default (both)? I remember you pointing out in other topics that this is very important for the iv() option.

                              However, I think that specifying the model with xtdpdgmm should look somehow like this (without Ahn and Schmidt (1995) Moment Conditions):
                              Code:
                              xtdpdgmm y x1 x2 x3 x4 x5, gmmiv(x1 x2 x3, model(level) diff lag(1 3)) gmmiv(x1 x2 x3, model(fodev) lag(1 3)) iv(x4 x5, model(fodev)) iv(x4 x5, model(diff)) twostep vce(robust) teffects
                              However, is there a way to use forward-demeaned instruments in the level equation? In my example I simply applied first differences instead.

                              Looking at the instruments used for the level and demeaned equations, this seems quite OK. In the xtabond2 command I applied lags in the gmm() option because, as you said, the lag structure of the two commands is different.

                              Concerning the new moment conditions in xtdpdgmm, can we use the Arellano/Bond AR test for autocorrelation in the idiosyncratic errors (implemented in xtabond2) to check whether we can apply the nl(noserial) option in your command? I think this test should suit the question.


                              • #60
                                There are situations in which xtabond2 gets the forward-orthogonal deviations right, for example when you use the default lag orders and do not specify any standard iv() instruments. When you specify the lag structure yourself, you need to be very careful as explained earlier. When you combine orthogonal deviations with any standard iv() instruments in your model, irrespective of whether they relate to the transformed or the level model, xtabond2 appears to always produce incorrect results.

                                In your case, the options iv(x4 x5 i.time, eq(level)) iv(x4 x5 i.time, eq(diff)) together with ortho would trigger the bug in xtabond2.

                                Specifying separate gmm() options in xtabond2 is not necessarily needed if you are sure to understand what xtabond2 is doing. In my view, specifying them separately reduces the risk of doing something unintended.

                                With xtdpdgmm, I do not think that specifying iv(x4 x5, model(fodev)) iv(x4 x5, model(diff)) jointly has any benefit. You usually would either use the forward-orthogonal deviations or first differencing. If you meant to specify the instruments for model(level) instead of model(diff), keep in mind that this also requires the assumption that x4 x5 are uncorrelated with the unobserved "fixed effects".

                                Why do you want to use forward-demeaned instruments? That does not sound like a good idea to me. If anything, you might want to backward-demean the instruments. That's not currently possible for the level model unless you create those instruments manually before using the xtdpdgmm command. If you are sure that this is a good thing to do, you could use my tstransform command for that purpose:
                                Code:
                                net install tstransform, from("http://www.kripfganz.de/stata/")
                                
                                help tstransform
                                Your specified lag structure makes sense in both commands under the assumption that x1 x2 x3 are endogenous. With the standard Blundell/Bond system-GMM procedure, for the level model it is usually just the first lag that is used (although it is not necessarily wrong to use lags 1 to 3).

                                The Arellano/Bond AR-tests are implemented in the xtdpdgmm package as well. See the postestimation command estat serial. These should be satisfied indeed when using the nl(noserial) option.
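                                Putting these remarks together, a sketch of one possible xtdpdgmm specification for the model in #59. It assumes that x1 x2 x3 are endogenous and x4 x5 strictly exogenous, uses forward-orthogonal deviations as the only transformation, and keeps just the first lag of the differenced instruments for the level model; it is an illustration, not a recommendation for any particular data set:
                                Code:
                                * model(fodev) instruments for the transformed model, first-differenced instruments for the level model
                                xtdpdgmm y x1 x2 x3 x4 x5, gmmiv(x1 x2 x3, model(fodev) lag(1 3)) gmmiv(x1 x2 x3, model(level) diff lag(1 1)) iv(x4 x5, model(fodev)) twostep vce(robust) teffects
                                * Arellano-Bond tests for serial correlation of orders 1 and 2
                                estat serial, ar(1/2)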