
  • GMM: Forward and backward orthogonal deviations (xtabond2)

    Dear Statalisters,

This is the first time I am using GMM and xtabond2.
While experimenting with the syntax after reading Roodman (2009, "How to do xtabond2"), I noticed that orthogonal can be specified both as an option and as a suboption within the gmmstyle() option.

Imagine, for instance, a two-step difference GMM with Windmeijer's finite-sample correction for the two-step covariance matrix.
I have read in Roodman (2009) that one usually uses the forward orthogonal deviations transformation instead of first differencing when the panel is unbalanced.
Therefore, assuming unbalanced data in a simple model with a predetermined lagged dependent variable (l.y), two endogenous explanatory variables, x1 and x2, and exogenous time dummies, the code would be:
    Code:
xtabond2 y l.y x1 x2 year1-year10, gmm(l.y, collapse lag(1 .)) gmm(x1 x2, collapse lag(2 .)) iv(year1-year10) noleveleq twostep robust orthogonal
While in the previous example orthogonal specifies forward orthogonal deviations for the regressors, the following code additionally specifies backward orthogonal deviations for the instruments.
    Code:
xtabond2 y l.y x1 x2 year1-year10, gmm(l.y, collapse orthogonal lag(1 .)) gmm(x1 x2, collapse orthogonal lag(2 .)) iv(year1-year10) noleveleq twostep robust orthogonal
According to Hayakawa (2009), the combination of backward orthogonal deviations for the instruments and forward orthogonal deviations for the regressors is less biased and more stable than traditional difference GMM for a standard AR(1) model when T >= 10.
However, what exactly does the orthogonal suboption in the gmmstyle() option do?
Applying this to my data, I get very different results with the suboption compared to without it. In particular, with the orthogonal suboption in my gmmstyle() options the Arellano-Bond test for AR(2) in first differences is NOT rejected, whereas it is rejected without the suboption.

    Thanks for any comments
    Best,
    Guest

  • #2
    Backward-orthogonal transformations of the instruments follow a very similar logic to forward-orthogonal deviations of the regression equation. With the former, the average of all previous observations is subtracted from the instruments. With the latter, the average of all subsequent observations is subtracted from the regression equation. (In both cases, the transformed variables are scaled appropriately to leave the variance unaltered, presuming homoskedasticity over time.)
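For reference, the two transformations can be written out as follows (a sketch in standard notation with my own indexing convention, not a quote from the xtabond2 documentation; T_i denotes the last observation of panel unit i):
Code:
% forward orthogonal deviations: defined for t = 1, ..., T_i - 1 (last observation lost)
\[
x_{it}^{f} \;=\; \sqrt{\frac{T_i - t}{T_i - t + 1}} \left( x_{it} \;-\; \frac{1}{T_i - t} \sum_{s=t+1}^{T_i} x_{is} \right)
\]
% backward orthogonal deviations: defined for t = 2, ..., T_i (first observation lost)
\[
x_{it}^{b} \;=\; \sqrt{\frac{t-1}{t}} \left( x_{it} \;-\; \frac{1}{t-1} \sum_{s=1}^{t-1} x_{is} \right)
\]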

I personally recommend against using backward-orthogonal deviations. Nothing is gained in terms of validity of the instruments. But due to the subtraction of the backward mean, you effectively lose one more observation, because the mean of the previous observations does not exist for the first observation. In essence, you are throwing away valuable information, which is particularly costly if T is small. In several simulations that I ran, I never found a case where using backward-orthogonal deviations was really beneficial. In some cases, you might even completely lose identification of your coefficients, leading to very unreasonable estimates.

Another annoying bit about backward-orthogonal deviations is that, unlike with differencing, it is not innocuous whether you specify gmm(l.y, collapse orthogonal lag(1 .)) or gmm(y, collapse orthogonal lag(2 .)), because the lag of a backward-orthogonally transformed variable is not the same as the backward-orthogonal deviation of the lagged variable, as the sketch below illustrates.
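Concretely, under the indexing convention in the sketch above (again my own illustration, not xtabond2's internal code), the two orderings differ in both the scaling factor and the averaging window:
Code:
% lag of the backward-orthogonal deviation of x, evaluated at time t
\[
L\!\left(x^{b}\right)_{it} \;=\; \sqrt{\frac{t-2}{t-1}} \left( x_{i,t-1} \;-\; \frac{1}{t-2} \sum_{s=1}^{t-2} x_{is} \right)
\]
% backward-orthogonal deviation of the lagged variable, evaluated at time t
\[
\left(Lx\right)^{b}_{it} \;=\; \sqrt{\frac{t-1}{t}} \left( x_{i,t-1} \;-\; \frac{1}{t-1} \sum_{s=1}^{t-1} x_{i,s-1} \right)
\]
First differencing, by contrast, commutes with the lag operator: L(Δx)_{it} = Δx_{i,t-1} = Δ(Lx)_{it}.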

    Btw: Make sure that you have updated xtabond2 to the latest version because there was a bug with forward-orthogonal deviations in a previous version.
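For example, either of the standard package-update commands will fetch the current SSC version (ordinary Stata commands, nothing specific to this thread):
Code:
ssc install xtabond2, replace    // reinstall the latest version from SSC
adoupdate xtabond2, update       // or: check the installed version and update it if outdated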
    https://www.kripfganz.de/stata/



    • #3
      Thank you very much for your quick response @Sebastian Kripfganz!

      Related to that, I have two additional questions:
1. As already noted in the post above, my Arellano-Bond test for AR(2) in first differences fails to reject almost only when I specify the orthogonal suboption in the gmmstyle() option. Am I artificially making my specification valid by using the orthogonal suboption, or is this an acceptable specification?
2. If I have two gmmstyle() options (i.e., one for the predetermined lagged DV and one for the endogenous variables) and I want to specify a backward orthogonal transformation, do I have to do it in both gmmstyle() options, or am I free to use backward orthogonal transformations for the endogenous variables only? (This is, indeed, a more general concern I have with other suboptions, such as collapse.)
      To clarify: Is
      Code:
xtabond2 y l.y x1 x2 year1-year10, gmm(l.y, collapse orthogonal lag(1 .)) gmm(x1 x2, collapse orthogonal lag(2 .)) ...
      equally valid as
      Code:
xtabond2 y l.y x1 x2 year1-year10, gmm(l.y, collapse lag(1 .)) gmm(x1 x2, collapse orthogonal lag(2 .)) ...
      ?


      Best,
      Guest



      • #4
        1. Whether you specify the orthogonal suboption or not does not alter the underlying model assumptions. The AR test therefore is essentially inconclusive and indicates a lack of robustness.
        2. From a theoretical point of view, you do not have to specify the same suboptions for different gmm() options. However, you might find it hard to justify using different approaches within the same estimation. After all, you want to avoid giving the impression that you have played around with the specification until you got the desired results.
        https://www.kripfganz.de/stata/



        • #5
          Thank you again, your comments help me a lot!

1. I see your point. However, is there any general advice for improving the robustness of the model in this case?
          2. Thanks for your advice, that was exactly my thought, too.

May I bother you with one last question?
          Reading some applications of difference GMM today, I wondered whether one should specify a GMM model with lagged explanatory variables.
          For instance:
          Code:
xtabond2 y l.y lagged_x1 lagged_x2 year1-year10, gmm(l.y, collapse lag(1 .)) gmm(lagged_x1 lagged_x2, collapse lag(2 .)) ...
While some authors argue that this excludes simultaneity, I think this is a rather poor argument, as GMM already accounts for these concerns, right?
A response of yours elsewhere in the forum (https://www.statalist.org/forums/for...bond2-xtdpdgmm) seems to make the same argument.

In my paper I plan to first run a fixed-effects model. In this case it does make sense from a theoretical perspective to lag all right-hand-side variables, as I expect the effects on y to be delayed.
(Because of the Nickell bias that arises from combining fixed effects with a lagged DV, the FE model is reported solely for the sake of completeness; I then plan to introduce the GMM model.)
My question is: Would you agree that it makes sense to run an FE model using lagged right-hand-side variables but use non-lagged right-hand-side variables in the GMM model?



          • #6
Regarding robustness: In this particular case, I would refer to my initial comment that backward-orthogonal deviations may lead to quite erratic estimates and are therefore better avoided, in particular when the results differ a lot from alternative specifications.

            Lagging regressors because of endogeneity concerns is usually not recommended, in particular when you can use lags as instruments instead in a GMM procedure, which takes care of the endogeneity. You would essentially be assuming that there is only a delayed effect of those variables.

            I do not see why using lagged regressors would be justified when estimating the model with the FE estimator. The assumed relationship between the variables ("the model") is still the same. Also, the FE estimator builds on the assumption that all independent variables are strictly exogenous with regard to the idiosyncratic error component. Lagging them is not of any help here.
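To spell out the last point (my formalization, not part of the original reply): strict exogeneity with respect to the idiosyncratic error u_it conditions on the regressors at all leads and lags,
Code:
\[
E\!\left( u_{it} \,\middle|\, x_{i1}, \dots, x_{iT_i}, \alpha_i \right) \;=\; 0 \qquad \text{for all } t,
\]
so replacing x_it with x_{i,t-1} imposes the very same condition; feedback from u to any lead or lag of x violates it either way.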
            https://www.kripfganz.de/stata/

