  • Dear Prof. Sebastian Kripfganz

    In the regression where I have the simultaneity issue (i.e., before addressing the endogeneity), can we rely on the R2 from this regression? Or would the simultaneity issue and the associated endogeneity distort the value of the R2?

    Could you please guide me on this?


    • I am not sure why you would even care about this R2. It gives you the fraction of the variance of the dependent variable that is explained by the variance of the linear prediction from the explanatory variables. This linear projection does not have a causal interpretation; thus, the notion of endogeneity is not meaningful in this context.
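
      For reference, the R2 described here can be written as below. This is a sketch assuming OLS with an intercept; after FE estimation the same expression applies to the within-transformed variables (the within R2). The symbols are notation introduced for this illustration.

      ```latex
      \[
        R^2 = \frac{\widehat{\mathrm{Var}}(\hat{y}_{it})}{\widehat{\mathrm{Var}}(y_{it})},
        \qquad \hat{y}_{it} = x_{it}'\hat{\beta}
      \]
      ```

      Here \hat{\beta} is the estimated coefficient of the linear projection of y on x, which is well defined whether or not it coincides with a causal effect.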
      https://www.kripfganz.de/stata/


      • Thank you very much Prof. Sebastian Kripfganz

        What we do in the first step, before using GMM, is estimate our models with FE and show different specifications. We also report the R² and adjusted R². As I understand from your answer, I can rely on the R² values to compare my specifications, since R² is not affected by the endogeneity problem caused by the simultaneity issue. Is that correct?


        • Dear Sebastian
          In your post 367
          https://www.statalist.org/forums/for...40#post1652040
          you suggest an incremental Hansen test for the validity of the moment conditions for the lagged dependent variable to check for cross-sectional dependence.
          I wonder whether a standard CD test on the residuals may suffice.
          Thanks for your reply.


          • The CD test is not designed for use after GMM estimation. I do not have enough experience with this test to provide a more solid judgement.
            https://www.kripfganz.de/stata/


            • Thank you Sebastian
              you are right and maybe in this setting we should go for the Sargan's difference test as suggested by Sarafidis, Yamagata and Robertson


              https://www.sciencedirect.com/scienc...608001826#sec3

              thanks again


              • Dear Sebastian,

                I have a question regarding your quote below:

                Originally posted by Sebastian Kripfganz

                Please have a look at my 2019 London Stata Conference presentation. Slide 67 tells you the admissible lags under forward-orthogonal deviations. For strictly exogenous regressors (which I implicitly assumed here for w and k), any lag can be used. For predetermined regressors (L.n), lag 0 is the first admissible lag. In my specification, I assumed indeed that there are no endogenous variables (with respect to the idiosyncratic error term) in the model.

                For the level model, slide 31 tells you the additionally available instruments. For strictly exogenous and predetermined regressors, lag 0 of the first-differenced instruments is usually used (because it has the strongest correlation with the instrumented variables compared to any other lag). It is common practice not to use multiple lags as instruments for the level model (based on the idea that further lags would be redundant if all available lags were used for the transformed model).
                I did not understand how lag 0 is the first admissible lag if L.n is a regressor in the estimation equation. I will give you an example.

                Assume that you would like to run a model with reverse causality:
                1. Yit = Yit-1 + Yit-2 + Xit-1 + Xit-2 + εit
                2. Xit = Xit-1 + Xit-2 + Yit-1 + Yit-2 + εit
                But when differenced in FOD estimation, Xi,t-1 appears directly in (Xi,t-1 - Xi,t-2) in equation 1. I do not understand how I can use Xi,t-1 as an instrument, not because of correlation with the error terms, but because it is a regressor in the differenced equation. When I define Xit-1 as a predetermined regressor, lag 0 (Xit-1) is the first admissible lag. If I define it as endogenous, then lag 1 (Xit-2) can be used, but it is also included as a regressor. Is this something that I should be concerned about? Should I start from lag 3 and use gmm(Xi,t-1, lag(3 .))?

                Or should I just think that the sequential exogeneity condition (allowing εit to correlate with future x values) accounts for reverse causality regardless of how I specify the lag structure in the model? In that case, can I choose the regressors as theory suggests and classify the instruments based on coherence tests?

                Thank you for your reply.

                Best regards,
                Nursena


                • I think there are at least two things that are potentially confusing here.

                  First, in my quoted post, lag 0 of L.n refers to L.n itself, which is lag 1 of n.

                  Second, the FOD transformation and the first-difference transformation are not the same. (Xi,t-1 - Xi,t-2) is the first-difference transformation of Xi,t-1. The FOD transformation would subtract the average of Xi,t-1 Xit Xi,t+1 ... Xi,T from Xi,t-1. This average does not contain Xi,t-2.
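
                  For reference, the two transformations can be sketched as follows for a balanced panel, applied to a generic regressor w_it (here w_it = X_{i,t-1}). The symbols w_it and c_t are notation introduced for this illustration, and the scaling factor is the usual one for forward-orthogonal deviations.

                  ```latex
                  \begin{align*}
                    \Delta w_{it} &= w_{it} - w_{i,t-1} = X_{i,t-1} - X_{i,t-2} \\
                    \tilde{w}_{it} &= c_t \left( w_{it} - \frac{1}{T-t} \sum_{s=t+1}^{T} w_{is} \right),
                    \qquad c_t = \sqrt{\frac{T-t}{T-t+1}}
                  \end{align*}
                  ```

                  Only w_it and its future values enter the forward-orthogonal deviation, so X_{i,t-2} does not appear in it. Subtracting the average over the current and all future observations instead gives the same variable up to a scale factor.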

                  In your two-equation example, Xit is predetermined because it is a function of Yit-1. But Xit-1 is uncorrelated with the error term εit (and any future error term) in equation 1, which is all that is needed for it to be a valid instrument (in the FOD-transformed model).
                  https://www.kripfganz.de/stata/


                  • Dear Sebastian,

                    Thank you for your reply.

                    Originally posted by Sebastian Kripfganz

                    In your two-equation example, Xit is predetermined because it is a function of Yit-1. But Xit-1 is uncorrelated with the error term εit (and any future error term) in equation 1, which is all that is needed for it to be a valid instrument (in the FOD-transformed model).
                    I'd like to clarify my understanding of instrument validity under different transformations.

                    For my model: Yit = Yit-1 + Yit-2 + Xit-1 + Xit-2 + εit

                    Under FOD transformation:
                    • I can use L.Xit-1 as an instrument for Xit-1
                    • And similarly L.Xit-2 as an instrument for Xit-2
                    • While having Yit-1 and Yit-2 as regressors
                    Under first-difference transformation:
                    • I need to start from L2.Xit-1 as an instrument because Xit-1 appears in the differenced equation as (Xit-1 - Xit-2)
                    Is my understanding of these different requirements under FOD versus first-differences correct?

                    Thank you for your help in clarifying this.

                    Best regards,
                    Nursena


                    • In a nutshell: Yes.

                      You should still think about what your model assumptions imply about the error term, because this will ultimately decide the instrument validity. If Xit is correlated with εit (because of simultaneity), then Xi,t-1 is predetermined with respect to εit (i.e., Xi,t-1 is uncorrelated with εit but correlated with εi,t-1). In the first-differenced model, the error term is εit - εi,t-1. Thus, Xi,t-1 would be correlated with this first-differenced error term but Xi,t-2 is not (assuming that the errors are not serially correlated). With the FOD transformation, the transformed errors are only a function of current (period t) and future errors. Thus, Xi,t-1 remains a valid instrument.
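
                      The argument can be written out as follows. This is a sketch assuming a balanced panel, serially uncorrelated errors, and the usual forward-orthogonal deviations scaling c_t = \sqrt{(T-t)/(T-t+1)}; the tilde notation is introduced here for illustration.

                      ```latex
                      \begin{align*}
                        \Delta\varepsilon_{it} &= \varepsilon_{it} - \varepsilon_{i,t-1} \\
                        E[X_{i,t-1}\,\Delta\varepsilon_{it}] &= -E[X_{i,t-1}\,\varepsilon_{i,t-1}] \neq 0 \\
                        E[X_{i,t-2}\,\Delta\varepsilon_{it}] &= 0 \\
                        \tilde{\varepsilon}_{it} &= c_t \left( \varepsilon_{it} - \frac{1}{T-t} \sum_{s=t+1}^{T} \varepsilon_{is} \right) \\
                        E[X_{i,t-1}\,\tilde{\varepsilon}_{it}] &= 0
                      \end{align*}
                      ```

                      The last line holds because the transformed error contains only ε_it and later errors, none of which is correlated with X_{i,t-1} under the stated assumptions.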
                      https://www.kripfganz.de/stata/


                      • Dear Sebastian,

                        Thank you for your reply. I use the following model as main estimation.

                        Code:
                         
                         xtdpdgmm L(0/2).(depression) L(1/2).income, model(fodev) collapse gmm(depression, l(1 4))  gmm(L.income, l(0 4)) teffects two vce(r) nocons overid
                        And the produced instrument list is below as expected.

                        Code:
                         
                        Instruments corresponding to the linear moment conditions:
                         1, model(fodev):
                           L1.depression L2.depression L3.depression L4.depression
                         2, model(fodev):
                           L.income L1.L.income L2.L.income L3.L.income L4.L.income
                         3, model(level):
                           3bn.wave 4.wave 5.wave 6.wave 7.wave 8.wave 9.wave 10.wave 11.wave 12.wave
                           13.wave 14.wave 15.wave 16.wave 17.wave
                        However, when I add interaction terms with gender for all regressors, the instrument set looks wrong.

                        Code:
                         
                         xtdpdgmm L(0/2).(depression) L(1/2).income c.L1.depression#ib1.gender c.L2.depression#ib1.gender c.L1.income#ib1.gender c.L2.income#ib1.gender, model(fodev) collapse gmm(depression, l(1 4))  gmm(L.income, l(0 4)) gmm(c.L1.depression#ib1.gender, l(1 4)) gmm(c.L2.depression#ib1.gender, l(1 4))  gmm(c.L1.income#ib1.gender, l(0 4))  gmm(c.L2.income#ib1.gender, l(0 4)) teffects two vce(r) nocons overid
                        As you can see below, it drops L3.depression from row 1 and L.income from row 2 as instruments. It then shows seemingly arbitrary combinations of gender categories and lags.

                        Code:
                        Instruments corresponding to the linear moment conditions:
                         1, model(fodev):
                           L1.depression L2.depression L4.depression
                         2, model(fodev):
                           L1.L.income L2.L.income L3.L.income L4.L.income
                         3, model(fodev):
                           L2.(0.gender#cL.depression) L4.(0.gender#cL.depression)
                           L1.(1b.gender#cL.depression) L2.(1b.gender#cL.depression)
                           L4.(1b.gender#cL.depression)
                         4, model(fodev):
                           L2.(0.gender#cL2.depression) L4.(0.gender#cL2.depression)
                           L4.(1b.gender#cL2.depression)
                         5, model(fodev):
                           0.gender#cL.income L3.(0.gender#cL.income) 1b.gender#cL.income
                         6, model(fodev):
                           0.gender#cL2.income L1.(0.gender#cL2.income) L4.(0.gender#cL2.income)
                           L3.(1b.gender#cL2.income) L4.(1b.gender#cL2.income)
                         7, model(level):
                           3bn.wave 4.wave 5.wave 6.wave 7.wave 8.wave 9.wave 10.wave 11.wave 12.wave
                           13.wave 14.wave 15.wave 16.wave 17.wave
                        Do you have any recommendation for me to correct the estimation code?

                        Thank you in advance.

                        Best regards,
                        Nursena


                        • In the specification of the instruments, the base category for the gender dummy is ignored. The command creates interaction terms with both 0.gender and 1.gender. This leads to perfect collinearity with some of the non-interacted instruments, which is why some of them are dropped. You could explicitly specify interaction terms as c.L1.depression#0.gender etc to ensure that only this particular group is interacted.
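
                          A sketch of what this suggestion might look like when applied to the command posted above: the regressor list is left unchanged, and only the gmm() options for the interaction terms name the 0.gender group explicitly. It has not been run on the original data, and the /// line continuations assume the command is executed from a do-file.

                          ```stata
                          * Same regressors as in the posted command; only the instrument
                          * specifications for the interactions are restricted to 0.gender.
                          xtdpdgmm L(0/2).(depression) L(1/2).income ///
                              c.L1.depression#ib1.gender c.L2.depression#ib1.gender ///
                              c.L1.income#ib1.gender c.L2.income#ib1.gender, ///
                              model(fodev) collapse ///
                              gmm(depression, l(1 4)) gmm(L.income, l(0 4)) ///
                              gmm(c.L1.depression#0.gender, l(1 4)) ///
                              gmm(c.L2.depression#0.gender, l(1 4)) ///
                              gmm(c.L1.income#0.gender, l(0 4)) ///
                              gmm(c.L2.income#0.gender, l(0 4)) ///
                              teffects two vce(r) nocons overid
                          ```

                          The instrument list printed below the estimation output shows which instruments were kept and which were dropped because of collinearity.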
                          https://www.kripfganz.de/stata/


                          • Dear Sebastian,

                            Thank you for your explanation. Specifying them explicitly added the relevant instruments for the interactions with the first lags but not with the second lags. Is this because L1.(0.gender#cL.depression) = 0.gender#cL2.depression and L2.(0.gender#cL.depression) = L1.(0.gender#cL2.depression), so they are dropped in row 4 below? Can I still rely on the produced output with these dropped instruments?

                            Code:
                            Instruments corresponding to the linear moment conditions:
                             1, model(fodev):
                               L1.depression L2.depression L3.depression L4.depression
                             2, model(fodev):
                               L.income L1.L.income L2.L.income L3.L.income L4.L.income
                             3, model(fodev):
                               L1.(0.gender#cL.depression) L2.(0.gender#cL.depression)
                               L3.(0.gender#cL.depression) L4.(0.gender#cL.depression)
                             4, model(fodev):
                               L4.(0.gender#cL2.depression)
                             5, model(fodev):
                               0.gender#cL.income L1.(0.gender#cL.income) L3.(0.gender#cL.income)
                               L4.(0.gender#cL.income)
                             6, model(fodev):
                               L1.(0.gender#cL2.income) L4.(0.gender#cL2.income)
                             7, model(level):
                               3bn.wave 4.wave 5.wave 6.wave 7.wave 8.wave 9.wave 10.wave 11.wave 12.wave
                               13.wave 14.wave 15.wave 16.wave 17.wave
                            Best regards,
                            Nursena


                            • Yes, some of those lagged interaction terms will be equal to each other. As far as I can tell, this should be fine.
                              https://www.kripfganz.de/stata/


                              • Dear Sebastian,

                                I have a follow-up question.
                                Originally posted by Sebastian Kripfganz
                                I think there are at least two things that are potentially confusing here.


                                In your two-equation example, Xit is predetermined because it is a function of Yit-1. But Xit-1 is uncorrelated with the error term εit (and any future error term) in equation 1, which is all that is needed for it to be a valid instrument (in the FOD-transformed model).
                                You have stated that for the two-equations example below:
                                1. Yit = Yit-1 + Yit-2 + Xit-1 + Xit-2 + εit
                                2. Xit = Xit-1 + Xit-2 + Yit-1 + Yit-2 + εit
                                If I change the first equation to:

                                1. Yit = Yit-1 + Yit-2 + Xit + Xit-1 + εit

                                which includes contemporaneous X, can I still assume that Xit is predetermined? When I look at the incJ test, it does not reject the null hypothesis that the additional overidentifying restriction for predetermined X is valid (p-value =0.95).

                                Would your response change if I have second equation as:

                                2. Xit = Xit-1 + Xit-2 + Yit + Yit-1 + εit

                                Thank you in advance!

                                Best regards,
                                Nursena
