  • #16
    This is obviously very late in the game, but is there a difference between testing for near multicollinearity (using estat vif after -reg-) and testing for near multicollinearity after using xtreg? I can't seem to find a way to test for near multicollinearity once I've declared my data a panel (xtset) and run the regression (xtreg). Is there any difference between declaring the data a panel dataset and using reg, then estat vif, as opposed to using xtreg and then testing for near multicollinearity?
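
    (As background to this question: estat vif is available only after regress, so it cannot be run directly after xtreg. A minimal sketch of the usual workaround, with hypothetical variable names, is to fit a pooled regression on the same regressors and read the VIFs from there, since near collinearity is a property of the regressor matrix rather than of the estimator:)

        * sketch only; panelid, year, y, x1-x3 are hypothetical names
        xtset panelid year
        regress y x1 x2 x3    // same regressors as in the xtreg model
        estat vif             // VIFs from the pooled regression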



    • #17
      Courtney:
      this old (and quite long) Statalist thread may give some guidance on this tricky topic: http://www.stata.com/statalist/archi.../msg01063.html
      Kind regards,
      Carlo
      (StataNow 18.5)



      • #18
        Thank you, Carlo Lazzaro. This answered my question!



        • #19
          Hello everyone. I work with panel data, and I need some references on multicollinearity in panel data. I know panel data tend to suffer less from collinearity, but it can still be a problem there. Any help would be appreciated.



          • #20
            Shelan:
            welcome to the list.
            Please, start a new thread.
            Please, read the FAQ about how to post effectively. Thanks.
            Kind regards,
            Carlo
            (StataNow 18.5)



            • #21
              Dear Joao,

              I would appreciate it if you could provide a reference for your statement that normality is not important when it comes to hypothesis testing.

              Two more questions, please. I have a large-N (78 panel IDs with 1,063 observations), small-T (16 years) panel dataset. I ran three different multivariate regressions using the vce(cluster panelid) and nonest options to control for potential heteroskedasticity and autocorrelation, as proposed by Wooldridge (2002). My first question: do I need to test for normality? If yes, how?
              My second question: how can I spot the endogenous variables in my model? I have read about ivreg and ivreg2, but I am quite confused about the endog() option, because I do not know what criteria to use to decide that a particular regressor is endogenous. Please note that I am writing one of my PhD thesis chapters, which is why I need to work through all the assumption tests.

              Thanks a lot

              I am looking forward to hearing from you.



              Originally posted by Joao Santos Silva
              Dear Nick,

              Thanks for providing these links. I did not read carefully, but the first link looks remarkably misleading to say the least. For example, one of the OLS assumptions they list in 2.0 is:



              This is not correct. Asymptotically, normality is not needed for hypothesis tests to be valid. Moreover, unbiased and consistent estimation of the coefficients does not require that the errors be identically and independently distributed. I find it regrettable that this kind of advice is being given and widely distributed.

              I also have issues with the second link, and that is a Stata document! For example, in the remarks about the VIFs it is said that when the predictors are highly correlated:



              The standard errors are not inflated by collinearity; they are large to reflect the fact that it is difficult to disentangle the effects of different variables. Also, a test for the significance of a coefficient is informative only about that, not about the existence of a "statistical relationship" between the variables. The last part of the sentence gives the impression that this is a consequence of collinearity; it is not, it is a consequence of misinterpreting the result of a significance test.

              Once again, thanks for the links, they are very interesting, although for the wrong reasons.

              All the best,



              • #22
                Dear Mohammed Kasbar,

                About your first question, notice that what I said is that asymptotically normality is not needed for hypothesis tests to be valid. Any decent econometrics textbook should say this; see, for example, Wooldridge's excellent book. So, in the example you give, you do not need to test for normality because the sample is reasonably large.

                I am not sure if I understand your question about endogeneity, so I leave it for others to answer.

                Best wishes,

                Joao
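
                (For readers who nevertheless want an informal look at the residuals of a panel model, a minimal sketch with hypothetical variable names:)

                    * sketch only; hypothetical names
                    xtreg y x1 x2, fe
                    predict ehat, e    // idiosyncratic residuals e_it
                    qnorm ehat         // quantile-normal plot of the residuals
                    sktest ehat        // skewness/kurtosis test; very sensitive in large samples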



                • #23
                  Dear Joao Santos Silva

                  Thanks a lot for your reply. Highly appreciated.



                  • #24
                    Mohammed:
                    you may be interested in: http://www.statalist.org/forums/foru...est-panel-data
                    Kind regards,
                    Carlo
                    (StataNow 18.5)



                    • #25
                      Carlo Lazzaro Joao Santos Silva

                      Thanks a lot



                      • #26
                        Hey everyone, I hope posting here is not too late to receive valuable feedback; if needed, I will start a new thread.

                        I am currently estimating difference GMM with N and T both equal to 20.
                        My main variable of interest is significant across different specifications, with the coefficient being relatively similar. Now, as the VIF in my sample is quite high, I am a little concerned about what to do, and my questions are as follows:

                        Am I correct that multicollinearity does not affect the coefficient at all, but solely the standard error?
                        If so, is it correct that the standard error can only be pushed higher than would otherwise be appropriate, never lower?

                        These two questions are extremely important: if both statements are correct, I could reasonably assume that my variable of interest is in fact significant, correct? (See the simulation sketch below.)

                        Thanks so much for the great advice I always receive here.
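
                        (A minimal simulation sketch bearing on the two questions above; all data are simulated, not taken from the poster's application. With two nearly collinear regressors, the point estimates stay centered on the true values while the standard errors widen:)

                            * sketch only: simulated data
                            clear
                            set seed 12345
                            set obs 500
                            generate x1 = rnormal()
                            generate x2 = x1 + 0.1*rnormal()        // x2 nearly collinear with x1
                            generate y  = 1 + 0.5*x1 + 0.5*x2 + rnormal()
                            regress y x1 x2                         // coefficients near 0.5, but wide CIs
                            estat vif                               // very high VIFs
                            correlate x1 x2                         // correlation close to 1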



                        • #27
                          Hello dear Stata netizens.
                          I am a PhD student in economics. The tests I carried out revealed the presence of heteroskedasticity, and the logTFM variable suffers from endogeneity. I present my estimates below.
                          My concern is whether this result is good or whether there are still other tests to perform.
                          I have also produced dynamic panel estimates with GMM methods, but the number of instruments is too large and exceeds the number of groups. I then extended my estimation to 32 countries, but the results are not satisfactory: the number of instruments is now less than the number of groups, but no variables are significant other than the lagged dependent variable.
                          I first present the result with fixed effects. The choice was made through the Hausman test, but the chi-squared statistic was negative, so I added the "sigmaless" option and ran the Mundlak test. Both tests indicated that the fixed-effects model is more appropriate.
                          Thank you

                          xtivreg IDHIx logPIBH TIBCP IDEx VOIX CIFSPx (logTFM = logTFM), fe vce(robust) small

                          Fixed-effects (within) IV regression        Number of obs      =       104
                          Group variable: COUNTRY1                    Number of groups   =        13

                          R-sq:  within  = 0.6540                     Obs per group: min =         8
                                 between = 0.6337                                    avg =       8.0
                                 overall = 0.6200                                    max =         8

                                                                      F(19,85)           =     65.33
                          corr(u_i, Xb)  = -0.5259                    Prob > F           =    0.0000

                                                       (Std. Err. adjusted for 13 clusters in COUNTRY1)
                          ------------------------------------------------------------------------------
                                       |               Robust
                                 IDHIx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                logTFM |   1.477364   .5774156     2.56   0.012     .3293067     2.62542
                               logPIBH |   3.257721   2.672786     1.22   0.226    -2.056494    8.571935
                                 TIBCP |   .1090626   .0331055     3.29   0.001       .04324    .1748852
                                  IDEx |  -.0688856    .010758    -6.40   0.000    -.0902754   -.0474958
                                  VOIX |   2.137235   2.137934     1.00   0.320    -2.113549     6.38802
                                CIFSPx |   .3615744   .1220411     2.96   0.004      .118924    .6042248
                                 _cons |  -40.76229   11.37653    -3.58   0.001    -63.38188   -18.14269
                          -------------+----------------------------------------------------------------
                               sigma_u |  3.3837642
                               sigma_e |  1.8626068
                                   rho |  .76745988   (fraction of variance due to u_i)
                          ------------------------------------------------------------------------------
                          Instrumented:   logTFM
                          Instruments:    logPIBH TIBCP IDEx VOIX CIFSPx logTFM
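
                          (A sketch of the FE-versus-RE comparison described in this post, using the variable names from the posted command; the hausman option is assumed to be sigmaless, the closest match to "sgmalex", and the Mundlak test is implemented by adding panel means of the regressors to the RE model:)

                              * sketch only; variable names taken from the posted command
                              xtreg IDHIx logPIBH TIBCP IDEx VOIX CIFSPx logTFM, fe
                              estimates store fe
                              xtreg IDHIx logPIBH TIBCP IDEx VOIX CIFSPx logTFM, re
                              estimates store re
                              hausman fe re, sigmaless   // sigmaless/sigmamore can avoid a negative chi-squared
                              * Mundlak test: add group means of the regressors to RE, test them jointly
                              foreach v of varlist logPIBH TIBCP IDEx VOIX CIFSPx logTFM {
                                  egen double m_`v' = mean(`v'), by(COUNTRY1)
                              }
                              xtreg IDHIx logPIBH TIBCP IDEx VOIX CIFSPx logTFM m_*, re vce(robust)
                              testparm m_*               // rejection favors the fixed-effects model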



                          • #28
                            Koffi Yves:
                            welcome to this forum.
                            Please, start a new thread.
                            Please, read the FAQ about how to post effectively. Thanks.
                            Kind regards,
                            Carlo
                            (StataNow 18.5)



                            • #29
                              Dear Stata Members
                              I am reopening this relatively old (but very informative) thread to clarify some doubts about multicollinearity. In the article linked below, it is written that "In principle, collinearity does not bias the OLS, although it inflates variation within the model and increases the danger of type II errors (false negatives) with regard to the variable we are primarily interested in (Wooldridge, 2003, 96)."
                              I have read Jeff Wooldridge, but I am not sure whether the above statement is correct. So, in principle, does multicollinearity increase the likelihood of a type II error (which would mean less chance of a type I error)?
                              My apologies if this thread cannot be used for a new question!



                              https://link.springer.com/article/10...e.jibs.8400225
                              Last edited by lal mohan kumar; 06 Mar 2023, 05:00.



                              • #30
                                Yes, the statement in Wooldridge is correct. However, your subsequent interpretation of it is not: multicollinearity does increase the likelihood of a type II error, but it does not reduce the chance of a type I error.
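
                                (A minimal simulation sketch of this point; all data are simulated. With a truly nonzero coefficient on x1 and a truly zero coefficient on x2, raising the correlation between x1 and x2 lowers the rejection rate for x1 (more type II errors), while the rejection rate for x2 stays near the nominal 5% (type I error unchanged):)

                                    * sketch only: simulated data
                                    clear all
                                    set seed 42
                                    program define onesim, rclass
                                        args rho
                                        drop _all
                                        set obs 100
                                        generate x1 = rnormal()
                                        generate x2 = `rho'*x1 + sqrt(1 - `rho'^2)*rnormal()
                                        generate y  = 0.3*x1 + 0*x2 + rnormal()   // x1 matters, x2 does not
                                        regress y x1 x2
                                        return scalar p1 = 2*ttail(e(df_r), abs(_b[x1]/_se[x1]))
                                        return scalar p2 = 2*ttail(e(df_r), abs(_b[x2]/_se[x2]))
                                    end
                                    simulate p1=r(p1) p2=r(p2), reps(2000) nodots: onesim 0.95
                                    generate rej1 = (p1 < .05)   // power for x1: falls as rho rises
                                    generate rej2 = (p2 < .05)   // size for x2: stays near .05
                                    summarize rej1 rej2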

