
  • #16
This is obviously very late in the game, but is there a difference between testing for near multicollinearity (using estat vif after -reg-) and testing for near multicollinearity after using xtreg? I can't seem to find a way to test for near multicollinearity once I've declared my data a panel (xtset) and run the regression (xtreg). Is there any difference between declaring the data a panel dataset and using reg, then estat vif, as opposed to xtreg and then testing for near multicollinearity?
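For what it's worth, the closest I have come is the workaround below (variable and panel names are placeholders; I am not sure it is the right approach): since collinearity is a property of the regressors rather than of the estimator, I run the same specification as pooled OLS purely to obtain the VIFs.

    xtset panelid year
    xtreg y x1 x2 x3, fe      // the panel estimate I actually want
    reg y x1 x2 x3            // same regressors, pooled OLS
    estat vif                 // estat vif is only available after -regress-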



    • #17
      Courtney:
      this old (and quite long) Stata thread may give some guidance on this tricky topic: http://www.stata.com/statalist/archi.../msg01063.html
      Kind regards,
      Carlo
      (StataNow 18.5)



      • #18
        Thank you, Carlo Lazzaro. This answered my question!



        • #19
Hello everyone. I work with panel data, and I need some references on multicollinearity in panel data. I know panel data tend to suffer less from collinearity, but it can still be a problem there. I need help.



          • #20
Shelan:
            welcome to the list.
            Please, start a new thread.
            Please, read the FAQ about how to post effectively. Thanks.
            Kind regards,
            Carlo
            (StataNow 18.5)



            • #21
              Dear Joao,

I would appreciate it if you could provide a reference for your statement that normality is not important when it comes to hypothesis testing.

Two more questions, please. I have a large-N (78 panel ids with 1,063 observations), small-T (16 years) panel dataset. I ran three different multivariate regressions using the vce(cluster panelid) and nonest options to control for potential heteroskedasticity and autocorrelation, as proposed by Wooldridge (2002). My first question is: do I need to test for normality? If yes, how?
My second question: how can I spot the endogenous variables in my model? I have read about ivreg and ivreg2, but I am quite confused about the endog() option, because I do not know what criteria to use to decide that a particular regressor is endogenous. Please note that I am writing my PhD thesis, and that is why I need to cover all the assumption tests.
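For concreteness, my understanding of the syntax is something like the following sketch (y, x1, x2, z1, and z2 are placeholder names; I may well have this wrong):

    * ivreg2 is user-written (Baum, Schaffer, and Stillman; ssc install ivreg2).
    * endog(x2) reports a C (difference-in-Sargan) statistic testing the null
    * that x2 can be treated as exogenous, so it tests a suspicion I already
    * have; it does not by itself tell me which regressors to suspect.
    ivreg2 y x1 (x2 = z1 z2), endog(x2)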

              Thanks a lot

              I am looking forward to hearing from you.



Originally posted by Joao Santos Silva
              Dear Nick,

              Thanks for providing these links. I did not read carefully, but the first link looks remarkably misleading to say the least. For example, one of the OLS assumptions they list in 2.0 is:



              This is not correct. Asymptotically, normality is not needed for hypothesis tests to be valid. Moreover, unbiased and consistent estimation of the coefficients does not require that the errors be identically and independently distributed. I find it regrettable that this kind of advice is being given and widely distributed.

              I also have issues with the second link, and that is a Stata document! For example, in the remarks about the VIFs it is said that when the predictors are highly correlated:



              The standard errors are not inflated by collinearity, they are large to reflect the fact that it is difficult to disentangle the effects of different variables. Also, a test for the significance of a coefficient is only informative about that, not about the existence of a "statistical relationship" between the variables. The last part of the sentence gives the impression that this is a consequence of collinearity; it is not, it is a consequence of misinterpreting the result of a significance test.

              Once again, thanks for the links, they are very interesting, although for the wrong reasons.

              All the best,



              • #22
                Dear Mohammed Kasbar,

About your first question, notice that what I said is that asymptotically normality is not needed for hypothesis tests to be valid. Any decent econometrics textbook should say this; see, for example, Wooldridge's excellent book. So, in the example you give, you do not need to test for normality because the sample is reasonably large.

                I am not sure if I understand your question about endogeneity, so I leave it for others to answer.

                Best wishes,

                Joao



                • #23
                  Dear Joao Santos Silva

Thanks a lot for your reply; highly appreciated.



                  • #24
                    Mohammed:
                    you may be interested in: http://www.statalist.org/forums/foru...est-panel-data
                    Kind regards,
                    Carlo
                    (StataNow 18.5)



                    • #25
                      Carlo Lazzaro Joao Santos Silva

Thanks a lot



                      • #26
Hey everyone, I hope posting here is not too late to receive valuable feedback; if needed, I will start a new thread.

I am currently estimating difference GMM with N and T both equal to 20.
My main variable of interest is significant across different specifications, with the coefficient being relatively similar. Now, as the VIF in my sample is quite high, I am a little concerned about what to do, and my questions are as follows:

Am I correct that multicollinearity does not affect the coefficients at all, but solely the standard errors?
If so, is it correct that the standard errors can only be pushed higher than would be appropriate, never lower?

These two questions are extremely important: given that both statements are correct, I could reasonably assume that my variable of interest is in fact significant, correct?
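To check my own intuition, I ran a quick simulation along these lines (purely illustrative; all names are made up):

    clear
    set seed 12345
    set obs 500
    gen x1 = rnormal()
    gen x2 = x1 + 0.1*rnormal()    // x2 nearly collinear with x1 (VIF ~ 100)
    gen y  = 1 + 2*x1 + 3*x2 + rnormal()
    reg y x1 x2
    estat vif                      // very large VIFs
    * Across repeated draws the estimates still center on the true values
    * 2 and 3; only the standard errors are inflated relative to the
    * orthogonal case.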

Thanks so much for the great advice I always receive here.



                        • #27
Hello, dear Stata netizens.
I am a PhD student in economics. The tests I carried out revealed the presence of heteroskedasticity, and the logTFM variable suffers from endogeneity. I present my estimate below.
My concern is whether this result is good or whether there are still other tests to perform.
I have also made dynamic panel estimates with GMM methods, but the number of instruments is too large and exceeds the number of groups. I then extended the estimation to 32 countries, but the results are not satisfactory: the number of instruments is now smaller than the number of groups, but no variables are significant other than the lagged dependent variable.
I first present the result with fixed effects: the choice was made through the Hausman test, but the chi-squared statistic was negative, so I added the "sigmaless" option and also ran the Mundlak test. Both tests indicated that the fixed-effects model is more appropriate.
                          Thank you

xtivreg IDHIx logPIBH TIBCP IDEx VOIX CIFSPx (logTFM = logTFM), fe vce(robust) small

Fixed-effects (within) IV regression            Number of obs     =       104
Group variable: COUNTRY1                        Number of groups  =        13

R-sq:                                           Obs per group:
     within  = 0.6540                                         min =         8
     between = 0.6337                                         avg =       8.0
     overall = 0.6200                                         max =         8

                                                F(19, 85)         =     65.33
corr(u_i, Xb)  = -0.5259                        Prob > F          =    0.0000

                             (Std. Err. adjusted for 13 clusters in COUNTRY1)
------------------------------------------------------------------------------
             |               Robust
       IDHIx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      logTFM |   1.477364   .5774156     2.56   0.012     .3293067     2.62542
     logPIBH |   3.257721   2.672786     1.22   0.226    -2.056494    8.571935
       TIBCP |   .1090626   .0331055     3.29   0.001       .04324    .1748852
        IDEx |  -.0688856    .010758    -6.40   0.000    -.0902754   -.0474958
        VOIX |   2.137235   2.137934     1.00   0.320    -2.113549     6.38802
      CIFSPx |   .3615744   .1220411     2.96   0.004      .118924    .6042248
       _cons |  -40.76229   11.37653    -3.58   0.001    -63.38188   -18.14269
-------------+----------------------------------------------------------------
     sigma_u |  3.3837642
     sigma_e |  1.8626068
         rho |  .76745988   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Instrumented:   logTFM
Instruments:    logPIBH TIBCP IDEx VOIX CIFSPx logTFM


                          • #28
                            Koffi Yves:
                            welcome to this forum.
                            Please, start a new thread.
                            Please, read the FAQ about how to post effectively. Thanks.
                            Kind regards,
                            Carlo
                            (StataNow 18.5)



                            • #29
                              Dear Stata Members
I am reopening this relatively old thread (but a very informative one) to clarify some doubts about multicollinearity. In the article linked below, it is written that "In principle, collinearity does not bias the OLS, although it inflates variation within the model and increases the danger of type II errors (false negative) with regard to the variable we are primarily interested in (Wooldridge, 2003, 96)."
I have read Jeff Wooldridge, but I am not sure whether the above statement is correct. So, in principle, does multicollinearity increase the likelihood of a type II error (which would mean less chance of a type I error)?
My apologies if this thread cannot be used for a new question!



                              https://link.springer.com/article/10...e.jibs.8400225
                              Last edited by lal mohan kumar; 06 Mar 2023, 04:00.



                              • #30
Yes, the statement citing Wooldridge is correct. However, your subsequent inference from it is not: multicollinearity does increase the likelihood of a type II error, but it does not reduce the chance of a type I error.
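If it helps, a quick Monte Carlo along the following lines (all names hypothetical) illustrates the point: under near collinearity the t-test keeps roughly its 5% size for a truly zero coefficient, while its power against a genuinely nonzero coefficient collapses.

    capture program drop mcsim
    program define mcsim, rclass
        drop _all
        set obs 200
        gen x1 = rnormal()
        gen x2 = x1 + 0.05*rnormal()     // nearly collinear with x1
        gen y  = 1 + 0.5*x1 + rnormal()  // true coefficient on x2 is zero
        regress y x1 x2
        return scalar p1 = 2*ttail(e(df_r), abs(_b[x1]/_se[x1]))
        return scalar p2 = 2*ttail(e(df_r), abs(_b[x2]/_se[x2]))
    end
    set seed 20230306
    simulate p1=r(p1) p2=r(p2), reps(1000) nodots: mcsim
    count if p1 < .05   // rejections of x1's true effect: far below 1,000 (type II errors abound)
    count if p2 < .05   // rejections of x2's true zero: near 50 of 1,000 (size is not reduced)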

