How to interpret different VIF values and coefficients before/after introducing interaction terms?
So as you can see, the VIF changed a lot with/without interaction.
Curiously, the explanatory variables "risk" and "scopeinprofit" are both not significant without interaction,
but become significant after adding their interaction terms with Exchange and Noncross.
Why? And is this significance reliable?
To be clear, I'm using panel data and a random-effects linear regression. Thanks a lot!
Adrian:
welcome to the list.
1) predictors with such high VIF should be ruled out from the analysis (the statistical significance you detected with interacted terms is driven by the high VIF; hence, it is untrustworthy);
2) if you ran -xtreg, re-, then -estat vif- should have returned an error message, as it is not allowed after -xt- regression. How did you calculate the VIF values?
I will spare you my long rant about why VIF is a waste of time and pixels and shouldn't be done in the first place, let alone worried about. I will point out that interaction terms are, necessarily, highly correlated with their constituent terms, so high values of VIF are to be expected in interaction models.
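To see why interaction terms push VIF up by construction, here is a minimal Python sketch on made-up data (not the poster's). The variables x and z are hypothetical stand-ins for a predictor and its moderator, laid out on a balanced grid so that they are uncorrelated with each other. The helper pearson is written here for illustration. With only the pairwise correlation r between x and x*z, the quantity 1/(1 - r^2) is a lower bound on the VIF of x in a model containing x, z, and x*z.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length lists (illustrative helper)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Hypothetical balanced grid: x runs 0..10, z runs 10..12, every combination once.
xs = [float(i) for i in range(11)]
zs = [10.0, 10.5, 11.0, 11.5, 12.0]
x = [a for a in xs for _ in zs]
z = [b for _ in xs for b in zs]

# The interaction term is just the product of its constituents.
xz = [a * b for a, b in zip(x, z)]

r = pearson(x, xz)
# Lower bound on the VIF of x: the full VIF uses R^2 from regressing x on
# both z and x*z, which can only be at least as large as r^2.
vif_lower_bound = 1.0 / (1.0 - r ** 2)

print("corr(x, z)   =", round(pearson(x, z), 4))
print("corr(x, x*z) =", round(r, 4))
print("VIF lower bound for x =", round(vif_lower_bound, 1))
```

Even though x and z are uncorrelated here, x and x*z are almost perfectly correlated simply because z is far from zero relative to its spread, so a large VIF in this setting says nothing about a flaw in the data.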
When you add interaction terms to a model, the meanings of the constituent terms of the interaction change. Thus the coefficient of risk in the non-interaction model is an estimate of the (average) effect of risk on the outcome variable. But when you interact it with, say, Exchange, it no longer means that and there is no reason why the value or the "significance" of the value should be the same or even similar to the non-interaction result. In the interaction model, there is no such thing as the effect of risk. Rather, by using an interaction model, you are stipulating that risk's effect on the outcome depends on the value of Exchange. In particular, what is shown as the coefficient of risk in the interaction model is the effect of risk on the outcome when Exchange = 0.
Now, it is often the case that Exchange = 0 never actually occurs in the data, or isn't even possible in principle. In that case the coefficient of risk has no meaning whatsoever. That is part of what you take on when using an interaction model. But even when Exchange = 0 is a real situation, you must remember that the coefficient of risk reflects the effect of risk only for that limited situation. It is not comparable to the average effect of risk shown as the coefficient in the non-interaction model.
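The point can be written out for a simplified model with a single interaction. The symbols below are assumptions for illustration only: y is the outcome, the betas are the coefficients, and the other covariates and the random effect are omitted.

```latex
\begin{align*}
y &= \beta_0 + \beta_1\,\mathit{risk} + \beta_2\,\mathit{Exchange}
     + \beta_3\,(\mathit{risk} \times \mathit{Exchange}) + \varepsilon \\
\frac{\partial\, E[y]}{\partial\, \mathit{risk}}
  &= \beta_1 + \beta_3\,\mathit{Exchange} \\
\left.\frac{\partial\, E[y]}{\partial\, \mathit{risk}}\right|_{\mathit{Exchange}=0}
  &= \beta_1
\end{align*}
```

So the reported coefficient of risk is the slope only at Exchange = 0; at any other value of Exchange the slope is shifted by beta_3 times that value.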
One of the solutions to address the collinearity caused by interaction terms is to use centered variables. For more details see Aiken, L. S., West, S. G., & Reno, R. R. (1991). Multiple regression: Testing and interpreting interactions. Sage.
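As a rough illustration of what centering does, the Python sketch below uses made-up data (x and z are hypothetical, on a balanced grid) and compares the correlation between a predictor and its interaction term before and after subtracting the means. The helper pearson is written here for illustration and is not from any package.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length lists (illustrative helper)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def center(u):
    """Subtract the sample mean from every element."""
    m = sum(u) / len(u)
    return [a - m for a in u]

# Hypothetical balanced grid: x runs 0..10, z runs 10..12, every combination once.
xs = [float(i) for i in range(11)]
zs = [10.0, 10.5, 11.0, 11.5, 12.0]
x = [a for a in xs for _ in zs]
z = [b for _ in xs for b in zs]

# Raw interaction: product of the uncentered variables.
r_raw = pearson(x, [a * b for a, b in zip(x, z)])

# Centered interaction: center first, then multiply.
xc, zc = center(x), center(z)
r_centered = pearson(xc, [a * b for a, b in zip(xc, zc)])

print("corr(x, x*z) raw      =", round(r_raw, 4))
print("corr(x, x*z) centered =", round(r_centered, 4))
```

On this symmetric grid the raw correlation is above 0.99 and the centered one is zero up to floating-point error; with real, asymmetric data the centered correlation will usually be small rather than exactly zero. Note that centering changes what the constituent coefficients mean (they become effects at the mean of the other variable) but does not change the interaction coefficient or the model's fitted values.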
You might also look at Belsley, Kuh, and Welsch, Collinearity Diagnostics, and Belsley & Kuh, Model Reliability. Arthur Goldberger's text has a somewhat entertaining coverage of collinearity.