  • Variance Inflation Factor (VIF) for Fixed Effects Regression

    Hello,

    I am conducting research that requires time and entity fixed effects.


    I want to check my data for multicollinearity and have decided to use the VIF test. I am aware that scores over 10 are commonly taken to indicate a multicollinearity problem.


    When dealing with a fixed effects regression, must you include the time and entity dummies in the VIF test? Without the dummies, all of my independent variables have desirably low scores (under 5). However, when I introduce the dummies and run the test, most of the scores are extremely high, some around 6,000.

    Is it incorrect to run the VIF test solely on the independent variables, or must you include the time and entity dummies?


    Thanks.

  • #2
    Whenever you have a series of variables that are indicators ("dummies") for the levels of a categorical variable, the VIFs will often be high. That's to be expected, and it isn't a problem. There is no point in including such variables in a VIF calculation.
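
    To illustrate one way VIFs of that size can arise, here is a minimal Python sketch on simulated data. Everything in it is an assumption made for illustration (the panel dimensions, the variables `x1` and `x2`, and the helper `vif`); it is not the data from #1. A regressor that varies mostly between entities is almost fully explained by the entity dummies, so its VIF jumps once they are included.

```python
import numpy as np

def vif(X, j):
    """VIF of column j: 1 / (1 - R^2), with R^2 from regressing
    column j on a constant and all other columns of X."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(0)
n_entities, n_periods = 50, 10            # hypothetical balanced panel
entity = np.repeat(np.arange(n_entities), n_periods)

# x1 differs a lot across entities but barely moves over time;
# x2 is pure noise, unrelated to x1.
x1 = rng.normal(size=n_entities)[entity] + 0.05 * rng.normal(size=entity.size)
x2 = rng.normal(size=entity.size)

# Entity dummies, dropping entity 0 as the reference category.
dummies = (entity[:, None] == np.arange(1, n_entities)).astype(float)

X_plain = np.column_stack([x1, x2])
X_fe = np.column_stack([x1, x2, dummies])

vif_plain = vif(X_plain, 0)   # x1 against x2 only
vif_fe = vif(X_fe, 0)         # x1 against x2 and the entity dummies

print(f"VIF of x1 without dummies: {vif_plain:.2f}")
print(f"VIF of x1 with entity dummies: {vif_fe:.1f}")
```

    The second number is large only because the fixed effects absorb nearly all of the variation in `x1`, which is exactly what they are supposed to do.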

    More generally, there is no point in using VIF at all. You have already, in writing your post in #1, wasted more of your time worrying about "multicollinearity" than it is worth. It is a bogus issue, a statistical zombie that refuses to die. The best exposition of this is in Arthur Goldberger's textbook, A Course in Econometrics, where he devotes an entire chapter to this and points out that "multicollinearity" is really just a misnomer for small sample size (or "micronumerosity," as he calls it). If you can't get your hands on his text, or don't have time to read the chapter, you can see a much condensed version written by Bryan Caplan at https://www.econlib.org/archives/200...ollineari.html.

    Here's the bottom line. The only effect that "multicollinearity" has on regression results is to inflate standard errors and, consequently, widen confidence intervals. If your key variables get coefficients with sufficiently narrow confidence intervals that your research question has been answered, then multicollinearity is not an issue. If "control variables" are affected, then that is not important because, by definition, a "control variable" is only included to deal with its nuisance effects; its coefficients are not of any importance. The only time multicollinearity can be a problem is if it affects a key explanatory variable, one that is the focus of your research. In that case, you have a problem. But, in that case, there is also no solution to the problem. You just have to acknowledge that your data set provides inconclusive results and move on. The only way to get a conclusive study is to start over with a different, larger (usually much larger) data set, or a new study design that overcomes the near-collinear relationships that afflict your key variable.
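
    For reference, this can be read off the standard textbook expression for the sampling variance of an OLS coefficient under homoskedastic errors. The notation below is an assumption for illustration and does not come from this thread: R_j^2 is the R-squared from regressing x_j on the other regressors, and SST_j is the total sample variation in x_j.

```latex
\operatorname{Var}(\hat\beta_j \mid X)
  = \frac{\sigma^2}{\mathrm{SST}_j \,(1 - R_j^2)}
  = \frac{\sigma^2}{\mathrm{SST}_j} \cdot \mathrm{VIF}_j,
\qquad
\mathrm{SST}_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2,
\quad
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

    The coefficient's expected value does not involve VIF_j at all; only its variance does. And because SST_j grows with the sample size n, a large VIF_j and a small n enter the variance in the same way, which is the sense in which the two problems are the same problem.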

    So just run your regression and see whether you get narrow enough confidence intervals to draw conclusions about your focal study variable(s). If so, you are done. If not, you are also done, but have only an inconclusive study. Perhaps at that point, using VIF might be of some help in identifying which variables are nearly collinear with your focal variable(s), as that might enable you to design a new study that avoids the problem.
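
    As a rough sketch of that workflow in Python, again on simulated data: the variables `x` (focal), `z` (a control deliberately built to be highly correlated with `x`), the true coefficients, and the helper `demean_by` are all hypothetical. For brevity it uses entity fixed effects only (via the within transformation), classical standard errors, and 1.96 as a large-sample critical value; a real analysis would likely add time effects and cluster-robust standard errors.

```python
import numpy as np

rng = np.random.default_rng(1)
n_entities, n_periods = 100, 8            # hypothetical balanced panel
entity = np.repeat(np.arange(n_entities), n_periods)
n = entity.size

# Simulated data: z is strongly correlated with the focal variable x.
alpha = rng.normal(size=n_entities)[entity]      # entity effects
x = rng.normal(size=n) + 0.5 * alpha
z = 0.9 * x + 0.3 * rng.normal(size=n)
y = 1.0 * x + 0.5 * z + alpha + rng.normal(size=n)

def demean_by(group, v):
    """Subtract each group's mean from v (the within transformation)."""
    means = np.bincount(group, weights=v) / np.bincount(group)
    return v - means[group]

X = np.column_stack([demean_by(entity, x), demean_by(entity, z)])
yd = demean_by(entity, y)

# OLS on the demeaned data gives the fixed-effects slope estimates.
beta, *_ = np.linalg.lstsq(X, yd, rcond=None)
resid = yd - X @ beta

# Degrees of freedom account for the absorbed entity means.
dof = n - n_entities - X.shape[1]
sigma2 = (resid @ resid) / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))

ci_low = beta - 1.96 * se
ci_high = beta + 1.96 * se

print(f"focal coefficient: {beta[0]:.3f}")
print(f"approx. 95% CI: [{ci_low[0]:.3f}, {ci_high[0]:.3f}]")
```

    The decision is then made on the interval for the focal coefficient alone: if it is narrow enough to answer the research question, the correlation between `x` and `z` does not matter.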

    • #3
      Thank you so much for the quick and really helpful response!
