
  • Mean centering IV's to create interactions in Panel fixed effects analysis

    Hi,

    I am sorry if the question looks simple. I know that in cross-sectional data it is recommended to center continuous IVs that will be used to create interaction terms (both to reduce multicollinearity and to make interpretation of the intercept more meaningful), but I was wondering whether mean-centering is recommended in fixed effects models. My confusion probably comes from not knowing whether, in longitudinal data, mean-centering is done using the within-subject mean or the grand mean of the IV. If mean-centering is done using the within-subject mean, then the fixed effects estimator is also calculated by subtracting the within-subject mean from each individual observation, so the same mean subtraction takes place twice. Does that mean that in fixed effects models there is no need to mean-center in order to be able to use interactions?

    Thanks,
    Hovhannes

  • #2
    First of all, centering of variables is optional in interaction models, not required. It is often helpful in simplifying the interpretation of the results, but if you use factor-variable notation and the -margins- command to interpret the results, there is no particular advantage to centering. (Well, occasionally there are numerical issues with uncentered variables that are resolved by centering, but this doesn't really come up very often.)

    As for whether to group mean-center or grand-mean center, it is, again, a matter of choice. And, again, since the purpose of centering is just to make interpretation simpler, you should choose the type of centering that will make it simplest to interpret your results in light of your specific research hypotheses and goals.

    As for the de-meaning that takes place when -xtreg, fe- is run, it takes place behind the scenes. You never see those variables, and the results you are shown are all in terms of the variables as you entered them into the model, so this doesn't affect your decisions about centering.

    So, to summarize, in nearly all situations, centering is optional (and of little value if you use factor variable notation and -margins-) and, assuming you choose to center, the choice of centering point should be whatever most facilitates interpreting the results in light of your specific research questions.

    (There is one exception to the generalizations above: when estimating the correlation between random intercepts and random slopes in multi-level models, centering, though still optional, can greatly facilitate an interpretation of the results that the -margins- command cannot provide.)



    • #3
      Hi Clyde,

      Thank you so very much for the reply. It seems to me that centering is recommended not only to facilitate interpretation (which, as you mentioned, can be done just as well with the -margins- command) but also to address the structural multicollinearity that arises in polynomial regression models and in models that include interaction terms (https://onlinecourses.science.psu.edu/stat501/node/349/ ). It is mostly the issue of multicollinearity that I have in mind when considering centering. Do you think centering is a solution to multicollinearity and thus a necessary step in the above-mentioned models?

      The confusion with xtreg, fe comes from the following: if I don't center a particular IV, then the behind-the-scenes process subtracts the group mean from each observation in the group (let's say I am considering a panel regression of countries over 20 years), i.e. (Xij - X.j). If I center first, then my observations are not Xij but (Xij - X.j), which would then be further subject to (Xij - X.j) - mean of (Xij - X.j) for each group when xtreg, fe does its behind-the-scenes de-meaning, and this is not the same math as without centering. I feel like I might be really off with my logic, but I would really appreciate it if you could help me understand this.

      Thank you again very much!



      • #4
        Do you think centering is a solution to multicollinearity and thus a necessary step in the above-mentioned models?
        Well, I agree that centering is a solution to multicollinearity, but I do not agree that multicollinearity is a problem that requires a solution. Occasionally it does, but in most cases it does not.

        First of all, the degree of multicollinearity between main effects and interaction terms, even without centering, is usually not very large. Second, even when it is large, in my work, where I tend to focus more on model predictions and don't care so much about significance tests of individual model terms, the predictions of the model come out the same whether you have multicollinearity or not. It's the same model, just differently parameterized. So except in the most extreme cases, where we get to the point of numerical instability, if you look at model-predicted values and marginal effects at appropriately specified values of the variables, you get the same answers either way. Before -margins- it took a lot of work to calculate marginal effects at appropriately specified values, but with -margins- it's no effort to speak of.
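
        The "same model, just differently parameterized" point can be checked numerically. The following is a minimal sketch in Python with NumPy rather than Stata, on simulated data; the variable names, sample size, and coefficients are all made up for illustration. It fits an interaction model by ordinary least squares with raw and with grand-mean-centered predictors and compares the fitted values and the marginal effect of x at a chosen value of z.

```python
import numpy as np

# Simulated data (assumption: purely illustrative, not from the thread).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(50.0, 5.0, n)   # predictors with means far from zero
z = rng.normal(20.0, 3.0, n)
y = 1.0 + 0.5 * x - 0.3 * z + 0.02 * x * z + rng.normal(0.0, 1.0, n)

def fit(a, b, y):
    """OLS of y on [1, a, b, a*b]; returns coefficients and fitted values."""
    X = np.column_stack([np.ones_like(a), a, b, a * b])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta

# Grand-mean centering of both predictors.
xc, zc = x - x.mean(), z - z.mean()

beta_raw, fitted_raw = fit(x, z, y)
beta_cen, fitted_cen = fit(xc, zc, y)

# Correlation between a main effect and the interaction term.
corr_raw = np.corrcoef(x, x * z)[0, 1]
corr_cen = np.corrcoef(xc, xc * zc)[0, 1]

# Marginal effect of x evaluated at the same point, z = z0, in both fits.
z0 = 25.0
me_raw = beta_raw[1] + beta_raw[3] * z0
me_cen = beta_cen[1] + beta_cen[3] * (z0 - z.mean())

print("corr(x, x*z) raw vs centered:", corr_raw, corr_cen)
print("max |difference in fitted values|:", np.abs(fitted_raw - fitted_cen).max())
print("marginal effect of x at z0, raw vs centered:", me_raw, me_cen)
```

        In this sketch centering lowers the correlation between x and the product term, while the fitted values and the marginal effect at z0 agree up to floating-point error. The individual main-effect coefficients do differ between the two fits, because they refer to different reference points.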

        The same answer applies to your second question. What Stata does behind the scenes does not make any difference here. Think of it this way: the demeaning approach is just one way to get a fixed-effects estimator. It's the one Stata uses, probably because it is computationally the simplest. But there are other ways to do it, and they produce the same results. And they don't de-mean the variables, so this whole issue you raise here would not arise with them. And since they produce the same results, it follows that the double de-meaning is just not a problem.
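
        Both claims in the paragraph above can be illustrated on a simulated panel. This is a minimal sketch in Python with NumPy, not Stata's actual implementation; the panel dimensions, the helper `demean`, and the data-generating values are assumptions made for illustration. It compares the within (de-meaning) estimator with the dummy-variable (LSDV) estimator, and shows that de-meaning an already group-centered variable changes nothing, since its group means are already zero.

```python
import numpy as np

# Simulated balanced panel (assumption: illustrative only), 30 units x 20 periods.
rng = np.random.default_rng(1)
n_groups, n_periods = 30, 20
group = np.repeat(np.arange(n_groups), n_periods)
alpha = rng.normal(0.0, 2.0, n_groups)                 # unit fixed effects
x = alpha[group] + rng.normal(0.0, 1.0, group.size)    # x correlated with the effects
y = alpha[group] + 1.5 * x + rng.normal(0.0, 1.0, group.size)

def demean(v, g):
    """Subtract each group's mean from its observations (within transformation)."""
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]

# 1) Within estimator: regress de-meaned y on de-meaned x.
x_w, y_w = demean(x, group), demean(y, group)
b_within = (x_w @ y_w) / (x_w @ x_w)

# 2) LSDV estimator: regress y on x plus one dummy per unit, no de-meaning.
D = (group[:, None] == np.arange(n_groups)).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
b_lsdv = coef[0]

# 3) User group-centers x first; the estimator then de-means it again.
x_ww = demean(x_w, group)
b_precentered = (x_ww @ y_w) / (x_ww @ x_ww)

print("within:", b_within, "LSDV:", b_lsdv, "pre-centered:", b_precentered)
print("max |x_w - x_ww|:", np.abs(x_w - x_ww).max())
```

        The three slope estimates agree up to floating-point error. Note that the sketch covers only a single regressor; it does not address interaction terms built from group-centered variables.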

        So my opinion is still that centering is optional and should be used when, and only when, it is convenient. With the -margins- command, most of the work that uncentered variables entailed is done for you, so the centering is superfluous. Yes, there are some cases of extreme multicollinearity, but they seldom arise in real-world data. When they do, they are easy to spot in the results, as the standard errors will be ridiculously large or you will get numerical instability in estimation.

        Added: By the way, I highly recommend you read the chapter in Goldberger's econometrics textbook in which he thoroughly demonstrates why multicollinearity is a non-issue. It's clearly written and very entertaining.
        Last edited by Clyde Schechter; 03 Aug 2018, 21:58.



        • #5
          Thank you very much Clyde, those are very comprehensive answers, I really appreciate your time!
