  • Difference in Fixed Effect and Random Effects

    Hello! I am trying to understand the different panel data models and I am getting confused by the terms used: random effects models, random effects estimators, fixed effects models, and fixed effects estimators. Are these four all different?

    Is it the case that you can have a fixed effect estimator in a random effects model and if you do that your estimate is not consistent?

    For context, I am running a panel data analysis in STATA and after conducting a Hausman Test I obtain Prob > chi2 = 0.4504, indicating I should use a random effects model. However, I strongly doubt that my individual effects are uncorrelated with my x variables. Should I go ahead with FE or follow what is recommended? I have seen that an option, sigmamore, can also be added to the test because it is less likely to produce a non-positive-definite differenced covariance matrix, but how do I check whether I need it?

    Also, does RE model being the best-suited model mean I simply use the ,re output as my result or does it mean there's something in my variables and controls that needs to be changed or considered?

    I am running a regression of total inflow migration on gov spending on healthcare per capita. My other x variables are population density, share of elderly, total tax revenue, GDP per capita and Gini.

    I understand that I am not to include variables that do not vary over time within a country- or is that only for fixed effects? So I'm not sure if my x variables are 'correct'

    I'm sorry these are quite a few questions but any help at all would be GREATLY appreciated. The more that I try to read about panel data models, the more I get myself confused so if someone could explain my specific case that would be amazing. Thank you so so much!!
    Last edited by Amelia Guner; 06 Mar 2022, 08:48.

  • #2
    So there are a few things to say here. FE (ostensibly) soaks up any time-invariant unobserved effects in your given units. As my old methods teacher once said, in his thick Kentucky accent, "There's something that makes Alabam-er, Alabam-er". Thus, graphically, it forces all panels to have the same slope, with the intercepts at different points along the y axis (if I recall correctly). Generally speaking, unless you're doing some experiment, 9 times out of 10, using FE will make sense, but the Hausman test can assist in arbitrating this.

    Random effects, by contrast, assumes that there is no (or negligible) unobserved confounding, i.e., that the unit effects are uncorrelated with your regressors. Graphically, it allows the level and slope of your panels to vary on the y axis. If I'm not mistaken, you can include time-invariant covariates here, but whether that makes sense in your case is a separate question.

    I don't teach this stuff (yet), and I'm sure others may correct me if I don't speak truth here.


    EDIT: Different literatures will refer to these differently. Random-slope models in the mixed effects modeling literature, more common in public health and psychology, oftentimes refer to multilevel models that use both fixed and random effects.

    Also, as Wooldridge will tell us, the model refers to the specific variables you've chosen, and the estimator refers to how you're calculating your coefficients (OLS, WLS, negative binomial, logit).

    But, people sometimes use these interchangeably.
    Last edited by Jared Greathouse; 06 Mar 2022, 09:09.


    • #3
      Random effects, the way Stata means it, are just fixed effects under the assumption that the u_i terms are drawn from a common distribution and thus are pooled towards one another. The re specification will force all the group-level slopes to be the same, just as fixed effects does (unless you explicitly tell it otherwise).

      For context, I am running a panel data analysis in STATA and after conducting a Hausman Test I obtain Prob > chi2 = 0.4504, indicating I should use a random effects model. However, I strongly doubt that my individual effects are uncorrelated with my x variables. Should I go ahead with FE or follow what is recommended?
      If you strongly doubt the individual effects are uncorrelated with your regressors, you can include the group-level means of the regressors as variables and use re, or just go with fe for a more "robust" approach (the fe option is what I suspect most people on this forum would suggest). Failing to reject a null hypothesis is not especially relevant when you have a preconceived notion of the truth; null hypothesis testing usually doesn't give a strong reason to believe any particular version of the world.

      I understand that I am not to include variables that do not vary over time within a country- or is that only for fixed effects? So I'm not sure if my x variables are 'correct'
      If you include variables that don't vary within groups, fe will drop them and re will not, which is another nice thing about random effects.
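
      A minimal sketch of that last point, assuming the nlswork example dataset that ships with Stata rather than the original poster's data (so ln_wage, age, tenure, race, and idcode are stand-ins for her variables and country identifier):

      ```stata
      * Example panel shipped with Stata (not the poster's migration data)
      webuse nlswork, clear
      xtset idcode year

      * race is constant within person, so the within transformation
      * removes it: Stata reports the race indicators as omitted
      xtreg ln_wage age tenure i.race, fe

      * the RE estimator keeps the time-invariant regressor
      xtreg ln_wage age tenure i.race, re
      ```

      Comparing the two outputs should show the race coefficients only in the second table.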


      • #4
        A couple of additional comments:

        1. When you do a Hausman test, what you are doing, in fact, is a test of the hypothesis that the FE and RE regressions give the same results. If the test was "non-significant" that means that the two sets of results are not "significantly" different. So if you take this kind of hypothesis testing seriously, Hausman is telling you that your results will be the same either way.

        2. An important difference between FE and RE is that the parameters being estimated in an FE regression are within-panel parameters. With RE, there is an implicit assumption that within- and between-panel effects are the same, and the results you get are a weighted linear combination of within- and between-panel effects. If your research question focuses on within-panel effects, I would advise using -fe- even when Hausman says you can use -re-.

        3. It is mathematically impossible to get estimates of the effects of time-invariant regressors from FE. If that is a goal of your research, you must use RE, even if Hausman says no. Given that RE models are more subject to omitted variable bias than FE models, and especially if your goals include estimating some within- and some between-panel effects, it may make sense in this situation to use -xthybrid-, by Francisco Perales and Reinhard Schunck, available from SSC, to get separate within- and between-panel effects.

        4. It isn't really correct to say that with FE every panel has the same slope on each variable, and that with RE the slopes vary across panels. You can get FE slopes to vary across panels by including explicit interaction terms in the model. And with RE models, varying slopes are also available as cross-level interactions when you want them, but they are not obligatory. So both FE and RE can handle models with varying slopes and also models with slopes that do not vary across panels.
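
        Points 2 and 4 can be sketched as follows. This is only an illustration under stated assumptions: it uses the nlswork example dataset that ships with Stata, so the variable names and the stored-estimate names (fe_est, be_est, re_est) are hypothetical stand-ins, not the original poster's model.

        ```stata
        * Example panel shipped with Stata (not the poster's data)
        webuse nlswork, clear
        xtset idcode year

        * within (FE), between, and RE estimates of one specification;
        * the RE coefficients typically fall between the other two
        xtreg ln_wage age tenure, fe
        estimates store fe_est
        xtreg ln_wage age tenure, be
        estimates store be_est
        xtreg ln_wage age tenure, re
        estimates store re_est
        estimates table fe_est be_est re_est, b se

        * FE with a tenure slope that differs by (time-invariant) race:
        * the race main effects are absorbed, the interaction is estimated
        xtreg ln_wage age c.tenure##i.race, fe

        * the same varying slope as a cross-level interaction under RE
        xtreg ln_wage age c.tenure##i.race, re
        ```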

        The correct use of terminology, as pointed out by Jared Greathouse, is what Wooldridge recommends. However, the terms model and estimator (or estimation) are often used interchangeably. Over time you will get used to it and be able to infer from context, in nearly all settings, what the intended meaning is.

        If you perceive differences between my views and Jared's, to some extent we are reflecting the views of our disciplines: he is an economist and I am an epidemiologist. The use of FE and RE is rather different in our fields, sometimes contentiously so.

        Added: Crossed with #3, which raises several additional good points.


        • #5
          Jared Greathouse Jackson Monroe Clyde Schechter

          Thank you very very much for your thoughts and suggestions! I feel like I've gotten a much better grasp of what is going on.

          So to summarize, since the result of the Hausman test is essentially that there is no significant difference between the results using both models, I can use either as long as I am able to justify why. If I choose FE then I am doing a within-country analysis, so I don't need to account for the differences between countries and the estimator is hence more 'robust', but it is less efficient. If I use RE then I am getting both within- and between-country results, and it is assumed that the individual country effects aren't correlated with my regressors.

          Am I right in saying the above?

          But a question: I had been working under the assumption that my model would use FE. If I decide to go with RE, would I now need to add controls for differences across countries, for instance corruption or educational level, which do not change significantly over the years within a country but differ across countries? Or would I not need to change anything, since Hausman has indicated that I am able to use RE with what I already have?


          • #6
            Yeah, that's pretty much right. In my opinion, research decisions should be informed more by common sense, as silly as that might sound, than by null hypothesis testing.

            I'm replicating a paper right now with a synthetic control estimator; the original author compared Louisville to 500ish untreated units (and followed suit for the rest of the 100-something treated units). My approach will be to break the U.S. into different regions (using some sort of "official" demarcation) and re-analyze the treated units, only comparing them to other counties in the SAME region. Even if some statistical test said I could use the other 500 untreated units, I wouldn't care, because from a design perspective and from a scientific perspective, encoding expert/sensible knowledge into your design and estimator is almost always better than saying "Null hypothesis test says no underlying heterogeneity" even though there obviously is.


            If I were doing this paper honestly, I would report the results of each model side by side. With RE it's mathematically tractable to use time invariant covariates, but whether it makes sense in your case to do this is another question. It doesn't seem like you need to change much based off your description, barring other design issues.


            • #7
              Amelia:
              as an aside to the previous excellent replies, I would only add that if the -hausman- outcome points you to -re- and you decide to go with -fe- notwithstanding, your sample estimates will be consistent but not efficient, as -fe- uses only within-panel variation for standard errors calculation.
              Kind regards,
              Carlo
              (StataNow 18.5)


              • #8
                We've mentioned allowing the slopes in FE models to vary by an interaction term. This reminds me of factor models which are absolutely ubiquitous in economics, for some reason.

                If we allow the slopes to vary in this manner, do you think this is similar to the idea of an interactive fixed-effects estimator like say, this one proposed by Xu? Carlo Lazzaro


                • #9
                  I'm going to deviate from some of the previous comments. Because fixed effects allows correlation between the unobserved heterogeneity and the covariates and RE does not, FE is the more robust estimator in the sense that it is consistent under weaker assumptions. Therefore, you should use it if the estimates you care about are reasonably precise. Using RE is almost an act of desperation because your FE standard errors are too large (confidence intervals too wide). Without seeing results I can't make a recommendation because I don't know if the RE and FE estimates are practically close or how their precisions compare.

                  The Hausman test is a test of the null hypothesis that the RE estimator is consistent -- that is, the unobserved heterogeneity is uncorrelated with the covariates. The Hausman test need not have very good power against alternatives, which means we tend to use RE too much.

                  Generally, you won't get complaints using FE. You will get objections to RE. But if the RE and FE estimates are practically similar, the RE (robust) standard errors are smaller than the FE (robust) standard errors, and the robust Hausman test fails to reject RE, then you can get away with RE in many cases. If you have a randomized intervention then of course you can use RE. In my recent difference-in-differences work I show that a certain RE estimator is equivalent to FE.

                  As alluded to, you should make your Hausman test robust to serial correlation/heteroskedasticity. I recommend the Mundlak correlated random effects approach. Plus, that gives you the fixed effects estimates and lets you include any time-constant variables.
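
                  A rough sketch of both versions of the test, assuming the nlswork example dataset that ships with Stata in place of the original poster's data. The variable names, the touse marker, the m_* group means, and the stored-estimate names are all hypothetical choices made for this illustration. The sigmamore option asks -hausman- to base both covariance matrices on the disturbance variance from the efficient (RE) estimator, which is why it is less prone to a non-positive-definite difference.

                  ```stata
                  * Example panel shipped with Stata (not the poster's data)
                  webuse nlswork, clear
                  xtset idcode year

                  * classical (non-robust) Hausman test
                  xtreg ln_wage age tenure hours, fe
                  estimates store fe_est
                  * mark the estimation sample for the group means below
                  generate byte touse = e(sample)
                  xtreg ln_wage age tenure hours, re
                  estimates store re_est
                  hausman fe_est re_est, sigmamore

                  * Mundlak / correlated random effects version:
                  * add person-level means of the time-varying regressors
                  foreach v of varlist age tenure hours {
                      bysort idcode: egen double m_`v' = mean(`v') if touse
                  }
                  xtreg ln_wage age tenure hours m_age m_tenure m_hours if touse, re vce(cluster idcode)

                  * cluster-robust Wald test that the means are jointly zero;
                  * rejection is evidence against the RE assumption
                  test m_age m_tenure m_hours
                  ```

                  In the last regression, the coefficients on age, tenure, and hours should come out close to the FE estimates, and time-constant variables could be added to the same command.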

                  In my view, there is one model: an unobserved effects model. There are different estimators: pooled OLS, random effects (a feasible GLS estimator), fixed effects or within, and first difference. All can be applied to the same model.
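
                  Written out, with notation that is assumed here rather than taken from the thread ($c_i$ for the unobserved effect, $u_{it}$ for the idiosyncratic error), the one model and the transformations behind two of the estimators look like this:

                  ```latex
                  \begin{align*}
                  % the unobserved effects model
                  y_{it} &= x_{it}\beta + c_i + u_{it}, \qquad t = 1,\dots,T \\
                  % fixed effects (within): subtract unit means, which removes c_i
                  y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)\beta + (u_{it} - \bar{u}_i) \\
                  % first difference: also removes c_i
                  \Delta y_{it} &= \Delta x_{it}\beta + \Delta u_{it}
                  \end{align*}
                  ```

                  Pooled OLS and random effects leave $c_i$ in the error term, so they additionally need $c_i$ to be uncorrelated with $x_{it}$, which is the condition the Hausman test examines.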

                  Allowing different slopes in an FE environment typically requires a substantial number of time periods. And it often doesn't change the estimate of the average effect. Heterogeneous slopes need not cause bias in estimates of the average effects.


                  • #10
                    Jared:
                    I do not know the estimator you mentioned (BTW: thanks for sharing the reference); hence, I cannot say.
                    Kind regards,
                    Carlo
                    (StataNow 18.5)


                    • #11
                      Okay. You're most welcome! Carlo Lazzaro

                      Heterogeneous slopes (which Xu seems to describe in his interactive fixed-effects SCM paper above) needing multiple pre-intervention periods seems to be common. Why might this be?

                      I've used this sort of estimator before, and it really didn't matter regarding the ATT, but I couldn't intuit why one needed many time periods and the other didn't need as many. Jeff Wooldridge
