OLS v. Fixed Effects results reliability

Constantin Domizlaff

Join Date: Jun 2021
Posts: 21

09 Mar 2024, 10:14

I have run into something that has made me curious:

I am running a panel regression on the impact of board committee characteristics on carbon performance. I was advised to first run an OLS regression to get an overview before doing a fixed/random effects regression, so I did. I then chose the FE model over the RE model.

The results are the following: For the OLS model I get an extremely high adjusted R-squared of ~75%. A lot of this is due to the industry fixed effects in my model; without them the adjusted R-squared drops to ~42%. Also, in the OLS, around half of my predictor variables are significant.
On the other hand, in the FE model, where the industry fixed effects are omitted because they are time-invariant, I only get an adjusted R-squared of ~22% and fewer of my variables are statistically significant.

Here are the OLS results:

Code:

. reg EMTOTAL NOMCOMM NOMCOMM_IND COMPCOMM COMPCOMM_IND AUDCOMM AUDCOMM_IND GOVCOMM ATT SUSCOMM BSIZE BGD INC INDEP DUAL ROA LEV FSIZE MULT SKILLS i.YEAR i.INDUSTRY, vce(cluster ID)

Linear regression                               Number of obs     =      2,546
                                                F(75, 388)        =          .
                                                Prob > F          =          .
                                                R-squared         =     0.7579
                                                Root MSE          =     1.1201

                                   (Std. err. adjusted for 389 clusters in ID)
------------------------------------------------------------------------------
             |               Robust
     EMTOTAL | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     NOMCOMM |    .501477   .5371908     0.93   0.351    -.5546921    1.557646
 NOMCOMM_IND |  -.0049362   .2602447    -0.02   0.985    -.5166026    .5067301
    COMPCOMM |  -.1953639   .2510876    -0.78   0.437    -.6890264    .2982987
COMPCOMM_IND |   1.507128    .893569     1.69   0.092    -.2497146    3.263972
     AUDCOMM |  -.4016043   1.053488    -0.38   0.703    -2.472865    1.669656
 AUDCOMM_IND |  -2.396646    .913957    -2.62   0.009    -4.193574   -.5997178
     GOVCOMM |   .2469164   .2710377     0.91   0.363    -.2859699    .7798027
         ATT |   .0151676   .0049427     3.07   0.002     .0054498    .0248855
     SUSCOMM |   .2270284   .1260847     1.80   0.073    -.0208663    .4749231
       BSIZE |  -.1703795   .3331368    -0.51   0.609    -.8253587    .4845997
         BGD |  -.0110272   .0070541    -1.56   0.119    -.0248962    .0028419
         INC |   .2015798   .0960159     2.10   0.036     .0128031    .3903564
       INDEP |   .0002575   .0068561     0.04   0.970    -.0132223    .0137373
        DUAL |   -.009173   .1303307    -0.07   0.944    -.2654158    .2470697
         ROA |  -.1805006   .0730591    -2.47   0.014    -.3241418   -.0368593
         LEV |   .3685102    .296991     1.24   0.215    -.2154028    .9524232
       FSIZE |   .7882242   .0731855    10.77   0.000     .6443344    .9321139
        MULT |  -.0323264   .1311898    -0.25   0.805    -.2902582    .2256055
      SKILLS |  -.0026919   .0026403    -1.02   0.309    -.0078829    .0024992

These are the FE results:

Code:

. xtreg EMTOTAL NOMCOMM NOMCOMM_IND COMPCOMM COMPCOMM_IND AUDCOMM AUDCOMM_IND GOVCOMM ATT SUSCOMM BSIZE BGD INC INDEP DUAL ROA LEV FSIZE MULT SKILLS i.YEAR, fe vce(cluster ID)

Fixed-effects (within) regression               Number of obs     =      2,549
Group variable: ID                              Number of groups  =        390

R-squared:                                      Obs per group:
     Within  = 0.2249                                         min =          1
     Between = 0.3603                                         avg =        6.5
     Overall = 0.3284                                         max =         13

                                                F(30, 389)        =          .
corr(u_i, Xb) = 0.3480                          Prob > F          =          .

                                   (Std. err. adjusted for 390 clusters in ID)
------------------------------------------------------------------------------
             |               Robust
     EMTOTAL | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     NOMCOMM |   .2934868   .2109382     1.39   0.165    -.1212349    .7082085
 NOMCOMM_IND |  -.1267084   .0921178    -1.38   0.170    -.3078194    .0544026
    COMPCOMM |   .0854662   .1225279     0.70   0.486    -.1554336    .3263661
COMPCOMM_IND |   .9833365   .4859865     2.02   0.044     .0278475    1.938825
     AUDCOMM |  -.1399784   .4012153    -0.35   0.727    -.9288003    .6488434
 AUDCOMM_IND |   -1.16394   .2971362    -3.92   0.000    -1.748134   -.5797461
     GOVCOMM |   .1810519   .1828703     0.99   0.323    -.1784858    .5405896
         ATT |  -.0001274   .0017421    -0.07   0.942    -.0035525    .0032978
     SUSCOMM |   .0207709   .0453892     0.46   0.647    -.0684679    .1100098
       BSIZE |  -.0695785   .1128549    -0.62   0.538    -.2914604    .1523034
         BGD |  -.0004053   .0023553    -0.17   0.863     -.005036    .0042255
         INC |   .0135879   .0255077     0.53   0.595    -.0365623    .0637381
       INDEP |   -.004724   .0034665    -1.36   0.174    -.0115395    .0020914
        DUAL |  -.0351191   .0541888    -0.65   0.517    -.1416586    .0714204
         ROA |  -.0077768   .0202317    -0.38   0.701     -.047554    .0320004
         LEV |  -.2681079   .2017146    -1.33   0.185    -.6646951    .1284792
       FSIZE |   .5235958   .0943601     5.55   0.000     .3380761    .7091155
        MULT |   .2038137    .089293     2.28   0.023     .0282564    .3793711
      SKILLS |   .0004215   .0011727     0.36   0.719    -.0018842    .0027272

What I am curious about is which results can be considered more reliable/useful in this case. I have also plotted the residuals of both models against the dependent variable to visually assess which model does a better job of predicting it; the OLS seems to do so. I am also aware that OLS and FE do not measure exactly the same thing (within variation reported by FE while OLS ...), but I would still like to hear opinions on these results, as I have relatively little experience overall.

Thanks a lot in advance!

Last edited by Constantin Domizlaff; 09 Mar 2024, 10:15. Reason: OLS

Tags: fixed effects, OLS, panel data

Clyde Schechter

Join Date: Apr 2014

Posts: 29794
#2

09 Mar 2024, 10:55

I am also aware that OLS and FE do not measure exactly the same thing (within variation reported by FE while OLS

This is the key, except you have understated it here. Some of the coefficient differences between FE and OLS here are huge, like an order of magnitude! And a few are of opposite signs. You have data in which the within and between ID effects are very, very different.

The FE model gives you consistent estimates of the within-ID effects of these variables. The OLS model estimates a weighted average of the within and between effects. What you need to do is think clearly about your research question. Does it ask about the within effects? Or does it ask about the between effects? Or both? If the research question is about the within-ID effects, then the OLS model is irrelevant.

If it asks specifically about the between effects, then the OLS model gives an approximation to those, but given how different the within and between effects are, a weighted average of the two might not be a very good approximation to the pure between effects. So if you need the between-ID effects (either alone or together with the within effects), then I suggest running a Mundlak correlated random effects model. The simplest way to do that is with the -xthybrid- command, available from SSC.
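For readers who want to see what the Mundlak approach does, here is a rough manual sketch using only official Stata commands rather than -xthybrid-. It assumes the dataset from #1 is in memory with ID as the firm identifier and YEAR as the time variable. The m_ prefix and the nmiss variable are hypothetical names chosen for illustration, not anything from the original post.

```stata
* Assumes the poster's panel is in memory; ID and YEAR come from the commands in #1
xtset ID YEAR

* Regressors copied from the FE command in #1
local xvars NOMCOMM NOMCOMM_IND COMPCOMM COMPCOMM_IND AUDCOMM AUDCOMM_IND ///
    GOVCOMM ATT SUSCOMM BSIZE BGD INC INDEP DUAL ROA LEV FSIZE MULT SKILLS

* Count missing values per observation so the firm means use the estimation sample
egen nmiss = rowmiss(EMTOTAL `xvars' INDUSTRY YEAR)

* Firm-level mean of each regressor (m_ prefix is a hypothetical naming choice)
local means
foreach v of local xvars {
    bysort ID: egen m_`v' = mean(`v') if nmiss == 0
    local means `means' m_`v'
}

* Random-effects regression with the firm means added (Mundlak device)
xtreg EMTOTAL `xvars' `means' i.YEAR i.INDUSTRY if nmiss == 0, re vce(cluster ID)

* Joint test of the firm means: rejection indicates within and between effects differ
test `means'
```

In this setup the coefficients on the original variables should be close to the FE within estimates, and each m_ coefficient estimates the difference between the between and within effects of that variable. Because the panel is unbalanced (1 to 13 observations per firm), the match with the FE output will only be approximate unless firm means of the year dummies are added as well.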
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2081
#3

09 Mar 2024, 11:47

Other fields view things differently. The key is causality. FE allows the firm heterogeneity to be arbitrarily correlated with the x's, and that's why it's generally preferred to POLS. I wouldn't do any comparison, though, without time effects. A full set of year dummies is almost always included in these kinds of applications.

How many industries do you have? I can point you to a recent paper of mine that can be used to test whether it is sufficient to use industry FEs versus firm FEs. Some of the large differences in the POLS versus FE estimates are on insignificant variables. The differences may be even smaller if you control for time effects.

Here's a link to the paper:

Papke-Wooldridge
Constantin Domizlaff

Join Date: Jun 2021

Posts: 21
#4

09 Mar 2024, 12:41

Thank you for the replies, Clyde and Prof. Wooldridge.

@Prof. Wooldridge thank you for the link to the paper.

Some of the large differences in the POLS versus FE estimates are on insignificant variables. The differences may be even smaller if you control for time effects.

I have included time fixed effects dummies in both the OLS and the FE regressions (i.YEAR), as seen above. Does this make a comparison meaningful, or should I, based on Clyde's answer, report only the FE results? My research question will be something along the lines of "Do Board Committee characteristics XYZ have a significant impact on corporate carbon performance?", which indicates an interest in the within-panel effects, if I am not mistaken.

How many industries do you have?

In terms of industries, there are a lot, approx. 60 different industries.

Thanks a lot in advance!
Constantin Domizlaff

Join Date: Jun 2021

Posts: 21
#5

10 Mar 2024, 13:25

Hey everybody, I really don't want to bother anyone, but it would be great to get a closing thought from someone on my last comment. Thanks a lot!
Clyde Schechter

Join Date: Apr 2014

Posts: 29794
#6

10 Mar 2024, 13:40

given my research question will be something of the likes "Do Board Committee characteristics XYZ have a significant impact on corporate carbon performance?"

is, in my opinion, not precise enough to give you guidance.

It could mean: do those firms whose board committees happen to have characteristics XYZ have different levels of carbon performance than those without characteristics XYZ?

Or it could mean: does a firm whose board committee changes to gain characteristics XYZ have different levels of carbon performance than it did when it didn't have those characteristics?

The former is a between-firms question and the latter is a within-firms question. They may well have different answers, and, in any case, getting the right answer to the wrong question does not advance your goals. From the econometrician's perspective, the latter question would be closer to causal than the former, which is more about correlation than causation--no disagreement across fields about that. From my epidemiological perspective, I would say that even with a two-way fixed effects (TWFE) model, this is still observational data and although TWFE does eliminate much of the concern about confounding by unmeasured variables, it still has gaps and may or may not truly give a causal effect estimate, as the change in characteristics itself may be endogenous in ways that are not overcome by TWFE. And, again, not knowing what your research goals are, it may be that the non-causal between-firms effects are the ones you need. That's a decision you need to make by clarifying exactly what your research question is.
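As a practical aid to that decision, one can look at how much each variable actually varies within firms versus between firms. The lines below are only a sketch: they assume the dataset from #1 is in memory, and the handful of variables listed is an arbitrary selection from the poster's regressors.

```stata
* Assumes the poster's panel is in memory with ID (firm) and YEAR (time)
xtset ID YEAR

* xtsum reports overall, between-firm, and within-firm standard deviations;
* a small within SD means the FE estimator has little variation to work with
xtsum AUDCOMM_IND COMPCOMM_IND ATT SUSCOMM FSIZE
```

If a board characteristic hardly changes over time within a firm, a within-firm question about it will be hard to answer with any precision from these data.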
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2081
#7

10 Mar 2024, 15:30

Again, to me it’s about causality. Using OLS is more likely to lead to spurious causality. If you settle on POLS then you could just as well use a single cross section. The choices of board members could easily be related to carbon performance. In my view, your only hope for determining causality is using FE — and even that’s in doubt.
Constantin Domizlaff

Join Date: Jun 2021

Posts: 21
#8

11 Mar 2024, 04:14

@Clyde, my research question is indeed the latter of the two you mentioned.

I am also aware of endogeneity concerns, such as the possibility that the choice of committees/board members is related to carbon performance. To address this, I have thought about also running a GMM model after the FE.

Your answers have helped me a lot and will surely help me in my further research process. Thanks again!