Changing effect directions in fixed-effects estimation compared to OLS.

Constantin Domizlaff

Join Date: Jun 2021

Posts: 21
#1

25 Feb 2024, 07:24

Dear all,

I am in the following situation: I am running a panel model investigating the impact of board determinants on corporate carbon emissions. I have tested for issues such as heteroscedasticity and multicollinearity, since I first want to run a regular OLS regression using -reg-, followed by a fixed-effects regression using -xtreg, fe- (a Hausman test has been performed to pick FE over RE). I have detected problems of autocorrelation and heteroscedasticity, and thus I am using standard errors clustered at the entity level, -vce(cluster ID)-, for both the OLS and FE estimation. Additionally, I have lagged the independent and control variables by one period (year). Now to my question:

Some of the coefficients change their effect direction (negative in OLS and then positive in FE and vice versa) when comparing the OLS results to the FE results. I expected changing significance levels but I am not too familiar with changing effect directions. I am aware that omitted variables could be a problem here. Are there any other factors that could be a reason for these changing effect directions?

Thanks a lot in advance!
Clyde Schechter

Join Date: Apr 2014

Posts: 29962
#2

25 Feb 2024, 10:52

Are there any other factors that could be a reason for these changing effect directions?

Yes. It is a commonly held but mistaken belief that fixed-effects regression and random-effects or OLS regression are alternative ways of doing the same thing. They are not. When you do a fixed-effects analysis, you are estimating the within-panel effects of the right-hand-side variables. When you do random-effects or OLS regression, you get a weighted average of the within-panel and between-panel effects. Sometimes the within and between effects of a variable are different. (I won't go into the details, but in the event that the Hausman test says you don't need to use fixed effects, it is always the case that the within and between effects are the same, or very nearly so.) In common-sense terms, I think it is widely understood that the effects of getting married are not always the same as the effects of being married. In data terms, you can run:

Code:

clear
set obs 5
gen panel_id = _n
expand 2

set seed 1234
by panel_id, sort: gen y = 4*panel_id - _n + 3 + rnormal(0, 0.5)
by panel_id: gen x = panel_id + _n

xtset panel_id
xtreg y x, fe
regress y x

// GRAPH THE DATA TO SHOW WHAT'S HAPPENING
separate y, by(panel_id)
graph twoway connect y? x || lfit y x
browse

to see an example where the within and between panel effects go in opposite directions.

The example is, of course, artificial and oversimplified but it demonstrates the point that within and between effects need not be the same, nor even vaguely similar, and it shows how the fixed effects model picks up the within effect, but the OLS regression picks up something else.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17676
#3

25 Feb 2024, 11:08

Constantin:
as an aside to Clyde's excellent reply:
1) why start off with -regress- if you have a panel dataset? In addition, if you have at least 30 panels, you should go -vce(cluster panelid)-; otherwise -regress- will treat your observations as independent, whereas they are not, due to the panel structure of your dataset;
2) you wisely imposed non-default standard errors, but they are not supported by -hausman-;
3) given the above, it is not clear how you compared -regress- with -xtreg, fe-.

Kind regards,
Carlo
(Stata 19.0)
Constantin Domizlaff

Join Date: Jun 2021

Posts: 21
#4

25 Feb 2024, 12:08

Thank you Clyde and Carlo for your great replies.

@Carlo: Just to clarify, I am using

Code:

regress y x, vce(cluster ID)

to avoid the problem you mentioned in your first point.
Regarding your second point, I know that I can't use the -vce(cluster ID)- standard errors with -hausman-, so I just did without them. Is the result of the Hausman test still valid, or should I use another way of deciding between fixed and random effects?
Regarding your third point, I just compared the outputs to each other and wondered why some effect directions were completely different. But I think Clyde's answer provides an appropriate answer to that question.

@Clyde: Thanks for the answer, that was very helpful. As I understand it, FE provides the entity-specific effects due to the focus on within-entity variation, and OLS provides an average across-entity effect. Is that correct? And is it then sufficient to say that, due to the results of -xttest0- and -hausman- and the fact that heteroscedasticity and autocorrelation are present in the data (OLS assumptions not fulfilled), I would prefer FE over RE and OLS?

Thanks a lot for answering my questions! Best regards!
Clyde Schechter

Join Date: Apr 2014

Posts: 29962
#5

25 Feb 2024, 14:29

is it correct that FE provides the entity-specific effects due to the focus on within-entity variation and the OLS provides an average across-entity effect?

No, this is not right. FE does not provide entity-specific effects. Entity-specific effects would mean a separate regression slope estimated for each panel--but that is not what FE does. FE provides a single regression slope that characterizes the within-panel effect. That is to say, a regression coefficient from an FE analysis is interpretable as the expected difference in outcome associated with a unit difference within the same panel in the predictor variable. It is also not quite correct to say that OLS provides an average across-entity effect. The estimate provided by OLS is actually a weighted average that includes both across-entity and within-entity effects. To get a pure across-entity effect estimate you have to collapse the data to the mean values of all the variables by panel, and then regress the mean outcome against the mean predictor variables. (The little used -xtreg, be- does this.)
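To make the last point concrete, here is a minimal sketch that reuses the toy data from #2 (chosen only for illustration; it has nothing to do with the original poster's data). It gets the between slope from -xtreg, be-, then collapses the data to panel means and runs plain OLS on those means. In a balanced panel like this one, the two slopes should coincide. The scalar names b_be and b_means are just labels picked for this example.

```stata
* Toy data as in #2: 5 panels, 2 observations each
clear
set obs 5
gen panel_id = _n
expand 2
set seed 1234
by panel_id, sort: gen y = 4*panel_id - _n + 3 + rnormal(0, 0.5)
by panel_id: gen x = panel_id + _n
xtset panel_id

* Between estimator as implemented by -xtreg, be-
xtreg y x, be
scalar b_be = _b[x]

* The same thing by hand: collapse to panel means, then plain OLS
collapse (mean) y x, by(panel_id)
regress y x
scalar b_means = _b[x]

display "xtreg, be slope: " b_be "    OLS on panel means: " b_means
```

Note that -collapse- replaces the data in memory, so wrap it in -preserve- and -restore- if you want to keep working with the original observations.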

And is it then sufficient to say that, due to the results of -xttest0- and -hausman- and the fact that heteroscedasticity and autocorrelation are present in the data (OLS assumptions not fulfilled), I would prefer FE over RE and OLS?

Heteroscedasticity and autocorrelation have nothing to do with the choice between FE and RE. They do dictate the choice of -vce()-. As for -xttest0- and -hausman-, this is one of my pet peeves. You will frequently see people say that the choice between FE and RE is dictated by the results of -hausman- (or -xtoverid- or some other test). This is plain nonsense.

Research is not about the mechanical application of tests and procedures to data. Research is goal directed: all research projects should begin with a stated research question that is to be answered.

If the research question of your project addresses itself to within-panel effects, then -fe- is usually the way to go here. Now, if you run the Hausman (or similar) test in this context and Hausman says that RE is OK, it will always be the case that that test has determined that the within-panel and between-panel effects are the same, or nearly so. And in that case, the RE analysis will produce essentially the same coefficients as FE, and it will be more efficient, in the sense of smaller standard errors. So in that case the use of RE is defensible, and perhaps preferable.

If the research question of your project addresses itself to between-panel effects, then you cannot use the FE estimator regardless of what Hausman or any other test says. The FE estimator is incapable of estimating between-panel effects, and if those effects differ from the within-panel effects, the FE estimator will give you consistent answers to the wrong question. You must use OLS or RE (or BE).

Sometimes you need to address both the within- and between-panel effects in your research. If a Hausman (or similar) test says RE is OK, then you also have the (unspoken) assurance that the within- and between-panel effects in your data are actually the same, or nearly so. And in that case RE gives the most efficient estimate, but the results will be little different from FE's. But if you need both within- and between-panel effects and Hausman says FE, probably your best bet is the Mundlak correlated random effects estimator. It is easy to code in Stata. Or you can use the -xthybrid- command, available from SSC, for the purpose. This estimator gives, for each of your predictor variables, an estimate of the within-panel effect and a separate estimate of the between-panel effect.
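A minimal hand-coded sketch of that idea, again on the toy data from #2, might look like the following. It uses the within-between ("hybrid") parameterization: x is split into its panel mean and the deviation from that mean, and both pieces go into a random-effects model. The variable names x_mean and x_dev and the scalar names are assumptions made up for this illustration, and this is not meant to reproduce what -xthybrid- does internally. In real data you would add your other covariates (split the same way) and an appropriate -vce()-.

```stata
* Toy data as in #2
clear
set obs 5
gen panel_id = _n
expand 2
set seed 1234
by panel_id, sort: gen y = 4*panel_id - _n + 3 + rnormal(0, 0.5)
by panel_id: gen x = panel_id + _n
xtset panel_id

* Reference: the within (fixed-effects) slope
xtreg y x, fe
scalar b_fe = _b[x]

* Split x into its panel mean and the deviation from that mean
bysort panel_id: egen x_mean = mean(x)
gen x_dev = x - x_mean

* Random-effects model on both pieces
xtreg y x_dev x_mean, re
scalar b_within  = _b[x_dev]    // within-panel effect
scalar b_between = _b[x_mean]   // between-panel effect

display "FE slope: " b_fe "   within: " b_within "   between: " b_between
```

The coefficient on x_dev should reproduce the -xtreg, fe- slope, while the coefficient on x_mean is the between-panel slope, so one model reports both.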
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17676
#6

26 Feb 2024, 01:05

Constantin:
1) I surmised that you used -regress- with non-default standard errors when dealing with an N>T panel dataset. My caveat was more about the number of panels in your dataset;
2) testing for model nuisances (such as autocorrelation) after -hausman- is not the preferred approach. You should check whether your regression suffers from some issues beforehand;
3) I agree with Clyde that -hausman- (as well as the community-contributed module -xtoverid-) are econometricians' "walking canes" and their outcome should not be taken as written in stone (but see paragraphs 3.4.1 and 3.4.2 in Hsiao C, Analysis of panel data. 3rd edition. New York: Cambridge, 2014). However, some reviewers may be more theory-driven in their scoring (if you plan to submit your paper to a technical journal): when -fe- is the way to go, -re- is inconsistent, whereas when -re- is the best option, -fe- is inefficient. Both estimators have their own drawbacks (e.g., the main -re- assumption of no correlation between ui and the vector of predictors rarely holds; you can test it via the Mundlak approach);
4) as Clyde says, when you're interested in a time-invariant predictor but -fe- is not the way to go, the Mundlak approach comes in handy.
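A rough sketch of the Mundlak test mentioned in point 3, on simulated data (everything below, including the variable names id, year, u, x, y, x_mean and the data-generating process, is made up for illustration). The panel effect u is deliberately built to be correlated with x, so the -re- assumption fails by construction. The panel mean of x is added to an -re- model with cluster-robust standard errors, and the test is whether its coefficient is zero.

```stata
* Simulated panel: 100 panels, 5 years, panel effect u correlated with x
clear
set seed 2024
set obs 100
gen long id = _n
gen double u = rnormal()
expand 5
bysort id: gen int year = _n
gen double x = u + rnormal()
gen double y = 1 + 2*x + 3*u + rnormal()
xtset id year

* Reference: fixed-effects slope
xtreg y x, fe
scalar b_fe = _b[x]

* Mundlak: add the panel mean of x to the random-effects model
bysort id: egen double x_mean = mean(x)
xtreg y x x_mean, re vce(cluster id)
scalar b_cre = _b[x]

* H0: coefficient on x_mean is zero (no correlation between u and x)
test x_mean
scalar p_mundlak = r(p)

display "FE slope: " b_fe "   Mundlak slope on x: " b_cre "   p-value: " p_mundlak
```

Rejecting the null points toward -fe- (or keeping the panel means in the model). Unlike -hausman-, this works with clustered standard errors. With several predictors, add the panel mean of each time-varying one and test them jointly.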

Last edited by Carlo Lazzaro; 26 Feb 2024, 01:08.

Kind regards,
Carlo
(Stata 19.0)
Constantin Domizlaff

Join Date: Jun 2021

Posts: 21
#7

26 Feb 2024, 01:30

Thanks a lot for the clarification, Clyde and Carlo. I sometimes read in scientific papers something along the lines of "...a Hausman test has been used to determine that FE is preferred over RE...", which is probably why I was stuck with the common belief that you mentioned.

Your explanation makes a lot of sense to me now, and from what I can gather, the within-panel effect seems to be the one of interest for my type of research, since I am investigating a panel of 500 firms and the impact that board of director characteristics have on carbon performance. Since I use firm ID as the panel variable, a unit difference within the same panel seems to be the more interesting effect for me. Hence, I would probably pick FE as the appropriate method. My research question will be something along the lines of "Do certain board characteristics impact carbon performance on a corporate level?". Additionally, I am not especially interested in time-invariant predictors.

Thank you a lot for the detailed clarifications and patience with my questions! Best regards!

Last edited by Constantin Domizlaff; 26 Feb 2024, 01:34.