Possibly Rather Large Mistake; Factor Variables and Multiple Linear Regression

Ben Cunningham

Join Date: Apr 2019

Posts: 14
#1

27 Apr 2019, 13:58

Hello, so I thought I was doing a multiple linear regression with the command

Code:

reg realgvaperh lifesatisfaction happy anxiety i.regionNum

I thought this was a multiple linear regression, and I was reading the adjusted R-squared values, F tests, t tests of coefficients etc. as if it were. I think it turns out that that is not what I've been doing, and therefore my entire dissertation is based on this wrong assumption. Is my assumption really that wrong? How hard am I going to fail my dissertation?

My thinking was that the multiple linear regression did lots of mini regressions in my panel by basically doing a regression on each bit of regional data (see picture attached below), instead of just one massive blob of data. If that's not what I've done, is there a way to do this, and what is it called?

Picture : https://i.imgur.com/gRaqPiK.png

Thanks and please help me,
An incredibly stressed, and probably very stupid, student

EDIT : So what I THOUGHT I was doing was fixed effects, so I guess I'll just drink loads of coffee and hardcore edit my dissertation over the next day.

Last edited by Ben Cunningham; 27 Apr 2019, 14:05.
Dick Campbell

Join Date: Apr 2014

Posts: 279
#2

27 Apr 2019, 14:32

It would be easier to help you if you read the FAQ on posting, particularly #12. But, to begin, yes, your model certainly is a multiple regression. What makes you think it is not? Your model says that the association of each of the right side variables with realgvaperh is the same within each region. Put differently but equivalently, your model says that the regional differences you estimate are the same regardless of the values of happy, anxiety etc. Models like this are variously referred to as "additive" or "main effects." That is, you are assuming no interactions.
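
To make the "no interactions" point concrete, here is a minimal sketch that contrasts the additive model with one that lets a slope differ by region. The variable names come from the original post; choosing happy as the interacted variable is only an illustration, not a recommendation for this dataset.

```stata
* Additive (main-effects) model from the original post:
* one common slope per regressor, a separate intercept per region
regress realgvaperh lifesatisfaction happy anxiety i.regionNum

* Illustrative alternative (an assumption, not the poster's model):
* allow the slope of happy to differ by region
regress realgvaperh lifesatisfaction anxiety i.regionNum##c.happy

* Joint test of the region-by-happy interaction terms
testparm i.regionNum#c.happy
```

If the joint test gives little evidence of interaction, the additive model is a reasonable description. Interacting every regressor with region would be the closest single-model equivalent of running a separate regression in each region.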

Now, I don't know exactly how you got the graph you attached. If you did it properly, the lines for each region should be perfectly parallel given your model. You will notice that they are not. In order to get a proper graph you need to use the margins and marginsplot commands. Rich Williams has a very nice explanation of these commands. See https://www3.nd.edu/~rwilliam/stats/Margins01.pdf
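
A rough sketch of how such a graph might be produced with margins and marginsplot follows. The grid of values for happy, 6 to 9 in steps of 0.5, is a placeholder assumption; substitute values that fall within the observed range in your data.

```stata
* Fit the additive model from the original post
regress realgvaperh lifesatisfaction happy anxiety i.regionNum

* Predicted realgvaperh for each region over a grid of happy values
* (the 6-to-9 grid is a placeholder; check the range with: summarize happy)
margins regionNum, at(happy = (6(0.5)9))

* One line per region; noci suppresses the confidence intervals
marginsplot, noci
```

Because the model has no interaction terms, the plotted lines should come out parallel, differing only in their intercepts.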

You need to invest some effort in understanding exactly what is implied by the model, that is, how you interpret the coefficients, and you need to figure out how to get a proper graph.

Richard T. Campbell
Emeritus Professor of Biostatistics and Sociology
University of Illinois at Chicago
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

27 Apr 2019, 14:53

Just to tie up a loose end in your question

So what I THOUGHT I was doing was fixed effects

and in fact you have estimated fixed effects for your regions.

I see from your previous post that you

have some panel data, investment in different regions over the course of a 7 years

which suggests you should have been looking at the Stata Longitudinal-Data/Panel-Data Reference Manual PDF included in your Stata installation and accessible through Stata's Help menu. What you perhaps wanted was

Code:

xtset regionNum
xtreg realgvaperh lifesatisfaction happy anxiety, fe

but - other than the explicit inclusion of the phrase "fixed effects" in the description of the model - the results should be the same as your pooled OLS with the region indicators.
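
One way to check this yourself is to store both sets of estimates and tabulate them side by side. This is a sketch using the variable names from the thread; the stored-estimate names lsdv and fe are arbitrary labels chosen for the example.

```stata
* Pooled OLS with region indicators
regress realgvaperh lifesatisfaction happy anxiety i.regionNum
estimates store lsdv

* Fixed-effects (within) estimator, with regionNum as the panel variable
xtset regionNum
xtreg realgvaperh lifesatisfaction happy anxiety, fe
estimates store fe

* Compare coefficients and standard errors for the three regressors
estimates table lsdv fe, keep(lifesatisfaction happy anxiety) b se
```

The slope coefficients should agree across the two columns. The R-squared values reported by the two commands are not directly comparable, since xtreg reports within, between, and overall versions.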