  • Possibly Rather Large Mistake; Factor Variables and Multiple Linear Regression

    Hello, so I thought I was doing a multiple linear regression with the command

    Code:
    reg realgvaperh lifesatisfaction happy anxiety i.regionNum
    I thought this was a multiple linear regression, and I was reading the adjusted R-squared values, F-tests, t-tests of coefficients, etc. as if it were. I think it turns out that this is not what I've been doing, and therefore my entire dissertation is based on this wrong assumption. Is my assumption really that wrong? How hard am I going to fail my dissertation?

    My thinking was that the multiple linear regression did lots of mini regressions in my panel by basically doing a regression on each bit of regional data (see picture attached below), instead of just one massive blob of data. If that's not what I've done, is there a way to do this, and what is it called?

    Picture : https://i.imgur.com/gRaqPiK.png

    Thanks and please help me,
    An incredibly stressed, and probably very stupid, student

    EDIT : So what I THOUGHT I was doing was fixed effects, so I guess I'll just drink loads of coffee and hardcore edit my dissertation over the next day.
    Last edited by Ben Cunningham; 27 Apr 2019, 14:05.

  • #2
    It would be easier to help you if you read the FAQ on posting, particularly #12. But, to begin: yes, your model certainly is a multiple regression. What makes you think it is not? Your model says that the association of each of the right-side variables with realgvaperh is the same within each region. Put differently but equivalently, your model says that the regional differences you estimate are the same regardless of the values of happy, anxiety, etc. Models like this are variously referred to as "additive" or "main effects." That is, you are assuming no interactions.
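
    As a rough sketch of what relaxing the no-interaction assumption could look like, the block below fits a model in which the slope of one predictor is allowed to differ across groups, then tests whether those slope differences are jointly zero. It uses the Grunfeld panel available through webuse as a stand-in, so invest, mvalue, kstock, and company are assumptions standing in for realgvaperh, the wellbeing measures, and regionNum.

    ```stata
    * Stand-in data (assumption): Grunfeld investment panel, not the poster's dataset
    webuse grunfeld, clear

    * ## adds the main effects plus the mvalue-by-company interaction,
    * so each company gets its own slope on mvalue
    regress invest c.mvalue##i.company kstock

    * Joint test that all the slope differences are zero,
    * i.e. that the additive (parallel-lines) model is adequate
    testparm c.mvalue#i.company
    ```

    If the joint test gives little evidence of differing slopes, the additive model is the simpler description; otherwise the group-specific slopes are worth reporting.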

    Now, I don't know exactly how you got the graph you attached. If you did it properly, the lines for each region should be perfectly parallel given your model. You will notice that they are not. In order to get a proper graph you need to use the margins and marginsplot commands. Rich Williams has a very nice explanation of these commands. See https://www3.nd.edu/~rwilliam/stats/Margins01.pdf
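
    A minimal sketch of the margins and marginsplot workflow for an additive model follows. It again assumes the Grunfeld stand-in data (invest, mvalue, kstock, and company are not the poster's variables), and the at() grid is an arbitrary choice for illustration.

    ```stata
    * Stand-in data (assumption): Grunfeld investment panel
    webuse grunfeld, clear

    * Additive model: common slopes, one intercept shift per company
    regress invest mvalue kstock i.company

    * Predicted invest for each company over a grid of mvalue,
    * averaging over the observed values of kstock
    margins company, at(mvalue=(500(1000)4500))

    * One line per company with mvalue on the x-axis; noci drops the CI bars
    marginsplot, xdimension(mvalue) noci
    ```

    Because the model has no interaction terms, the plotted lines differ only by a vertical shift, which is the parallel pattern described above.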

    You need to invest some effort in understanding exactly what is implied by the model, that is, how you interpret the coefficients, and you need to figure out how to get a proper graph.
    Richard T. Campbell
    Emeritus Professor of Biostatistics and Sociology
    University of Illinois at Chicago


    • #3
      Just to tie up a loose end in your question

      "So what I THOUGHT I was doing was fixed effects"
      and in fact you have estimated fixed effects for your regions.

      I see from your previous post that you

      "have some panel data, investment in different regions over the course of a 7 years"
      which suggests you should have been looking at the Stata Longitudinal-Data/Panel-Data Reference Manual PDF included in your Stata installation and accessible through Stata's Help menu. What you perhaps wanted was
      Code:
      xtset regionNum
      xtreg realgvaperh lifesatisfaction happy anxiety, fe
      but, other than the explicit inclusion of the phrase "fixed effects" in the description of the model, the coefficient estimates for lifesatisfaction, happy, and anxiety should be the same as in your pooled OLS with the region indicators.
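
      One way to check that equivalence is to run both estimators on the same panel and compare a slope coefficient. The sketch below assumes the Grunfeld data from webuse as a stand-in (company plays the role of regionNum, and invest, mvalue, and kstock are not the poster's variables).

      ```stata
      * Stand-in data (assumption): Grunfeld investment panel
      webuse grunfeld, clear
      xtset company year

      * Pooled OLS with one indicator per panel unit
      regress invest mvalue kstock i.company
      scalar b_ols = _b[mvalue]

      * Within (fixed-effects) estimator
      xtreg invest mvalue kstock, fe
      scalar b_fe = _b[mvalue]

      * The two slope estimates should agree up to rounding
      display "OLS with indicators: " scalar(b_ols) "   xtreg, fe: " scalar(b_fe)
      ```

      The reported R-squared values will not match, since xtreg, fe reports within, between, and overall versions rather than the single R-squared from regress.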
