  • Robust Regression v. Transformation (or both?)

    Hello,

    I have a question about a regression I am running. I ran the regression using the code "regress y x1 x2 x3" but found using the "hettest" and "rvfplot" commands that there are some concerns about heteroskedasticity. As a result, I transformed my dependent variable to ln(y) and ran the regression using the code "regress ln(y) x1 x2 x3". That seemed to fix the problem.
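
    For concreteness, the steps described above might look like the sketch below. Stata's bundled auto dataset stands in for the actual data, so price, weight and mpg are placeholders rather than the real y and x variables. Stata does not accept ln(y) inside a variable list, so the log is generated as a new variable first, and "estat hettest" is the current form of the older "hettest" command.

    ```stata
    * Placeholder data: price plays the role of y, weight and mpg of the x's
    sysuse auto, clear

    * Levels model, followed by the two heteroskedasticity checks
    regress price weight mpg
    estat hettest            // Breusch-Pagan / Cook-Weisberg test
    rvfplot, yline(0)        // residuals versus fitted values

    * Log model: create ln(y) as its own variable, then repeat the checks
    generate double lnprice = ln(price)
    regress lnprice weight mpg
    estat hettest
    rvfplot, yline(0)
    ```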

    My question is: Can I also run that equation using robust standard errors? Can I run the regression "regress ln(y) x1 x2 x3, robust"? It is my understanding that when you have heteroskedasticity problems you can address them either by using the "robust" option or by transforming your variables. But can you do both? Or would this screw up the results?

    Thanks,

    Jeffery

  • #2
    I don't think finding heteroskedasticity is, in itself, a good way to do model selection. A linear model for y and one for ln(y) are different models. Typically, the case for ln(y) when y > 0 can be made without considering heteroskedasticity, although that can play a role. Typically the linear model for ln(y) gives the coefficients a more natural interpretation and fits better (although the latter is a bit tricky). As a bonus, it often does reduce heteroskedasticity and also makes the error more normally distributed. But this is not guaranteed; it is casual empiricism.

    By the way, you are not using the term "robust regression" properly. "Robust regression" is very different from using OLS and making the standard errors and inference robust to heteroskedasticity.
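
    To illustrate the distinction, here is a minimal sketch using Stata's bundled auto dataset as a stand-in (price, weight and mpg are placeholders, not the original variables). It assumes that "robust regression" refers to an outlier-resistant estimator such as the one in rreg; the stored-estimate names are made up for the example.

    ```stata
    * Placeholder data standing in for the poster's y, x1, x2, x3
    sysuse auto, clear

    * Plain OLS
    regress price weight mpg
    estimates store ols

    * OLS with heteroskedasticity-robust standard errors:
    * same coefficients as plain OLS, only the standard errors change
    regress price weight mpg, vce(robust)
    estimates store ols_rse

    * Outlier-resistant "robust regression": a different estimator that
    * downweights unusual observations, so the coefficients themselves change
    rreg price weight mpg
    estimates store outlier_rob

    * Coefficients and standard errors side by side
    estimates table ols ols_rse outlier_rob, b se
    ```

    In the resulting table the first two columns should share coefficients and differ only in their standard errors, while the third column is a different set of estimates altogether.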

    You can always make your standard errors robust to heteroskedasticity when using linear regression; it doesn't matter what transformation of y you use. If you are trying to choose between y and ln(y) based purely on statistics, you should use valid goodness of fit measures. I propose some possibilities in my introductory econometrics book. Alternatively, in place of a linear model for ln(y), directly estimate an exponential model for y using

    glm y x1 ... xK, fam(gamma) link(log) robust

    and you can get a sum of squared residuals, computed for y itself, that is comparable to the one from the linear regression for y.
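
    A rough sketch of how that comparison might be set up, again with the bundled auto dataset as a placeholder (price for y, weight and mpg for the x's; lnprice, muhat and sqres are names invented for the example). It also shows the log transformation and robust standard errors used together. This is one possible way to obtain the two sums of squared residuals, not a definitive recipe.

    ```stata
    * Placeholder data standing in for the poster's y, x1, x2, x3
    sysuse auto, clear

    * Linear model for y with robust standard errors; regress stores the SSR
    regress price weight mpg, vce(robust)
    display "SSR, linear model for y:      " e(rss)

    * Linear model for ln(y): the transformation and robust SEs can be combined
    generate double lnprice = ln(price)
    regress lnprice weight mpg, vce(robust)

    * Exponential mean model for y, estimated directly
    glm price weight mpg, family(gamma) link(log) vce(robust)

    * Fitted mean of y, then the sum of squared residuals in levels of y
    predict double muhat, mu
    generate double sqres = (price - muhat)^2
    quietly summarize sqres
    display "SSR, exponential model for y: " r(sum)
    ```

    Both displayed sums are in the units of y, which is what makes them comparable; the SSR from the ln(y) regression is in log units and cannot be set against either one directly.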

    JW

    • #3
      Hi Jeff,

      Thank you so much for this helpful comment. It answered my question clearly and gave me some things to think about moving forward. I believe I own a copy of your book (4th edition) from a class I once took, so I'll make sure to consult it as I tinker with different model specifications.

      I was entirely chagrined when I saw I had typed "Robust Regression" in the subject line. I made sure to double check my message to ensure I was clear about using robust standard errors, not robust regressions, but I guess I neglected to double check the subject line.

      Thanks again,

      Jeffrey
