  • Robust standard errors and goodness-of-fit measures

    Dear Statalist community,

    I am using a linear regression model with multiple independent variables. To account for heteroscedasticity in the error term, I use the vce(robust) option after the regress command.
    In the regression output, Stata does not provide adjusted R-squared; it only reports R-squared.

    1) Is adjusted R-squared not reported because it is no longer reliable? If so, why exactly? Is there a publication showing why it is no longer reliable?

    2) R-squared, F-test and Root MSE are reported. Are they still reliable measures of goodness of fit? Why?

    3) Since adjusted R-squared is not meaningful, how can I now assess the improvement of goodness of fit when adding independent variables to the model?

    Thank you,
    Thomas

  • #2
    There is nothing wrong with R-squared as a goodness-of-fit measure when there is heteroskedasticity. I explain this in Chapter 8 of my introductory econometrics book. The R-squared has to do with unconditional variances, while heteroskedasticity is most properly viewed as nonconstant conditional variances.

    Adjusted R-squared simply applies a degrees-of-freedom correction, which is justified only as a way of penalizing additional parameters. Adjusted R-squared is not unbiased for the population R-squared with or without heteroskedasticity, but it is fine to use in either case. (The most one can say is that both versions of R-squared are consistent for the population R-squared as the sample size grows.) It seems the decision to leave it out of the output with heteroskedasticity-robust standard errors was made for heuristic reasons, without any real foundation.

    Of course, you can obtain adjusted R-squared by dropping the robust option. It's fine to report it alongside robust standard errors.

    JW
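    Jeff's point is easy to check numerically. Below is a minimal sketch (in Python with numpy rather than Stata, purely for illustration; the simulated data and variable names are my own) that fits OLS by hand on heteroskedastic data and computes both classical and HC1 sandwich standard errors. The coefficients, fitted values, and R-squared come from the same least-squares fit either way; only the estimated standard errors differ.

    ```python
    # Illustrative sketch (not Stata): OLS by hand, comparing classical and
    # HC1 (robust/sandwich) standard errors on simulated heteroskedastic data.
    import numpy as np

    rng = np.random.default_rng(42)
    n, k = 200, 2                                   # n observations, k slopes
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    # Heteroskedastic errors: the error variance grows with |x1|
    e = rng.normal(size=n) * (1 + np.abs(X[:, 1]))
    y = X @ np.array([1.0, 2.0, -0.5]) + e

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares coefficients
    resid = y - X @ beta
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

    XtX_inv = np.linalg.inv(X.T @ X)
    # Classical covariance: sigma^2 * (X'X)^-1
    v_classical = resid @ resid / (n - k - 1) * XtX_inv
    # HC1 sandwich: n/(n-k-1) * (X'X)^-1 [sum e_i^2 x_i x_i'] (X'X)^-1
    meat = (X * resid[:, None] ** 2).T @ X
    v_robust = n / (n - k - 1) * XtX_inv @ meat @ XtX_inv

    se_classical = np.sqrt(np.diag(v_classical))
    se_robust = np.sqrt(np.diag(v_robust))
    print("R-squared:", r2)          # unchanged by the covariance choice
    print("classical SEs:", se_classical)
    print("robust SEs:   ", se_robust)
    ```

    As far as I know, the n/(n-k-1) factor here matches the small-sample adjustment Stata applies with vce(robust) after regress, but the point of the sketch is only that the fit itself, and hence R-squared, does not depend on which variance estimator you report.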



    • #3
      An extra perspective on Jeff's very helpful clarification is to underline that the robust option doesn't result in different coefficient estimates, just a different take on their associated uncertainty. So predicted values are the same, R-squared is the same and adjusted R-squared is the same -- with and without the robust option.

      Some of the misconceptions in this territory may stem from the overloading of "robust" in the literature!
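      Relatedly, since adjusted R-squared is just a deterministic degrees-of-freedom correction of R-squared, you can always recover it from the reported R-squared, the sample size, and the number of regressors, even when the robust output omits it. A quick sketch (Python for illustration; the numbers plugged in are made up):

      ```python
      # Adjusted R-squared is a deterministic function of R-squared, the sample
      # size n, and the number of slope parameters k:
      #     adj R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)
      def adjusted_r2(r2, n, k):
          return 1 - (1 - r2) * (n - 1) / (n - k - 1)

      # Hypothetical example: R2 = 0.30, n = 200 observations, k = 4 regressors
      print(adjusted_r2(0.30, 200, 4))  # about 0.2856
      ```

      This also answers the practical side of question 3 in #1: the penalty for adding a regressor is entirely in the (n - 1)/(n - k - 1) factor.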



      • #4
        Agree with Nick on the overloading of "robust"!



        • #5
          The preferred term is naturally Eicker-Huber-White-sandwich. Or perhaps Eicker-Huber-sandwich-White. Or for economists White-sandwich-Huber-Eicker.



          • #6
            Nick.
            economists like White bread!
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Dear Nick, does what you say in #3 about the invariance of the goodness-of-fit measures with respect to the robust option also apply to the cluster option? Thank you in advance.



              • #8
                If your calculations lead to the same predictions, then goodness of fit is not affected. The term “badness of fit” is less often seen, but in some fields it is more appropriate.



                • #9
                  Thank you.

