
  • Median regression

    Hi there, I have been told to estimate a median regression alongside my initial regression, to check whether the initial results, in particular the coefficient on IR, are sensitive to outliers.


    The initial regression is:

    Code:
     . regress lntobinsq lnassets IR leverage roa cratio rnd div year2016, robust
    In this regression, leverage is insignificant.

    When I estimate the median regression:

    Code:
     . qreg lntobinsq lnassets IR leverage roa cratio rnd div year2016
    leverage becomes significant.


    Generally, I was wondering whether it is normal for coefficients to change from insignificant to significant (and vice versa) between the standard regression and the median regression.


    Any advice would be appreciated. Thanks.
    Last edited by Prathvajeeth Rajmohan; 09 Sep 2017, 11:30.

  • #2
    Prathvajeeth: There is no necessary reason to expect comparability between OLS regression and median regression—the former is estimating a conditional mean, the latter a conditional median, generally two different parameters—but in my experience one often finds somewhat similar results.

    One immediate suggestion, however, would be to use a robust covariance estimator for obtaining inferences from your median regression model (you used the -robust- option for your OLS regression). That is,
    Code:
    qreg lntobinsq lnassets IR leverage roa cratio rnd div year2016, vce(robust)


    • #3
      Double-posted and replied to at: https://www.statalist.org/forums/for...variable/page2.
      By the way, in both threads Prathvajeeth would have had a better chance of getting helpful replies had the output tables for OLS and -qreg- been posted (as per the FAQ).

      PS: crossed in the cyberspace with John's helpful reply.
      Kind regards,
      Carlo
      (Stata 19.0)


      • #4
        Generally, I was wondering whether it is normal for coefficients to change from insignificant to significant (and vice versa) between the standard regression and the median regression.
        Comparing the statistical significance of a variable in different models is, at best, a waste of time. The whole idea of doing it is based on the fallacy that a "significant" result means "there is an effect" and an "insignificant" result means "there is no effect." But that is wrong in any context, and here it just leads you into trouble.

        First, p-values are in fact continuously valued statistics (in most situations), and applying the arbitrary 0.05 threshold just adds problems. So if your p-value changes from 0.049 in one model to 0.051 in another, the "statistical significance" has changed, but that is clearly meaningless. What is often not appreciated is that even a large change in p-value, say from 0.001 to 0.010, can be equally meaningless: the regression coefficient itself may have changed by a totally inconsequential amount, but the change was in a region where the function mapping the coefficient to the p-value is very steep.
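        To see how steep that mapping is, here is a quick illustration (a hypothetical sketch using Stata's normal() function, which returns the standard normal CDF): a roughly 20% change in a z-statistic moves the two-sided p-value tenfold.
        Code:
         . display 2*normal(-3.29)    // two-sided p for z = 3.29, about 0.001
         . display 2*normal(-2.58)    // two-sided p for z = 2.58, about 0.010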

        So if you are trying to explore the robustness of findings across models, as in your post, you should look for similarity of the corresponding coefficients. They will generally not be identical, but if the models are sufficiently similar, they may be reasonably close. Here you are using all of the same predictors. That's key: if you add or remove predictors, there is no reason to expect the results to be close, or even to have the same sign. The model difference is predicting the median instead of the mean of lntobinsq. If the distribution of lntobinsq, conditional on your predictors, is reasonably symmetrical, then the results should be fairly similar.
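        To illustrate the symmetry point, one can simulate a right-skewed variable and compare its mean and median (a hypothetical sketch; rchi2() draws chi-squared random variates, whose mean and median differ by construction):
        Code:
         . clear
         . set obs 10000
         . set seed 12345
         . generate x = rchi2(2)    // right-skewed: mean is 2, median is about 1.39
         . summarize x, detail      // the mean and p50 diverge because of the asymmetry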

        Note, however, that this has nothing to do with outliers per se. You could have many outliers, and they could be highly influential, but if they are distributed symmetrically, the results will still be similar. This robustness approach is really about symmetry. If your outliers are asymmetric (all in one tail, say), then you will find different results. But you could also have an asymmetric distribution with no outliers at all, and you will also get different results from -regress- and -qreg-. If you are really looking for the influence of outliers, you would be better advised to do that with some of the classical regression diagnostic statistics, like leverage and Cook's distance.
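        As a sketch of that diagnostic approach (the new variable names d_cook and h_lev are illustrative, chosen so they do not clash with the existing predictor named leverage):
        Code:
         . regress lntobinsq lnassets IR leverage roa cratio rnd div year2016
         . predict d_cook, cooksd      // Cook's distance for each observation
         . predict h_lev, leverage     // leverage (hat) values
         . list d_cook h_lev if d_cook > 4/e(N)    // a common rule-of-thumb cutoff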



        • #5
          Thanks, guys!
