  • qreg vs median from tabstat

    Hello,

    I have a question about qreg -- I'm using a difference-in-difference regression and looking at medians, so I'm using quantile regression.

    As I understand it, the regression is pretty much the usual d-in-d approach:

    qreg scores treatment survey treatment*survey

    Scores is my dependent variable. There is a treatment (0 or 1) and survey identifies baseline (0) or study period (1). The focal variable is the interaction, the treatment group during the study (treatment*survey). There are no covariates here, and my output seems fine, but for some reason the median that tabstat produces is different from the one I get from the quantile regression.

    My tabstat commands are these:

    tabstat scores if treatment==1, by(survey) stat(p50)
    tabstat scores if treatment==0, by(survey) stat(p50)


    For some reason the qreg output differs slightly from the tabstat output. That is, the difference-in-differences computed by hand from the four medians reported by tabstat above (i.e., the before-during change in the treatment group minus the before-during change in the control group) differs from the interaction coefficient in my qreg output.

    It's only a bit off, but it is different, and it's my understanding that the numbers should be the same. When I use OLS regression and compare with means reported by tabstat the problem doesn't emerge. The problem also doesn't emerge when I use a different variable and qreg again.
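
    A minimal sketch of the comparison described above, run on simulated data rather than the actual dataset. The variable names follow the post; the data-generating process, the seed, and the scalar names (m00 to m11, did) are made up for illustration, and factor-variable notation (i.treatment##i.survey) is assumed in place of treatment*survey.

    ```stata
    * Simulated data (an assumption, not the poster's data): 4 cells of 50 obs each
    clear
    set seed 12345
    set obs 200
    generate treatment = mod(_n, 2)
    generate survey    = _n > 100
    generate scores    = round(50 + 5*treatment + 3*survey + 4*treatment*survey + rnormal(0, 10))

    * Cell medians from summarize, detail (r(p50)), stored as m<treatment><survey>
    forvalues t = 0/1 {
        forvalues s = 0/1 {
            quietly summarize scores if treatment == `t' & survey == `s', detail
            scalar m`t'`s' = r(p50)
        }
    }

    * Difference-in-differences of the four cell medians, computed by hand
    scalar did = (m11 - m10) - (m01 - m00)
    display "DiD of cell medians:        " did

    * Saturated median regression; compare its interaction coefficient
    qreg scores i.treatment##i.survey
    display "qreg interaction coefficient: " _b[1.treatment#1.survey]
    ```

    Each simulated cell has an even number of observations, which is the situation in which the two numbers may disagree.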

    Similar issues appear to pop up elsewhere, but I didn't see how the questions were resolved:
    https://www.statalist.org/forums/for...other-commands
    https://www.stata.com/statalist/arch.../msg00866.html
    https://www.stata.com/statalist/arch.../msg01331.html

    In any case, I was wondering if someone might have a sense of what might be going on.

    Thanks!
    David

  • #2
    Dear David,

    I believe that there are two issues here.

    1 - First, when the number of observations is even, the median is not uniquely defined, and different Stata commands (e.g., tabstat and qreg) use different definitions of the median.
    2 - Second, medians and means have very different properties, and you need to think carefully about what you are doing when you do a "dif-in-dif" with medians. In particular, note that the difference of medians is not the median of the difference.
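
    Both points can be seen in a small sketch with made-up numbers (the variables y, a, b, and d are purely illustrative). In the first part, any value between 2 and 3 minimizes the sum of absolute deviations, so which one qreg reports depends on its algorithm, whereas tabstat averages the two middle values. In the second part, the number of observations is odd, so every median is unique, and the difference of medians still does not equal the median of the differences.

    ```stata
    * Point 1: even number of observations, so the median is not unique
    clear
    input y
    1
    2
    3
    4
    end
    tabstat y, stat(p50)    // reports 2.5, the average of the two middle values
    qreg y                  // the constant is a median, but need not be 2.5

    * Point 2: difference of medians versus median of the differences
    clear
    input a b
    1  0
    2  5
    10 6
    end
    generate d = a - b
    * Medians: a = 2, b = 5, so the difference of medians is -3, but median(d) = 1.
    * Means:   mean(a) - mean(b) equals mean(d), which is why OLS shows no discrepancy.
    tabstat a b d, stat(p50 mean)
    ```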

    Best wishes,

    Joao

  • #3
    Thanks for this, Joao!