  • AUROC for Linear Probability Model

    Hello there,

    I am estimating a linear probability model of a rare outcome (97%=0, 3%=1) because I was asked to include fixed effects which is not sensible in a nonlinear model. I know there are other problems with this! The (adjusted/within) r^2 is basically 0. However, that is not too uncommon for such a rare outcome, and the coefficients of interest all have a sensible sign and high statistical significance.

    Given that I use the linear probability model, could you please advise me on how to implement an auroc measure in this setup?
    Thank you!

    Doro

  • #2
    Use search to find and download "somersd" - the help file tells you how to get the AUROC.

    Not sure what you mean by "I was asked to include fixed effects which is not sensible in a nonlinear model." - why isn't it sensible? (Maybe start with what you mean by "fixed effects".)

    • #3
      Originally posted by Doro Kuebler
      Given that I use the linear probability model, could you please advise me on how to implement an auroc measure in this setup?
      Code:
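      * Fit the linear probability model (variable names are placeholders)
      regress binary_outcome i.fixed_effect1 c.fixed_effect2
      * Linear prediction, used as the classification score
      predict double xb, xb
      roctab binary_outcome xb
      * r(area) holds the ROC area left behind by roctab
      display in smcl as text "ROC AUC = " as result %04.2f r(area)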

      • #4
        Thank you so far. It's not yet working, at least not as far as I understand it.

        Rich's suggestion of somersd gives me an output table with lots of coefficients and confidence intervals. However, I am only interested in one number: the AUROC for a linear regression of the 0/1 dependent variable on all my covariates. I do not understand how to get from here to there. Yes, I have read the help and PDF files for somersd.

        Joseph's suggestion produces error 134, too many values when using roctab. I have 600,000 obs entering the regression (on Stata 14MP though)

        edit: to Rich's question: what I mean by "fixed effects" is a within-transformation equivalent action; depending on the command, I use either [reghdfe..., absorb(..)] or [reg ... i.x i.y]. The reason I do not use many dummy variables in a logit model is the incidental parameters problem.

        edit2: the syntax I use is
        somersd `Y' `Z' `X' `FE1' `FE2' , trans(auroc)
        Last edited by Doro Kuebler; 27 May 2016, 07:37.

        • #5
          The way I generally use somersd for AUROC is to estimate the logistic regression, obtain the predicted values, and run somersd with the predicted values as the only variable (other than the dependent variable, of course).
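
          A minimal sketch of that workflow, adapted to the linear probability model from this thread rather than a logistic regression. It assumes the user-written somersd package is installed (ssc install somersd); binary_outcome, x, and y are placeholder names borrowed from the other posts, and transf(c) is, as far as I know, the somersd option that reports Harrell's c, which equals the ROC area when the first variable is binary.

          ```stata
          * Placeholder variable names; substitute your own outcome and fixed effects
          regress binary_outcome i.x i.y

          * Linear prediction for the estimation sample only
          predict double xb if e(sample), xb

          * Only two variables: the 0/1 outcome first, then the single predicted score.
          * transf(c) rescales Somers' D to Harrell's c = (D + 1) / 2, i.e. the AUROC
          somersd binary_outcome xb, transf(c)
          ```

          Passing only the outcome and the fitted score, instead of the full covariate list, is what reduces the output to the single AUROC estimate and its confidence interval.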

          • #6
            Originally posted by Doro Kuebler
            Joseph's suggestion produces error 134, too many values when using roctab. I have 600,000 obs entering the regression
            Maybe try
            Code:
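            regress binary_outcome i.x i.y
            predict double xb, xb
            * contract replaces the data in memory with one row per distinct (outcome, xb) pair
            contract binary_outcome xb, freq(count)
            * Frequency weights restore the original observation counts
            roctab binary_outcome xb [fweight=count]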
            I have no idea what a "within-transformation equivalent action" is, but you can use xtlogit . . ., i() fe for fixed-effects logistic regression.

            • #7
              Thank you so much.

              In fact, even after contracting, the observation count is still the same, but when I round the predicted values a bit and then contract, the AUROC from roctab equals the one I get with Rich's method.
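
              The round-then-contract step described above might look like the sketch below. The rounding unit of 0.0001 is an arbitrary illustrative choice (the value actually used is not stated in the thread), xb_r is a hypothetical variable name, and binary_outcome, x, and y are the placeholders from the earlier code. Contracting unrounded predictions presumably leaves the row count unchanged because nearly every predicted value is distinct.

              ```stata
              regress binary_outcome i.x i.y
              predict double xb if e(sample), xb

              * Coarsen the score so that many observations share the same value
              * (0.0001 is an assumption for illustration, not the poster's setting)
              generate double xb_r = round(xb, 0.0001)

              * contract overwrites the data in memory, so keep a copy to restore
              preserve
              contract binary_outcome xb_r, freq(count) nomiss
              roctab binary_outcome xb_r [fweight=count]
              restore
              ```

              Rounding introduces ties in the score, so the resulting area can differ slightly from the unrounded one; a finer rounding unit keeps it closer at the cost of more distinct values for roctab to handle.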
