AUROC for Linear Probability Model

Doro Kuebler

Join Date: Feb 2016

Posts: 11
#1

25 May 2016, 08:42

Hello there,

I am estimating a linear probability model of a rare outcome (97% = 0, 3% = 1) because I was asked to include fixed effects, which is not sensible in a nonlinear model. I know there are other problems with this! The (adjusted/within) R^2 is basically 0. However, that is not too uncommon for such a rare outcome, and the coefficients of interest all have a sensible sign and high statistical significance.

Given that I use the linear probability model, could you please advise me on how to implement an auroc measure in this setup?
Thank you!

Doro
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#2

25 May 2016, 10:56

use search to find and download "somersd" - the help file tells you how to get the AUROC

not sure what you mean by "I was asked to include fixed effects which is not sensible in a nonlinear model." - why isn't it sensible? (maybe start with what you mean by "fixed effects")
Joseph Coveney

Join Date: Apr 2014

Posts: 4374
#3

25 May 2016, 17:41

Originally posted by Doro Kuebler

Given that I use the linear probability model, could you please advise me on how to implement an auroc measure in this setup?

Code:

regress binary_outcome i.fixed_effect1 c.fixed_effect2
predict double xb, xb
roctab binary_outcome xb
display in smcl as text "ROC AUC = " as result %04.2f r(area)
Doro Kuebler

Join Date: Feb 2016

Posts: 11
#4

27 May 2016, 07:12

Thank you so far. It's not yet working, at least not as far as I understand it.

Rich's suggestion with somersd gives me an output table with lots of coefficients and confidence intervals. However, I am only interested in one number: the AUROC for a linear regression of the one 0/1 dependent variable on all my covariates. I do not understand how to get from here to there. Yes, I have read the help and pdf files for somersd.

Joseph's suggestion produces error 134, too many values when using roctab. I have 600,000 obs entering the regression (on Stata 14MP though)

edit: to Rich's question: what I mean by "fixed effects" is a within-transformation equivalent action; depending on the command, I use either [reghdfe..., absorb(..)] or [reg ... i.x i.y]. The reason I do not use many dummy variables in a logit model is the incidental parameters problem.

edit2: the syntax I use is
somersd `Y' `Z' `X' `FE1' `FE2' , trans(auroc)

Last edited by Doro Kuebler; 27 May 2016, 07:37.
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#5

27 May 2016, 11:16

the way that I generally use somersd for auroc is to estimate the logistic regression, obtain the predicted values and use somersd with the predicted values as the only variable (other than the dependent variable of course)
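
A minimal sketch of that workflow follows. The names binary_outcome, x1 and x2 are placeholders rather than variables from this thread, and somersd is a user-written command assumed to be installed already (e.g. via ssc install somersd). Under those assumptions, something along these lines should report Harrell's c, which equals the AUROC for a 0/1 outcome:

```stata
* Sketch only: binary_outcome, x1 and x2 are placeholder variable names
logit binary_outcome x1 x2
predict double phat, pr

* Harrell's c of the predicted values with respect to the 0/1 outcome
somersd binary_outcome phat, transf(c)
```

For a linear probability model the same idea should carry over by replacing logit with regress and using the linear prediction (predict double xb, xb), since the AUROC depends only on the ranking of the predicted values.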
Joseph Coveney

Join Date: Apr 2014

Posts: 4374
#6

27 May 2016, 19:31

Originally posted by Doro Kuebler

Joseph's suggestion produces error 134, too many values when using roctab. I have 600,000 obs entering the regression

Maybe try

Code:

regress binary_outcome i.x i.y
predict double xb, xb
contract binary_outcome xb, freq(count)
roctab binary_outcome xb [fweight=count]

I have no idea what a "within-transformation equivalent action" is, but you can use xtlogit . . ., i() fe for fixed-effects logistic regression.
Doro Kuebler

Join Date: Feb 2016

Posts: 11
#7

31 May 2016, 09:56

Thank you so much.

In fact, even after contracting, the observation count is still the same, but when I round the predicted values a bit and then contract, the AUROC yielded by roctab is equal to the one I get with Rich's method.
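
The round-then-contract step might look roughly like the sketch below. The scale factor of 1000 (three decimal places) is only an assumption for illustration, not a value given in this thread, and xb_r is a made-up variable name. preserve and restore are added because contract replaces the data in memory.

```stata
regress binary_outcome i.x i.y
predict double xb, xb

preserve
* Coarsen the predictions to three decimals (assumed precision), stored as integers
generate long xb_r = round(xb * 1000)
* Collapse to one row per outcome/score combination, with its frequency
contract binary_outcome xb_r, freq(count)
roctab binary_outcome xb_r [fweight = count]
restore
```

Rounding merges nearly identical predictions into ties, so the result can differ slightly from the unrounded AUROC; a finer grid shrinks that difference at the cost of more distinct values for roctab to handle.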
