Computing predicted probabilities after logistic regression

Joseph KAMTCHUM TATUENE

Join Date: Oct 2021
Posts: 7

Computing predicted probabilities after logistic regression

19 Oct 2021, 05:33

Good morning,

Could you, please, help me find the formula used by Stata to compute predicted probabilities after fitting a logistic regression model?
I tried to guess it from the logit formula but my results (p1 or p2) are different from those obtained by Stata (pr).

Thanks in advance!
--
version 17

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte age double(vitesserisk irrespect sensation danger) byte accidenté
21                  4  4.833333333333333  4.972222222222222  4.103174603174604 1
21 3.2000000000000006 3.1666666666666665  4.472222222222222 3.6111111111111107 0
18  4.333333333333333 2.6666666666666665  4.083333333333333  3.492063492063492 1
24  4.733333333333333 2.3333333333333335  4.111111111111111   2.76984126984127 0
22  5.066666666666666 3.6666666666666665  3.027777777777778 1.5317460317460316 1
23                4.2 2.6666666666666665  2.027777777777778  2.261904761904762 1
22                4.8  4.833333333333333 3.1111111111111107 2.2698412698412698 1
20  4.866666666666666                  3 2.5833333333333335 1.8333333333333333 1
20  4.866666666666666                4.5 2.4444444444444446 2.0555555555555554 1
24  4.733333333333333 2.3333333333333335 2.5833333333333335 2.4603174603174605 1
end
label values accidenté labels3
label def labels3 0 "Non", modify
label def labels3 1 "Oui", modify

Code:

logistic accidenté age vitesserisk irrespect sensation danger, coef
predict pr
predict lp, xb
gen logitp=1.131809 + (-0.278859*age) + (0.4179795*vitesserisk) + (-0.2565562*irrespect) + (-0.0087661*sensation) + (-0.4366206*danger)
list pr lp logitp age Éducation revenu emploi vitesserisk irrespect sensation danger accidenté in 1/5
gen p1=1/(1+exp(-logitp))
gen p2=100/(1+exp(-logitp))
list pr lp logitp p1 p2 age vitesserisk irrespect sensation danger accidenté in 1/5

Tags: logistic, logit, predicted probabilities, regression

Andrew Musau

Join Date: Oct 2014

Posts: 10083
#2

19 Oct 2021, 05:51

I have some old lecture notes where I illustrate the formula for logit and probit. Using the displayed coefficients may not be very precise due to rounding, so you would rather refer to the coefficient of age as "_b[age]" in the calculation.

https://speakerdeck.com/andrewm/esti...odels-in-stata
1 like
Comment

Joseph Coveney

Join Date: Apr 2014
Posts: 4374

19 Oct 2021, 05:51

Try this, and compare to yours to see where they differ.

Code:

version 17.0

clear *

quietly sysuse auto

quietly logit foreign c.(mpg weight), nolog
predict double pr, pr
predict double xb, xb

*
* Begin here
*
generate double my_xb = _b[_cons] + _b[weight] * weight + _b[mpg] * mpg
generate double my_pr =exp(my_xb) / (1 + exp(my_xb)) // invlogit(my_xb)

list pr my_pr xb my_xb in 1/5, noobs

exit

Comment

Maarten Buis

Join Date: Mar 2014

Posts: 3426
#4

19 Oct 2021, 05:53

\(\hat{p}=\frac{e^{x\beta}}{1+e^{x\beta}}\)

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
1 like
Comment
Joseph KAMTCHUM TATUENE

Join Date: Oct 2021

Posts: 7
#5

19 Oct 2021, 09:56

Dear Andrew, Joseph, and Maarten,

Thank you so much for your swift responses!

I realize that there were 2 issues:
1) I was using the truncated rather than the exact value of the coefficients.
Andrew Musau: Thanks a lot for reminding me of this good old rule from undergrad biostat lectures!
2) The formula for invlogit is 1/(1+exp(-logitp)) rather than 1/(1+exp(logitp)).
Maarten Buis: I can confirm that we need the minus sign in the inverse logit formula.
Joseph Coveney: Thanks for telling me about the invlogit! The other command would be
generate double my_pr =exp(my_xb) / (1 + exp(-1*my_xb))
Have a great day!

Joseph
Comment
Maarten Buis

Join Date: Mar 2014

Posts: 3426
#6

19 Oct 2021, 10:47

You don't need the minus sign if you use the right formula: \(\hat{p}=\frac{e^{x\beta}}{1+e^{x\beta}} = \frac{1}{1+e^{-x\beta}}\)

What we use is the CDF of the logit distribution, which is a symmetric distribution that looks very much like a normal distribution. If you use the form "1/something" instead of "exp(xb)/something", then you look at the upper tail of the distribution instead of the lower tail. But because the logistic distribution is symmetric, you can flip this around again by multiplying the linear predictor by -1. I think of of the "1/something" form as the "exp(xb)/something" form flipped twice, which, since the logistic distribution is symmetric, is the same.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Comment
Joseph KAMTCHUM TATUENE

Join Date: Oct 2021

Posts: 7
#7

19 Oct 2021, 10:55

Thank you!

Joseph
Comment

Announcement

Computing predicted probabilities after logistic regression

Comment

Comment

Comment

Comment

Comment

Comment