Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Computing predicted probabilities after logistic regression

    Good morning,

    Could you, please, help me find the formula used by Stata to compute predicted probabilities after fitting a logistic regression model?
    I tried to guess it from the logit formula but my results (p1 or p2) are different from those obtained by Stata (pr).

    Thanks in advance!
    --
    version 17

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte age double(vitesserisk irrespect sensation danger) byte accidenté
    21                  4  4.833333333333333  4.972222222222222  4.103174603174604 1
    21 3.2000000000000006 3.1666666666666665  4.472222222222222 3.6111111111111107 0
    18  4.333333333333333 2.6666666666666665  4.083333333333333  3.492063492063492 1
    24  4.733333333333333 2.3333333333333335  4.111111111111111   2.76984126984127 0
    22  5.066666666666666 3.6666666666666665  3.027777777777778 1.5317460317460316 1
    23                4.2 2.6666666666666665  2.027777777777778  2.261904761904762 1
    22                4.8  4.833333333333333 3.1111111111111107 2.2698412698412698 1
    20  4.866666666666666                  3 2.5833333333333335 1.8333333333333333 1
    20  4.866666666666666                4.5 2.4444444444444446 2.0555555555555554 1
    24  4.733333333333333 2.3333333333333335 2.5833333333333335 2.4603174603174605 1
    end
    label values accidenté labels3
    label def labels3 0 "Non", modify
    label def labels3 1 "Oui", modify

    Code:
    logistic accidenté age vitesserisk irrespect sensation danger, coef
    predict pr
    predict lp, xb
    gen logitp=1.131809 + (-0.278859*age) + (0.4179795*vitesserisk) + (-0.2565562*irrespect) + (-0.0087661*sensation) + (-0.4366206*danger)
    list pr lp logitp age Éducation revenu emploi vitesserisk irrespect sensation danger accidenté in 1/5
    gen p1=1/(1+exp(-logitp))
    gen p2=100/(1+exp(-logitp))
    list pr lp logitp p1 p2 age vitesserisk irrespect sensation danger accidenté in 1/5




  • #2
    I have some old lecture notes where I illustrate the formula for logit and probit. Using the displayed coefficients may not be very precise due to rounding, so you would rather refer to the coefficient of age as "_b[age]" in the calculation.

    https://speakerdeck.com/andrewm/esti...odels-in-stata

    Comment


    • #3
      Try this, and compare to yours to see where they differ.
      Code:
      version 17.0
      
      clear *
      
      quietly sysuse auto
      
      quietly logit foreign c.(mpg weight), nolog
      predict double pr, pr
      predict double xb, xb
      
      *
      * Begin here
      *
      generate double my_xb = _b[_cons] + _b[weight] * weight + _b[mpg] * mpg
      generate double my_pr =exp(my_xb) / (1 + exp(my_xb)) // invlogit(my_xb)
      
      list pr my_pr xb my_xb in 1/5, noobs
      
      exit

      Comment


      • #4
        \(\hat{p}=\frac{e^{x\beta}}{1+e^{x\beta}}\)
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------

        Comment


        • #5
          Dear Andrew, Joseph, and Maarten,

          Thank you so much for your swift responses!

          I realize that there were 2 issues:
          1) I was using the truncated rather than the exact value of the coefficients.
          Andrew Musau: Thanks a lot for reminding me of this good old rule from undergrad biostat lectures!
          2) The formula for invlogit is 1/(1+exp(-logitp)) rather than 1/(1+exp(logitp)).
          Maarten Buis: I can confirm that we need the minus sign in the inverse logit formula.
          Joseph Coveney: Thanks for telling me about the invlogit! The other command would be
          generate double my_pr =exp(my_xb) / (1 + exp(-1*my_xb))
          Have a great day!

          Joseph

          Comment


          • #6
            You don't need the minus sign if you use the right formula: \(\hat{p}=\frac{e^{x\beta}}{1+e^{x\beta}} = \frac{1}{1+e^{-x\beta}}\)

            What we use is the CDF of the logit distribution, which is a symmetric distribution that looks very much like a normal distribution. If you use the form "1/something" instead of "exp(xb)/something", then you look at the upper tail of the distribution instead of the lower tail. But because the logistic distribution is symmetric, you can flip this around again by multiplying the linear predictor by -1. I think of of the "1/something" form as the "exp(xb)/something" form flipped twice, which, since the logistic distribution is symmetric, is the same.
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------

            Comment


            • #7
              Thank you!

              Joseph

              Comment

              Working...
              X