  • Probit vs Logit Model

    Hi guys,
    I am currently working on a project examining the effect that money (wages relative to the median and total transfer spending relative to the median) has on winning a trophy in English football, using a panel dataset. I plan to use a binary dependent variable to indicate whether a team has won a trophy (y=1 if it has won a trophy). Should I use a logit model or a probit model?

    Thanks

  • #2
    It is largely a matter of taste, or of what is traditional in your discipline. The logistic and normal distributions (the latter underlies probit) are very close to each other, and differ appreciably only in the very far tails. Unless you are working on a data set with a huge number of observations and lots of outliers, you will reach the same conclusions* either way. The logistic model has the advantage that if you exponentiate the coefficients, you get odds ratios, which are statistics that people can understand with minimal to modest effort. Probit regression coefficients have no analogous interpretation and can be very difficult for people to grasp if they are not accustomed to working with them.

    *When I say the same conclusions, what I mean is that effects that are large in one model will generally be large in the other. If you are doing hypothesis tests, you will likely find the p-values are nearly identical with either model. The coefficients, however, will be different. This is because the standard normal distribution underlying probit has a variance of 1, whereas the standard logistic distribution has a variance of pi^2/3. Consequently the logit coefficients will tend to be approximately proportional to the corresponding probit coefficients, with a ratio of pi/sqrt(3), i.e. approximately 1.8.
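
    For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the standard library. It is not Stata output and it fits no model: it computes the pi/sqrt(3) scale factor, measures how far apart the normal CDF and the rescaled logistic CDF are on a grid, and exponentiates a logit coefficient into an odds ratio. The function names, the grid limits, and the coefficient 0.5 are assumptions made up for illustration.

```python
import math

# Ratio of standard deviations: standard logistic (pi/sqrt(3)) over standard normal (1).
LOGIT_PROBIT_SCALE = math.pi / math.sqrt(3.0)


def normal_cdf(x: float) -> float:
    """Standard normal CDF, the link behind probit."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x: float) -> float:
    """Standard logistic CDF, the link behind logit."""
    return 1.0 / (1.0 + math.exp(-x))


def max_cdf_gap(scale: float, lo: float = -6.0, hi: float = 6.0, steps: int = 2401) -> float:
    """Largest |Phi(x) - Lambda(scale * x)| over an evenly spaced grid (grid is an assumption)."""
    width = (hi - lo) / (steps - 1)
    return max(
        abs(normal_cdf(lo + i * width) - logistic_cdf(scale * (lo + i * width)))
        for i in range(steps)
    )


def odds_ratio(logit_coefficient: float) -> float:
    """Exponentiate a logit coefficient to obtain the odds ratio."""
    return math.exp(logit_coefficient)


print(f"scale factor pi/sqrt(3): {LOGIT_PROBIT_SCALE:.4f}")
print(f"max CDF gap after rescaling: {max_cdf_gap(LOGIT_PROBIT_SCALE):.4f}")

# Hypothetical logit coefficient, chosen only to illustrate the conversion.
example_coefficient = 0.5
print(f"odds ratio for coefficient {example_coefficient}: {odds_ratio(example_coefficient):.4f}")
```

    The gap it reports is the largest difference in predicted probability between the two link functions once the variances are matched. It stays small across the grid, which is why the two models usually lead to the same substantive conclusions.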


    • #3
      Leyea:
      I fail to follow your statements.
      You have compared (or should have compared) -xtlogit,fe- with -xtlogit,re- via -hausman-. If the -hausman- outcome pointed you towards the -re- specification, that result applies to the whole panel data regression model, not to the regressand only.
      As an aside, most of these possible misunderstandings could easily be avoided by posting what you typed and what Stata gave you back (as recommended by the FAQ).
      Since statistics is a quantitative discipline, numbers and Stata code/output are far more informative than words.
      Kind regards,
      Carlo
      (Stata 19.0)
