  • Omitted variables from ologit

    Good morning,
    I would be most grateful for some help!
    I tried running an ologit model and got the message below. When I run an ordinary regression (regress command), I do not have an issue with the NZBA independent variable.
    Why is this happening? Also, I am not sure whether ologit or regress is more appropriate. My dependent variable NZS can take values from 0 to 4, including halves, e.g. 2.5, 3, 3.5, etc.

    . logit NZS REM ESGCom WOMB PCAF NZBA SIZE LEV ROAA

    note: NZBA != 0 predicts success perfectly;
    NZBA omitted and 43 obs not used.

    Iteration 0: Log likelihood = -78.288548
    Iteration 1: Log likelihood = -66.630426
    Iteration 2: Log likelihood = -64.588214
    Iteration 3: Log likelihood = -64.447594
    Iteration 4: Log likelihood = -64.44693
    Iteration 5: Log likelihood = -64.44693

    Thank you

  • #2
    This is a typical complete separation, aka perfect prediction, problem. The message itself says "note: NZBA != 0 predicts success perfectly." In other words, whenever NZBA is anything other than 0, NZS = 1 (more precisely, NZS is nonzero, because -logit- treats every nonzero, nonmissing value of the outcome as a success). -logit- estimates the coefficients by maximum likelihood. In this situation, where success is 100% guaranteed by some value of a predictor variable, the likelihood keeps increasing as the coefficient of that predictor grows, so the maximum likelihood estimate of the coefficient is positive infinity. In practical terms, this means that the estimation cannot possibly converge to a finite value. So Stata (like all the statistical packages I am familiar with) identifies this problem ahead of time and removes the offending variable and the perfectly predicted observations, which results in a model that can be estimated.
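    The separation can be seen directly in a cross-tabulation. The following is a minimal sketch on made-up data, not the poster's data set: the variables x, z, and y are hypothetical stand-ins, and y is constructed so that x != 0 always goes with y = 1.

```stata
* Hypothetical toy data: x != 0 always goes with y = 1 (complete separation)
clear
set seed 12345
set obs 200
generate byte x = runiform() < 0.25
generate double z = rnormal()
generate byte y = cond(x != 0, 1, runiform() < invlogit(0.5*z))

* The cell (x = 1, y = 0) is empty in this table
tabulate x y

* Count the offending cell directly; 0 means x != 0 predicts y = 1 perfectly
count if x != 0 & y == 0

* -logit- notes the perfect prediction, omits x, and drops those observations
logit y x z
```

    With the variables from the original post, the analogous check would be something like -tabulate NZBA NZS-, keeping in mind that -logit- counts every nonzero value of NZS as a success.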

    There are a few approaches you can take in this situation.

    1. Sometimes this happens simply because you have a relatively small data set and it is just a coincidence that NZBA != 0 always goes with NZS = 1. In that case, getting more data, including some cases where this pattern does not hold, will allow you to proceed with the variable NZBA included.

    2. If your data set is large enough that this "NZBA != 0 implies NZS = 1" relationship is not a coincidence, you might just accept the verdict of -logit-. The interpretation, then, is that no model is needed for NZS in the population of units where NZBA != 0: the outcome is knowable with no real model. And then Stata gives you a model for the rest of the population.

    3. If 2. isn't satisfactory, you can go to a command that doesn't estimate by ordinary maximum likelihood. Two such commands come to mind. -exlogistic- is one. It is analogous to using a Fisher exact test instead of chi-square for a contingency table, but it is memory- and compute-intensive unless your data set is pretty small. -firthlogit-, by Joseph Coveney and available from SSC, is another. It estimates by penalized maximum likelihood and runs very efficiently.

    4. Do a Bayesian estimate, putting some regularizing prior on the coefficient of NZBA.
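    As a rough illustration of approaches 3 and 4, here is a sketch on made-up data (x, z, and y are hypothetical, with x != 0 built to predict y = 1 perfectly). It assumes that -firthlogit- can be installed from SSC, so it needs internet access the first time. The normal(0, 4) prior, the seed, and the use of -asis- and -nomleinitial- are illustrative choices rather than recommendations, and are worth checking against the -bayes- documentation for your Stata version.

```stata
* Hypothetical toy data with complete separation on x
clear
set seed 12345
set obs 200
generate byte x = runiform() < 0.25
generate double z = rnormal()
generate byte y = cond(x != 0, 1, runiform() < invlogit(0.5*z))

* Approach 3: penalized maximum likelihood (community-contributed command)
capture which firthlogit
if _rc ssc install firthlogit
firthlogit y x z

* Approach 4: Bayesian logit with a regularizing prior on the coefficient of x.
* normal(0, 4) is mean 0 and variance 4; -asis- keeps the perfect predictor in
* the model, and -nomleinitial- avoids starting from the nonexistent finite MLE.
bayes, prior({y:x}, normal(0, 4)) rseed(12345) nomleinitial: logit y x z, asis
```

    Both commands should return a finite coefficient for x where plain -logit- would omit it. With the original variables, NZS would first need recoding to a 0/1 outcome if the binary model is really what is wanted.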
