Unexplainable large Odds ratio in logistic regression

Klaus Klausen

Join Date: Mar 2021
Posts: 72

01 Sep 2022, 05:14

Hi,

When I run a logistic regression, one of my control variables, the company's return on assets, has an insanely large odds ratio. (I omitted the rest of the regression output and will post it upon request.)

Code:

--------------------------------------------------------------------------------------------------------------------------------------------
                                                                           |               Robust
                                                         AcquirerInitiated | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
---------------------------------------------------------------------------+----------------------------------------------------------------
                                                               Acq_ROA_WWU |   650097.5    4809464     1.81   0.070     .3278942    1.29e+12

Code:

. sum Acq_ROA_WWU if !missing(p2) //The variable p2 tags the observations that were considered in the regression

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
 Acq_ROA_WWU |        265    .0452223    .0816598  -.3572705    .244479

So far, I have not seen any issues that might cause this behavior. I included Acq_ROA_WWU as a control in other regressions, where it doesn't cause any problems. Is there some obvious reason for this?
I tried to keep the post as simple as possible without getting lost in details. Please let me know if you need more specific information on my model.

Thanks.


Rich Goldstein

Join Date: Mar 2014

Posts: 4464
#2

01 Sep 2022, 05:37

The OR is for a 1-unit difference in the value of the predictor/control; however, a difference that large is impossible in your data, which, as you show, only range from -0.36 to +0.24. Try rescaling your predictor in a way that makes substantive sense (maybe multiply by 100, so that the OR refers to a difference of one percentage point?).
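The effect of rescaling can be checked by hand from the numbers already posted in #1, without refitting anything. The sketch below is Python rather than Stata and is only arithmetic: it assumes Acq_ROA_WWU is stored as a fraction (so 0.01 is one percentage point), and the function name `rescale_odds_ratio` is made up for illustration.

```python
import math

def rescale_odds_ratio(odds_ratio: float, step: float) -> float:
    """Odds ratio for a `step`-unit change, given the OR for a 1-unit change."""
    # OR = exp(b), so a change of `step` units multiplies the odds by exp(b * step).
    return math.exp(math.log(odds_ratio) * step)

# Figures copied from the output in #1 (OR and 95% CI per 1-unit change in ROA).
or_per_unit = 650097.5
ci_per_unit = (0.3278942, 1.29e12)

# Assumption: ROA is a fraction, so 0.01 corresponds to one percentage point.
step = 0.01
or_per_point = rescale_odds_ratio(or_per_unit, step)
ci_per_point = tuple(rescale_odds_ratio(bound, step) for bound in ci_per_unit)

print(f"OR per percentage point: {or_per_point:.3f}")
print(f"95% CI: {ci_per_point[0]:.3f} to {ci_per_point[1]:.3f}")
```

On these figures the OR per percentage point comes out at roughly 1.14, with an interval of roughly 0.99 to 1.32. The z statistic and p-value are unaffected by multiplying a predictor by a constant, so the fit itself is the same; only the unit in which the OR is expressed changes.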
Klaus Klausen

Join Date: Mar 2021

Posts: 72
#3

01 Sep 2022, 06:14

Thanks for the helpful advice, Rich
Ken Chui

Join Date: Aug 2014

Posts: 1058
#4

01 Sep 2022, 06:28

Another possibility:

Recall the 2x2 table example from which the simplest odds ratio (OR) is usually introduced:
               Outcome yes   Outcome no
Exposure yes        A             B
Exposure no         C             D

The OR is computed as AD / BC. When B or C approaches 0, the OR becomes huge. In fact, when B or C is exactly zero (as when both are zero, i.e. the exposure is 100% associated with the outcome), the OR is not defined. It's kind of an interesting dilemma with the OR.
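A quick numeric illustration of how fast AD / BC grows as the off-diagonal cells shrink. This is a Python sketch, the cell counts are invented for the example and have nothing to do with the data in #1, and `odds_ratio` is a hypothetical helper name.

```python
from typing import Optional

def odds_ratio(a: int, b: int, c: int, d: int) -> Optional[float]:
    """Cross-product odds ratio AD / BC for a 2x2 table; None if B or C is zero."""
    if b * c == 0:
        # Division by zero: the ratio cannot be computed for this table.
        return None
    return (a * d) / (b * c)

# Hypothetical tables with 30 exposed and 30 unexposed subjects each.
tables = {
    "moderate association": (20, 10, 10, 20),
    "sparse off-diagonal": (29, 1, 1, 29),
    "perfect association": (30, 0, 0, 30),
}

for label, cells in tables.items():
    print(label, cells, "OR =", odds_ratio(*cells))
```

Moving just nine subjects per row from the off-diagonal cells to the diagonal takes the OR from 4 to 841, and the last table has no computable OR at all.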

This can also happen with a continuous variable, where it is known as "complete separation". In some subgroup the outcome is probably predicted nearly perfectly, either because i) the exposure really is that effective and essential, or ii) cell counts are small.

It could be checked by plotting the 1/0 outcome against the variable to see whether there is good "overlap" between the two parallel lines of points. If there is a big gap, that could be the issue. Generally, if it is not the OR of main interest, it may not need to be remediated. It could just mean that this particular predictor (alone, or together with some other predictors) is not very "helpful" because it is doing the job too well.
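The same overlap check can be done numerically instead of by eye. The Python sketch below uses invented values (not the data from #1) and a hypothetical helper, `overlap_interval`; it looks at one predictor at a time, so it would not detect separation that only arises from a combination of predictors.

```python
from typing import Optional, Sequence, Tuple

def overlap_interval(x: Sequence[float], y: Sequence[int]) -> Optional[Tuple[float, float]]:
    """Range of x covered by both outcome groups, or None if the groups do not overlap.

    Assumes y is coded 0/1 and that both groups contain at least one observation.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    low = max(min(x0), min(x1))    # larger of the two group minima
    high = min(max(x0), max(x1))   # smaller of the two group maxima
    return (low, high) if low <= high else None

# Hypothetical predictor values on roughly the scale of an ROA fraction.
x = [-0.3, -0.1, 0.0, 0.05, 0.1, 0.2]

separated = [0, 0, 0, 1, 1, 1]     # every 1 lies above every 0
overlapping = [0, 1, 0, 1, 0, 1]   # outcomes interleaved along x

print("separated:", overlap_interval(x, separated))
print("overlapping:", overlap_interval(x, overlapping))
```

A result of None corresponds to the "big gap" in the plot: one outcome group lies entirely above the other on this variable. A narrow interval containing only a handful of observations would be worth a closer look as well.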

Last edited by Ken Chui; 01 Sep 2022, 06:38.