  • The odds ratio, SD, CI is too small?

    Hello,

    I am running a logistic regression to estimate the odds ratio between a binary outcome and a continuous variable, and the results I get are very small. For example, the odds ratio is 8.24e-08 and the SD is 4.25e-08, with a 95% CI of 6.55e-11 to 0.0001208.

    Is there a better way to solve this issue than log-transforming the variable? And if log-transforming it is the only way, how should I interpret the results?

    Thanks

  • #2
    What is the range of the continuous variable? Does a one-unit difference in the continuous variable matter in the real world? Consider rescaling by dividing by something (10, 100, 10000?). Without further information it is very hard to give better advice. Please read the FAQ, which has very good advice on asking good questions.



    • #3
      The null value for an odds ratio is 1, not 0. So an odds ratio of 8.24e-08 is not "small" in the sense of almost no effect: it is enormous, but in the direction of reduced probability. It says that even small increases in your predictor variable lead to almost no probability of a positive outcome. There are a number of possibilities here. One is that you are modeling an outcome which is almost always 0 but happens to be 1 at some very small values of the continuous predictor. Another possibility is that the scale of the continuous variable is inappropriate for the analysis. And if that is the case, a log transform might make matters worse, not better. Another possibility is some data error(s) that produced a highly influential observation that is distorting everything. To give more concrete advice it would be necessary to see example data, along with descriptive statistics for the outcome and continuous predictor. It would also be helpful to look at the graph produced by -lowess outcome continuous_var, logit-.
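
A minimal sketch of the checks requested here, assuming the data are in memory with the variable names that appear later in the thread (group for the 0/1 outcome, fps for the continuous predictor). It is only one way to produce the descriptive statistics and the lowess graph, not a prescribed workflow.

```stata
* Assumption: group (0/1 outcome) and fps (continuous predictor) are in memory
* Distribution of the outcome
tabulate group
* Range, percentiles, and extreme values of the predictor
summarize fps, detail
* Smoothed relationship on the log-odds scale
lowess group fps, logit
```

The -detail- option lists the smallest and largest values, which is a quick way to spot a single extreme observation before fitting anything.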

      Comment


      • #4
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte group float fps
        1 .1491078
        1 .0410539
        1 .0961356
        1 .0440432
        1 .0485944
        1 .3304636
        1 .0840391
        1 .0613633
        1 .0526481
        1 .1971922
        1 .0738812
        1 .1518053
        0 .0542198
        1 .0945191
        1 .1400147
        1 .0630852
        1 .0508285
        1 .0680228
        1  .083058
        1 .0921024
        1 .0557303
        1 .0480706
        1  .093758
        1 .0895595
        1 .0607143
        1 .0502698
        1 .0596198
        1 .0758054
        1 .0994444
        0 .1289528
        1 .1219793
        1 .1776514
        1 .0722118
        1 .0647861
        1 .0549093
        1 .0696798
        1 .1196871
        1 .1655451
        0 .1120318
        1 .0325733
        1 .0385409
        0 .2013027
        1 .0726375
        1 .0662749
        1 .0774922
        1 .0735908
        1 .0863183
        1  .066577
        1 .0743369
        1 .1016949
        1 .1229856
        0 .0681034
        1 .0626035
        1  .156962
        1 .0557873
        1 .1065125
        1 .1135612
        1 .1207563
        1 .0864312
        1 .1162011
        1 .0905537
        1 .0728972
        1 .1347974
        1 .1386651
        0  .098441
        1 .0577838
        1 .1196481
        1 .1064916
        1 .0630386
        0 .0715643
        0 .0433135
        0 .0718835
        1 .0526946
        1 .1023371
        1 .0632832
        0 .0711144
        1 .0879387
        1 .1018519
        1 .0956392
        0 .0517711
        0 .1035422
        0 .0705446
        1 .1046987
        1 .2906195
        0 .0732839
        1  .050216
        1 .0872954
        1 .1492958
        1 .0494137
        1 .0625889
        0 .1436301
        1 .0839552
        0 .0994152
        1  .066092
        1 .1069559
        0 .0853041
        0 .0585034
        1 .1407767
        1  .058663
        0 .1839599
        end
        Here is a sample of 100 observations from the data. The outcome is group, coded 0/1. The range of fps in the full data is [.0325733, 1.645762].

        The command allows only 100 observations, and when I run the model on this sample, the results are still unusual but not as extreme as with the original data!



        Many thanks for your help.
        Last edited by Bader Bin Adwan; 03 Apr 2021, 18:29.



        • #5
          Well, from these 100 observations, nothing seems glaringly wrong. The outcome variable is a little lopsided, 19 0's and 81 1's, but that's certainly not something that -logistic- can't handle. The fps range from .03 to 1.65 is certainly appropriate, so this is not a scaling issue. I suppose it is possible that in the rest of the data, there are almost no more 1's, and the values of fps are larger than observed in the example. That could produce some problems, especially if the data set is large.

          I ran the lowess plot on the example data: it has a V-shape, which suggests that your logistic model is probably not appropriate to this data, but, again, I don't see anything in it that would account for an OR that close to zero.

          A somewhat easier plot to interpret would be: -dotplot fps, over(group)-. It's possible that will show that the data is pathological when applied to the whole data set: in the example data it looks OK.
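
For reference, the plot suggested above could be paired with a by-group numeric summary, roughly as follows. This is a sketch that assumes group and fps are in memory; the choice of statistics is illustrative.

```stata
* Assumption: group (0/1) and fps are in memory, as in the -dataex- example in #4
* One column of dots per outcome level
dotplot fps, over(group)
* Count, minimum, median, and maximum of fps within each outcome level
tabstat fps, by(group) statistics(n min p50 max)
```

Comparing the maxima across the two groups shows in numbers what the dotplot shows visually.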



          • #6
            The data set is not large; it has around 700 observations. In group there are 220 1's and 480 0's. The min and max of fps are 0.032 and 0.44 if group==1, and 0.040 and 1.64 if group==0.

            Here is the dotplot.
            [Image: Graph.png]



            • #7
              Well, you clearly have an outlier with a very large fps value and group = 0. I suspect that data point is the culprit here. The question is: is that a data error, in which case you must fix it or remove it? Or is it correct data? If it is correct, a logistic model is simply not going to work for this data.
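
One possible way to inspect the suspect observation and compare fits with and without it is sketched below. It assumes group and fps are in memory. The cutoff of 1 is a hypothetical threshold, chosen only because #6 reports that fps tops out near 0.44 when group==1.

```stata
* Assumption: group and fps are in memory; the cutoff 1 is hypothetical
* Stata treats missing values as larger than any number, so exclude them explicitly
list group fps if fps > 1 & !missing(fps)
* Fit with all observations
logistic group fps
* Fit excluding the suspected outlier
logistic group fps if fps <= 1
```

If the two odds ratios are nearly identical, the extreme point is not what drives the result.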



              • #8
                I removed that outlier and ran logistic again, but the results did not change. When I initially log-transformed the variable, the results made sense. I am not sure exactly what the problem is here; I guess the values are skewed!

                Thanks



                • #9
                  "I removed that outlier and ran logistic again, but the results did not change."
                  I find that very hard to believe. Maybe it wouldn't entirely solve your problem, but I would expect the results to change a lot. From the looks of the dotplot, I would expect that with the outlier removed the OR would be just a small amount less than 1. Can you show the code and output?



                  • #10
                    Here are the results!


                    [Image: Capture.PNG]

                    [Image: Graph2.png]



                    • #11
                      Here are the results without omitting the outlier, obs = 696. The results do not change at all!

                      [Image: Capture2.PNG]



                      • #12
                        OK, now I see it. With that one point removed, we have a scaling problem. A 1 unit change in fps is a change that is almost twice as large as the entire range of the data. The data do have group leaning towards 0 with increasing values of fps, but now when you imagine a unit change in fps, you are extrapolating way beyond the range of the data. It is not reasonable to use fps in its current scale in this model. I suggest you rescale fps by a factor of 10, so that in the new units it ranges from approximately 0 to approximately 5. If you do that, you will get an odds ratio of about 0.3, which is more sensible.
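
The rescaling suggested above might look like the sketch below. It assumes group and fps are in memory; fps10 and the scalar names are illustrative. It also checks a general property of logistic regression: multiplying the predictor by 10 divides its coefficient by 10, so the new odds ratio equals the old one raised to the power 1/10.

```stata
* Assumption: group and fps are in memory; fps10 is an illustrative name
generate fps10 = 10*fps
* Original scale: odds ratio per 1 unit of fps
logistic group fps
scalar or_fps = exp(_b[fps])
* Rescaled: odds ratio per 1 unit of fps10, i.e. per 0.1 unit of fps
logistic group fps10
scalar or_fps10 = exp(_b[fps10])
* The last two lines should agree up to numerical precision
display "OR per 1 unit of fps:   " or_fps
display "OR per 1 unit of fps10: " or_fps10
display "or_fps^(1/10):          " or_fps^(1/10)
```

The model fit, z statistic, and p-value are unchanged by the rescaling; only the unit in which the odds ratio is expressed differs.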



                        • #13
                          So, if I use gen logfps = ln(fps) and then run the logistic, the odds ratio becomes 0.16 with a 95% CI of 0.11 to 0.24. Does that mean that for a 10-fold increase in fps the odds ratio is 0.16? Or is there a better way (command) to rescale fps as you suggested?

                          Appreciate your support



                          • #14
                            Well, by using a log transformation you make it more complicated. Your proposed interpretation is way off base.

                            A 10-fold increase in fps means that logfps increases by log(10) = 2.3. An odds ratio of 0.16 corresponds to a logistic regression coefficient of -1.83. So the log odds decreases by 2.3*1.83, or 4.2, which in turn means that the odds are multiplied by exp(-4.2) = 0.015. I don't know if that's helpful or if anyone who hasn't gone through the analysis would find it understandable.
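
The arithmetic in this paragraph can be reproduced with a few -display- commands. The only inputs are the odds ratio of 0.16 reported in #13 and the 10-fold change; the scalar names are illustrative, and the doubling line is an extra example not discussed in the thread.

```stata
* Odds ratio per 1-unit increase in logfps = ln(fps), as reported in #13
scalar or_log = 0.16
* Logistic coefficient on logfps, about -1.83
scalar b_log = ln(or_log)
* Change in logfps produced by a 10-fold increase in fps, about 2.3
scalar d_log = ln(10)
display "coefficient:             " b_log
display "change in log odds:      " b_log*d_log
* Odds multiplier for a 10-fold increase, about 0.015
display "odds multiplier (10x):   " exp(b_log*d_log)
* Equivalent shortcut: raise the odds ratio to the power ln(10)
display "same via or_log^ln(10):  " or_log^ln(10)
* For a doubling of fps, use ln(2) in place of ln(10)
display "odds multiplier (2x):    " or_log^ln(2)
```

In general, with a natural-log predictor, a k-fold increase in the original variable multiplies the odds by the reported odds ratio raised to the power ln(k).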

                            Is there a reason you don't want to just change the scale of fps by a factor of 10? Just -gen fps10 = 10*fps- and then do the logistic regression that way? That seems a lot simpler to me, and then you'll just be able to say that a 1 unit increase in fps10 (that is, a 0.1 unit increase in fps) is associated with an odds ratio of (whatever, probably something around 0.3).

                            Anyway, a key thing to remember here is that odds ratios for continuous predictors are slippery: they depend very sensitively on the functional form and scale of the variable! Which also means that you can't really understand such an odds ratio unless you know what the scale of the predictor variable is.
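
To make the scale dependence concrete, a single fit can be re-expressed for different step sizes without creating new variables. This is a sketch assuming group and fps are in memory; the step sizes 0.1 and 0.01 are illustrative.

```stata
* Assumption: group and fps are in memory
logistic group fps
* Odds ratios implied by the same fitted coefficient for different step sizes
display "OR per 1 unit of fps:    " exp(_b[fps])
display "OR per 0.1 unit of fps:  " exp(0.1*_b[fps])
display "OR per 0.01 unit of fps: " exp(0.01*_b[fps])
* lincom reports the 0.1-unit odds ratio together with its confidence interval
lincom 0.1*fps, or
```

All three numbers describe the same fitted relationship, which is why an odds ratio for a continuous predictor should always be reported with its unit.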
