
  • Odds ratios and Interaction effects Interpretation

    Hi All,

    I am having some trouble interpreting interaction effects from some panel-data ordinal regression models I ran.

    I am using Stata/SE 15.

    I ran dataex for my variables of interest (shown below) as an example of my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float newid int assessmentnumber float(nature selfdiag lonely)
    1  0 0 2 .
    1  2 0 2 3
    1  3 0 2 1
    1  4 1 2 1
    1  5 0 2 3
    1  6 1 2 1
    1  8 0 2 2
    1 11 0 2 3
    4  0 0 1 .
    8  0 0 1 .
    8  3 1 1 5
    8  5 1 1 5
    8  7 1 1 5
    8  8 1 1 5
    8  9 0 1 5
    8 13 1 1 1
    8 21 1 1 5
    8 23 0 1 5
    8 36 0 1 2
    3  0 0 2 .
    3  1 1 2 4
    3  2 0 2 3
    3  5 1 2 5
    9  0 0 2 .
    9  1 1 2 3
    9  9 1 2 5
    9 22 1 2 5
    5  0 0 1 .
    7  0 0 2 .
    7  3 1 2 2
    7  4 1 2 5
    7  5 0 2 3
    6  0 0 2 .
    2  0 0 2 .
    end
    label values selfdiag noyes
    label def noyes 1 "No", modify
    label def noyes 2 "Yes", modify
    My data is panel data, with multiple participants responding to multiple assessments.

    I'm trying to explore the interaction effect between diagnosis (selfdiag: yes/no, answered once per participant) and exposure to nature (nature: yes/no, answered at every timepoint) on loneliness (lonely: ordered Likert 1-5, answered at every timepoint).

    I ran the following xtologit command to explore my query but this is where I need a bit of help with interpretation of the odds ratios.

    Code:
    xtologit lonely i.selfdiag##i.nature, or
    
    Fitting comparison model:
    
    Iteration 0:   log likelihood = -11940.828  
    Iteration 1:   log likelihood = -11853.136  
    Iteration 2:   log likelihood = -11853.023  
    Iteration 3:   log likelihood = -11853.023  
    
    Refining starting values:
    
    Grid node 0:   log likelihood = -9989.8616
    
    Fitting full model:
    
    Iteration 0:   log likelihood = -9989.8616  
    Iteration 1:   log likelihood = -9574.7404  
    Iteration 2:   log likelihood = -9532.0663  
    Iteration 3:   log likelihood = -9526.9694  
    Iteration 4:   log likelihood = -9526.7409  
    Iteration 5:   log likelihood = -9526.7408  
    
    Random-effects ordered logistic regression      Number of obs     =      9,575
    Group variable: newid                           Number of groups  =        339
    
    Random effects u_i ~ Gaussian                   Obs per group:
                                                                  min =         21
                                                                  avg =       28.2
                                                                  max =         42
    
    Integration method: mvaghermite                 Integration pts.  =         12
    
                                                    Wald chi2(3)      =      56.33
    Log likelihood  = -9526.7408                    Prob > chi2       =     0.0000
    
    ---------------------------------------------------------------------------------
             lonely | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
                    |
           selfdiag |
               Yes  |   1.558753   .4191598     1.65   0.099     .9202021    2.640411
           1.nature |   .6323547   .0426758    -6.79   0.000     .5540075    .7217816
                    |
    selfdiag#nature |
             Yes#1  |   1.301939   .1564135     2.20   0.028     1.028794    1.647605
    ----------------+----------------------------------------------------------------
              /cut1 |   .0264557   .1458407                     -.2593867    .3122982
              /cut2 |   1.834789   .1471281                      1.546424    2.123155
              /cut3 |   3.232257   .1503294                      2.937617    3.526897
              /cut4 |   5.114032   .1645409                      4.791537    5.436526
    ----------------+----------------------------------------------------------------
          /sigma2_u |   4.156539   .3770802                      3.479454    4.965382
    ---------------------------------------------------------------------------------
    Note: Estimates are transformed only in the first equation.
    LR test vs. ologit model: chibar2(01) = 4652.56       Prob >= chibar2 = 0.0000
    The main effect of a diagnosis here is non-significant, and the main effect of nature is significant.
    I can see that there is a significant interaction effect, but what does this mean?

    Having a diagnosis (yes) and exposure to nature (1) increases the odds of higher loneliness by 1.30 times? Compared to what? Compared to having no diagnosis (no) and no exposure to nature (0)?

    Am I correct in thinking that compared to no diagnosis/no nature:
    • Diagnosis/no nature = 1.55 times increased odds of higher loneliness
    • Diagnosis/nature = 1.30 times increased odds of higher loneliness
    • No diagnosis/no nature = reference
    • No diagnosis/nature = 0.63 times decreased odds of higher loneliness
    Would this mean that people with diagnosis are at increased odds of higher lonely scores, but these odds are reduced in contact with nature??

    Thanks for any help you can provide!

    Kind regards,
    Ryan

  • #2
    Hi Everyone,

    I was wondering if anyone has had a chance to read through my post and has any advice on this?
    I am just hoping to figure out how to interpret this so that I can understand interactions and odds ratios in the future!

    Kind regards,
    Ryan



    • #3
      The main effect of a diagnosis here is non-significant, and the main effect of nature is significant.
      I can see that there is a significant interaction effect, but what does this mean?
      None of this means anything. The concept of statistical significance is a snake pit to start with, and the American Statistical Association now recommends it be abandoned altogether. See https://www.tandfonline.com/doi/full...5.2019.1583913. But even if you want to keep using the concept, the statistical significance of these coefficients by themselves has no tangible interpretation whatsoever, never did.

      Having a diagnosis (yes) and exposure to nature (1) increases the odds of higher loneliness by 1.30 times? Compared to what? Compared to having no diagnosis (no) and no exposure to nature (0)?
      No. That term in the output, which is a ratio of odds ratios, not an odds ratio, has no simple tangible interpretation.

      Am I correct in thinking that compared to no diagnosis/no nature:
      • Diagnosis/no nature = 1.55 times increased odds of higher loneliness
      • Diagnosis/nature = 1.30 times increased odds of higher loneliness
      • No diagnosis/no nature = reference
      • No diagnosis/nature = 0.63 times decreased odds of higher loneliness
      Well, I'm not sure what your / notation is supposed to mean here. But those with a diagnosis and no nature have 1.55 times as great an odds of higher loneliness as those with neither diagnosis nor nature. The 1.30 on the interaction row is not the odds ratio for having both exposures: it is a ratio of odds ratios, the factor by which the nature odds ratio is multiplied when a diagnosis is present (and, equivalently, the factor by which the diagnosis odds ratio is multiplied under nature exposure). I think the way to look at this, however, is by calculating the odds ratios of higher loneliness in each of the four combinations:

      No diagnosis and no nature: reference group
      Diagnosis and no nature: Odds ratio = 1.56 compared to reference
      Nature and no diagnosis: Odds ratio = 0.63 compared to reference
      Nature and diagnosis: Odds ratio = 1.56*0.63*1.30 (= 1.28)
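
      As a quick arithmetic check (using nothing beyond the estimates already shown in the output above), that last figure can be reproduced directly in Stata:

      Code:
      * product of the two main-effect odds ratios and the interaction term: about 1.28
      display 1.558753 * .6323547 * 1.301939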

      Would this mean that people with diagnosis are at increased odds of higher lonely scores, but these odds are reduced in contact with nature??
      So compared to those with neither exposure, those with diagnosis have increased odds of higher lonely scores, by a factor of 1.56. If they are also exposed to nature, then that factor decreases to 1.28--which means they are still at increased odds of higher lonely scores, but not by quite as much.

      You might find it easier to understand these results if you look at predicted probabilities of each level of loneliness in all four groups. Odds ratios are not intuitive to people until they work with them for a long time--and even then some people fail to grasp them clearly. Probabilities are more easily understood. Try running:

      Code:
      margins selfdiag#nature
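
      If it helps to focus on one level of the outcome at a time, something along these lines may work (a sketch only; it assumes -margins- accepts -predict-'s outcome() option after -xtologit-, and outcome level 5 is chosen purely for illustration -- see -help xtologit postestimation-):

      Code:
      * predicted probability of lonely == 5 in each of the four exposure groups (assumed syntax)
      margins selfdiag#nature, predict(outcome(5))
      marginsplot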



      • #4
        I am new to Stata and just came across this thread and find it very helpful. Three follow-up questions for Professor Schechter:
        1) Nature and diagnosis: Odds ratio = 1.56*0.63*1.30 (= 1.28), is the odds ratio 1.28 compared to the reference, like the previous two rows (OR 1.56 and OR 0.63)?
        2) Is there a Stata command to output 1.28 directly?
        3) What is the meaning of
        selfdiag#nature | Yes#1 | 1.301939



        • #5
          1) Nature and diagnosis: Odds ratio = 1.56*0.63*1.30 (= 1.28), is the odds ratio 1.28 compared to the reference, like the previous two rows (OR 1.56 and OR 0.63)?
          Yes, it is relative to the joint reference category of nature == 0 and diagnosis == "No".

          2) Is there a Stata command to output 1.28 directly?
          Code:
          * 2.selfdiag is the "Yes" level (selfdiag is coded 1 = No, 2 = Yes in the example data)
          lincom 2.selfdiag + 1.nature + 2.selfdiag#1.nature, or
          In fact, this gives you not just the odds ratio of 1.28 but also its standard error, z-statistic, p-value, and confidence interval.

          3) What is the meaning of
          selfdiag#nature | Yes#1 | 1.301939
          This is the interaction effect in the odds ratio metric. In a linear regression, a term like selfdiag#nature would be the difference between the effect of nature when selfdiag = Yes and its effect when selfdiag = No. Note that the effect of nature (conditional on some specified value of selfdiag) is itself a difference: the expected outcome when nature = 1 minus the expected outcome when nature = 0. So in a linear model, this coefficient is a difference-in-differences. Analogous to that, in a logistic regression, given that odds ratios are multiplicative rather than additive effects, this result is a ratio of odds ratios (sometimes abbreviated ROR).

          Now, odds ratios are difficult for many people to understand (and many people who think they understand them actually don't.) Ratios of odds ratios can be mind-boggling. So, for this reason, among others, in logistic models with interactions we often don't report the ratio of odds ratios. Instead we might go back to the probability metric of the outcome and report a difference in differences. That cannot be gleaned from the -logistic- output itself: to get that one would have to use the -margins- command. You will see it done both ways.
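
          If you want to see that difference-in-differences on the probability scale for the model above, one possible sketch (assuming -margins- accepts contrast operators and -predict-'s outcome() option after -xtologit-; outcome level 5 is used purely for illustration) is:

          Code:
          * predicted probability of lonely == 5 in each exposure group (assumed syntax)
          margins selfdiag#nature, predict(outcome(5))
          * interaction contrast of those probabilities, i.e., the difference-in-differences
          margins r.selfdiag#r.nature, predict(outcome(5))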



          • #6
            Thanks so much Professor Schechter for explaining this to me.

            Regarding selfdiag#nature | Yes#1 | 1.301939: does it matter which term is placed first? It is the difference in the effect of nature between selfdiag = No and selfdiag = Yes. Would nature#selfdiag be the difference in the effect of selfdiag between nature = No and nature = Yes?

            How do we know the reference category is selfdiag = No and nature = No, and how does that situation relate to loneliness?
            Last edited by Jian Wang; 17 Aug 2023, 10:31.



            • #7
              Would nature#selfdiag be the difference in the effect of selfdiag between nature = No and nature = Yes?
              No, there is no substantive difference between nature#selfdiag and selfdiag#nature. And, written either way, it shows the ratio of odds ratios for nature when selfdiag = Yes vs selfdiag = No. It also shows the ratio of odds ratios for selfdiag when nature = 1 vs nature = 0. These two ratios of odds ratios are always the same.
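
              A quick way to see this for yourself, using the same model as above (nothing new is being estimated, only the display order changes):

              Code:
              xtologit lonely i.selfdiag##i.nature, or
              xtologit lonely i.nature##i.selfdiag, or
              * the interaction row (about 1.30) is identical in both outputs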



              • #8
                Professor Schechter, I greatly appreciate your answers above. This is very helpful for my current project. I have found a public dataset much like my research dataset, and I'd like to use it to ask a few further questions along the lines of the discussion above. I will post my questions here, but if you think I should start a new post, I will do that. Here are the questions. The code runs in Stata/MP 18.0, and my questions are labelled Q1, Q2, etc.


                *** Are antipsychotic drugs effective for patients with schizophrenia?

                use https://www.stata-press.com/data/mlmus4/schiz, clear

                dataex



                /*
                . dataex

                ----------------------- copy starting from the next line -----------------------
                [CODE]
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float(id imps week treatment sex)
                1103 5.5 0 1 1
                1103 3 1 1 1
                1103 2.5 3 1 1
                1103 4 6 1 1
                1104 6 0 1 1
                1104 3 1 1 1
                1104 1.5 3 1 1
                1104 2.5 6 1 1
                1105 4 0 1 1
                1105 3 1 1 1
                1105 1 3 1 1
                1106 3 0 1 1

                */


                set seed 123

                generate impso = imps
                recode impso -9=. 1/2.4=1 2.5/4.4=2 4.5/5.4=3 5.5/7=4
                recode week 0=0 1=24 2=36 3=52 4=64 5=76 6=90


                xtset id week
                xtdescribe if !missing(impso)
                table week treatment, statistic(count impso)

                ********************************************************************************************
                **** Step 1: random-intercept ordinal longitudinal model, week as a continuous variable
                ********************************************************************************************

                meologit impso c.week##i.treatment || id: , ///
                or covariance(unstructured)
                estimate store m1

                /*

                                                                Wald chi2(3)      =     468.14
                Log likelihood = -1718.1417                     Prob > chi2       =     0.0000
                ----------------------------------------------------------------------------------
                           impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -----------------+----------------------------------------------------------------
                            week |   .979565   .0034883    -5.80   0.000     .9727518    .9864259
                     1.treatment |  .7116673   .2084615    -1.16   0.246     .4008143    1.263604
                                 |
                treatment#c.week |
                              1  |  .9696156   .0040077    -7.47   0.000     .9617925    .9775024
                -----------------+----------------------------------------------------------------
                           /cut1 | -5.647717   .3144359                        -6.264   -5.031434
                           /cut2 | -2.620962   .2710405                     -3.152191   -2.089732
                           /cut3 |  -.580448   .2575332                     -1.085204   -.0756922
                -----------------+----------------------------------------------------------------
                id               |
                       var(_cons)|   3.56976   .4449255                      2.796066     4.55754
                ----------------------------------------------------------------------------------
                Note: Estimates are transformed only in the first equation to odds ratios.
                LR test vs. ologit model: chibar2(01) = 336.44        Prob >= chibar2 = 0.0000


                */


                ** week as a continuous variable; the unit is one week
                // I care about the odds ratios at 52, 72, and 90 weeks, using lincom
                lincom 52* 1.treatment#c.week, or

                /*


                ------------------------------------------------------------------------------
                       impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         (1) |  .2009923    .043199     -7.47   0.000      .131896     .306286
                ------------------------------------------------------------------------------

                */

                // Q1 Is this the correct calculation of the odds ratio for week 52 above?
                // Q2 Is this interpretation correct? For treated patients at week 52, the odds of a higher severity score vs a lower score are 0.2 times those of placebo patients at week 52

                lincom week + 1.treatment + 52* 1.treatment#c.week, or

                /*

                ------------------------------------------------------------------------------
                       impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         (1) |  .1401166   .0368463    -7.47   0.000     .0836855    .2346006
                ------------------------------------------------------------------------------

                */

                // Q3 Is this interpretation correct? For treated patients at week 52, the odds of a higher severity score vs a lower score are 0.14 times those of placebo patients at week 0

                *************************************************************************************
                *** Step 2: random-intercept ordinal longitudinal model, week as a categorical variable
                *************************************************************************************

                ** week as a categorical variable because the weeks are not evenly spaced, e.g. 36 and 52, 72 and 90

                meologit impso i.week##i.treatment || id: , ///
                or covariance(unstructured)
                estimate store m2

                /*
                Log likelihood = -1697.1182                     Prob > chi2       =     0.0000
                --------------------------------------------------------------------------------
                         impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                ---------------+----------------------------------------------------------------
                          week |
                           24  |  .4390883   .1339682    -2.70   0.007     .2414603     .798469
                           36  |  1.556353   2.171234     0.32   0.751     .1010685    23.96628
                           52  |  .2893764   .0916327    -3.92   0.000     .1555697     .538271
                           64  |  .2330608   .3900584    -0.87   0.384     .0087674    6.195394
                           76  |  .1053475   .1789698    -1.32   0.185     .0037719    2.942319
                           90  |  .1410499    .048347    -5.71   0.000     .0720462    .2761435
                               |
                   1.treatment |  1.004031   .3394933     0.01   0.991     .5175238    1.947889
                               |
                week#treatment |
                         24 1  |   .269669   .0945759    -3.74   0.000     .1356143    .5362369
                         36 1  |  .0152256   .0244867    -2.60   0.009     .0006511    .3560601
                         52 1  |  .1228219   .0450767    -5.71   0.000     .0598243    .2521586
                         64 1  |  .0205406    .038285    -2.08   0.037     .0005322    .7927546
                         76 1  |  .1436847   .2762295    -1.01   0.313     .0033189    6.220427
                         90 1  |  .0529062   .0210895    -7.37   0.000     .0242215    .1155613
                ---------------+----------------------------------------------------------------
                         /cut1 | -5.864446   .3498227                     -6.550086   -5.178806
                         /cut2 | -2.821536   .3098026                     -3.428738   -2.214334
                         /cut3 | -.6987083   .2947749                     -1.276456   -.1209601
                ---------------+----------------------------------------------------------------
                id             |
                     var(_cons)|  3.748761   .4645762                      2.940357    4.779422
                --------------------------------------------------------------------------------

                */

                // Q4 The week#treatment odds ratio at week 52 (.1228219) is very different from the value I calculated in Q1 of Step 1 (0.2); it is closer to the Q3 calculation, which seems very strange
                // Q5 Interpretation: for treated patients at week 52, the odds of a higher severity score vs a lower score are 0.12 times those of placebo patients at week 52. But the different values from the continuous-variable model are troubling

                lincom 1.treatment + 52.week#1.treatment, or
                /*
                ------------------------------------------------------------------------------
                       impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         (1) |   .123317   .0424487    -6.08   0.000     .0628086    .2421181
                ------------------------------------------------------------------------------

                */

                // Q6 Is this interpretation correct? For treated patients at week 52, the odds of a higher severity score vs a lower score are 0.12 times those of placebo patients at week 0

                *********************************************************************
                *** Step 3: calculate probabilities using margins and use marginsplot
                *********************************************************************

                margins treatment, at(week=(24 36 52 64 76 90))

                // Q7 The plots are very busy; I'd like to put each impso level on a separate subgraph
                marginsplot, xdimension(at(week)) allxlabels recast(line) ciopt(lcolor(gray)) ///
                plot1opts(lcolor(green) lpattern(dash)) ///
                plot2opts(lcolor(blue) lpattern(dash)) ///
                plot3opts(lcolor(red) lpattern(dash)) ///
                plot4opts(lcolor(green) lpattern(solid)) ///
                plot5opts(lcolor(blue) lpattern(solid)) ///
                plot6opts(lcolor(red) lpattern(solid))

                *************************************************************************
                *** Step 4: plot odds ratios
                *************************************************************************

                // Q8 I see people plot odds ratios: essentially two lines, with the odds ratio on the y axis and week on the x axis,
                // one line for treatment 0 and another for treatment 1.
                // Assuming proportional odds holds, this would give a rough general idea of the treatment 0 and 1 trajectories over the weeks (0, 24, 36, etc.). I know probabilities are the correct way to approach this, but my boss wants to see the two odds-ratio lines.
                // Is there an easier way to extract these values from m1 or m2, besides running lincom for each of the 7 time points individually and constructing such a figure?

                // Q9 I am not sure whether I should use computations like Q2 (from m1) and Q4 (from m2) for such a plot, or instead Q3 and Q6

                ***************************************************************************
                *** Step 5: random-intercept and random-slope ordinal longitudinal model
                ***************************************************************************

                // I really want to use a random intercept and a random slope, but
                // I get the error message: initial values not feasible
                // r(1400);
                // Q10 How do I find better initial values?

                meologit impso i.week##i.treatment || id: week, ///
                or covariance(unstructured)
                estimate store m3


                // In order to model a random intercept and slope, I tried sqrt(week),
                // but I still get: initial values not feasible
                // r(1400);
                // How do I find better initial values?

                generate weeksqrt = sqrt(week)

                meologit impso c.weeksqrt##i.treatment || id: weeksqrt, ///
                or covariance(unstructured)
                estimate store m4




                Your advice and guidance are greatly appreciated!



                • #9
                  This is a lot of material to go through, and is probably more appropriately discussed with your advisor/supervisor than on this Forum. Having reviewed it lightly, I want to make just one important comment. It is clear from the results you show that the model using week as a continuous variable is not appropriate for your data. If it were, the coefficients of the i.week and i.week#treatment variables in the discrete week model would look like they also vary linearly with week--but clearly they do not. So I would not waste more time and effort interpreting the model with c.week. It's just a poor fit to your data. Focus on the i.week model.
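
                  If you want a formal check on that impression: the c.week specification is nested in the i.week specification, so, assuming both models converged on the same estimation sample (as your output suggests), a likelihood-ratio test of the linearity restriction can be run from the estimates you already stored:

                  Code:
                  * m1 (c.week##i.treatment) is the restricted model, m2 (i.week##i.treatment) the unrestricted one
                  lrtest m1 m2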

                  With that in mind, I suggest that, after eliminating the c.week model, you break this lengthy post down into smaller, more digestible, pieces and post them in new threads. You might start one thread that's about interpreting the odds ratios in the i.week model, another on how best to graph the -margins- results, and yet another on the initial values not feasible problem with the random-slopes model.





                  • #10
                    Thanks so much, Professor Schechter. That's a great suggestion!

