
  • Odds ratios and Interaction effects Interpretation

    Hi All,

    I am having some trouble interpreting interaction effects from some panel-data ordinal regression models I ran.

    I am using Stata/SE 15.

    I ran dataex for my variables of interest (shown below) as an example of my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float newid int assessmentnumber float(nature selfdiag lonely)
    1  0 0 2 .
    1  2 0 2 3
    1  3 0 2 1
    1  4 1 2 1
    1  5 0 2 3
    1  6 1 2 1
    1  8 0 2 2
    1 11 0 2 3
    4  0 0 1 .
    8  0 0 1 .
    8  3 1 1 5
    8  5 1 1 5
    8  7 1 1 5
    8  8 1 1 5
    8  9 0 1 5
    8 13 1 1 1
    8 21 1 1 5
    8 23 0 1 5
    8 36 0 1 2
    3  0 0 2 .
    3  1 1 2 4
    3  2 0 2 3
    3  5 1 2 5
    9  0 0 2 .
    9  1 1 2 3
    9  9 1 2 5
    9 22 1 2 5
    5  0 0 1 .
    7  0 0 2 .
    7  3 1 2 2
    7  4 1 2 5
    7  5 0 2 3
    6  0 0 2 .
    2  0 0 2 .
    end
    label values selfdiag noyes
    label def noyes 1 "No", modify
    label def noyes 2 "Yes", modify
    My data is panel data, with multiple participants responding to multiple assessments.

    I'm trying to explore the interaction effect between diagnosis (selfdiag: yes/no, answered once per participant) and exposure to nature (nature: yes/no, answered at every timepoint) on loneliness (lonely: ordered Likert 1-5, answered at every timepoint).

    I ran the following xtologit command to explore my query but this is where I need a bit of help with interpretation of the odds ratios.

    Code:
    xtologit lonely i.selfdiag##i.nature, or
    
    Fitting comparison model:
    
    Iteration 0:   log likelihood = -11940.828  
    Iteration 1:   log likelihood = -11853.136  
    Iteration 2:   log likelihood = -11853.023  
    Iteration 3:   log likelihood = -11853.023  
    
    Refining starting values:
    
    Grid node 0:   log likelihood = -9989.8616
    
    Fitting full model:
    
    Iteration 0:   log likelihood = -9989.8616  
    Iteration 1:   log likelihood = -9574.7404  
    Iteration 2:   log likelihood = -9532.0663  
    Iteration 3:   log likelihood = -9526.9694  
    Iteration 4:   log likelihood = -9526.7409  
    Iteration 5:   log likelihood = -9526.7408  
    
    Random-effects ordered logistic regression      Number of obs     =      9,575
    Group variable: newid                           Number of groups  =        339
    
    Random effects u_i ~ Gaussian                   Obs per group:
                                                                  min =         21
                                                                  avg =       28.2
                                                                  max =         42
    
    Integration method: mvaghermite                 Integration pts.  =         12
    
                                                    Wald chi2(3)      =      56.33
    Log likelihood  = -9526.7408                    Prob > chi2       =     0.0000
    
    ---------------------------------------------------------------------------------
             lonely | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
                    |
           selfdiag |
               Yes  |   1.558753   .4191598     1.65   0.099     .9202021    2.640411
           1.nature |   .6323547   .0426758    -6.79   0.000     .5540075    .7217816
                    |
    selfdiag#nature |
             Yes#1  |   1.301939   .1564135     2.20   0.028     1.028794    1.647605
    ----------------+----------------------------------------------------------------
              /cut1 |   .0264557   .1458407                     -.2593867    .3122982
              /cut2 |   1.834789   .1471281                      1.546424    2.123155
              /cut3 |   3.232257   .1503294                      2.937617    3.526897
              /cut4 |   5.114032   .1645409                      4.791537    5.436526
    ----------------+----------------------------------------------------------------
          /sigma2_u |   4.156539   .3770802                      3.479454    4.965382
    ---------------------------------------------------------------------------------
    Note: Estimates are transformed only in the first equation.
    LR test vs. ologit model: chibar2(01) = 4652.56       Prob >= chibar2 = 0.0000
    The main effect of a diagnosis here is non-significant, and the main effect of nature is significant.
    I can see that there is a significant interaction effect, but what does this mean?

    Having a diagnosis (yes) and exposure to nature (1) increases the odds of higher loneliness by 1.30 times? Compared to what? Compared to having no diagnosis (no) and no exposure to nature (0)?

    Am I correct in thinking that compared to no diagnosis/no nature:
    • Diagnosis/no nature = 1.55 times increased odds of higher loneliness
    • Diagnosis/nature = 1.30 times increased odds of higher loneliness
    • No diagnosis/no nature = reference
    • No diagnosis/nature = 0.63 times decreased odds of higher loneliness
    Would this mean that people with diagnosis are at increased odds of higher lonely scores, but these odds are reduced in contact with nature??

    Thanks for any help you can provide!

    Kind regards,
    Ryan

  • #2
    Hi Everyone,

    I was wondering if anyone has had a chance to read through my post and has any advice on this?
    I am just hoping to figure out how to interpret this so that I can understand interactions and odds ratios in the future!

    Kind regards,
    Ryan



    • #3
      The main effect of a diagnosis here is non-significant, and the main effect of nature is significant.
      I can see that there is a significant interaction effect, but what does this mean?
      None of this means anything. The concept of statistical significance is a snake pit to start with, and the American Statistical Association now recommends it be abandoned altogether. See https://www.tandfonline.com/doi/full...5.2019.1583913. But even if you want to keep using the concept, the statistical significance of these coefficients by themselves has no tangible interpretation whatsoever, never did.

      Having a diagnosis (yes) and exposure to nature (1) increases the odds of higher loneliness by 1.30 times? Compared to what? Compared to having no diagnosis (no) and no exposure to nature (0)?
      No. That term in the output, which is a ratio of odds ratios, not an odds ratio, has no simple tangible interpretation.

      Am I correct in thinking that compared to no diagnosis/no nature:
      • Diagnosis/no nature = 1.55 times increased odds of higher loneliness
      • Diagnosis/nature = 1.30 times increased odds of higher loneliness
      • No diagnosis/no nature = reference
      • No diagnosis/nature = 0.63 times decreased odds of higher loneliness
      Well, I'm not sure what your / notation is supposed to mean here. But those with a diagnosis and no nature have 1.55 times as great an odds of higher loneliness as those with neither diagnosis nor nature. The 1.30 on the interaction row is not the odds ratio for having both exposures: it is a ratio of odds ratios, the factor by which the nature odds ratio is multiplied when a diagnosis is present (and, equivalently, the factor by which the diagnosis odds ratio is multiplied under nature exposure). I think the way to look at this, however, is by calculating the odds ratios of higher loneliness in each of the four combinations:

      No diagnosis and no nature: reference group
      Diagnosis and no nature: Odds ratio = 1.56 compared to reference
      Nature and no diagnosis: Odds ratio = 0.63 compared to reference
      Nature and diagnosis: Odds ratio = 1.56*0.63*1.30 (= 1.28)
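
      As a quick arithmetic check (using nothing beyond the estimates already shown in the output above), that last figure can be reproduced directly in Stata:

      Code:
      * product of the two main-effect odds ratios and the interaction term: about 1.28
      display 1.558753 * .6323547 * 1.301939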

      Would this mean that people with diagnosis are at increased odds of higher lonely scores, but these odds are reduced in contact with nature??
      So compared to those with neither exposure, those with diagnosis have increased odds of higher lonely scores, by a factor of 1.56. If they are also exposed to nature, then that factor decreases to 1.28--which means they are still at increased odds of higher lonely scores, but not by quite as much.

      You might find it easier to understand these results if you look at predicted probabilities of each level of loneliness in all four groups. Odds ratios are not intuitive to people until they work with them for a long time--and even then some people fail to grasp them clearly. Probabilities are more easily understood. Try running:

      Code:
      margins selfdiag#nature
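
      If it helps to focus on one level of the outcome at a time, something along these lines may work (a sketch only; it assumes -margins- accepts -predict-'s outcome() option after -xtologit-, and outcome level 5 is chosen purely for illustration -- see -help xtologit postestimation-):

      Code:
      * predicted probability of lonely == 5 in each of the four exposure groups (assumed syntax)
      margins selfdiag#nature, predict(outcome(5))
      marginsplot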



      • #4
        I am new to Stata and just came across this thread and find it very helpful. Three follow-up questions for Professor Schechter:
        1) Nature and diagnosis: Odds ratio = 1.56*0.63*1.30 (= 1.28), is the odds ratio 1.28 compared to the reference, like the previous two rows (OR 1.56 and OR 0.63)?
        2) Is there a Stata command to output 1.28 directly?
        3) What is the meaning of
        selfdiag#nature | Yes#1 | 1.301939



        • #5
          1) Nature and diagnosis: Odds ratio = 1.56*0.63*1.30 (= 1.28), is the odds ratio 1.28 compared to the reference, like the previous two rows (OR 1.56 and OR 0.63)?
          Yes, it is relative to the joint reference category of nature == 0 and diagnosis == "No".

          2) Is there a Stata command to output 1.28 directly?
          Code:
          * 2.selfdiag is the "Yes" level (selfdiag is coded 1 = No, 2 = Yes in the example data)
          lincom 2.selfdiag + 1.nature + 2.selfdiag#1.nature, or
          In fact, this gives you not just the odds ratio of 1.28 but also its standard error, z-statistic, p-value, and confidence interval.

          3) What is the meaning of
          selfdiag#nature | Yes#1 | 1.301939
          This is the interaction effect in the odds ratio metric. In a linear regression, a term like selfdiag#nature would be the difference between the effect of nature when selfdiag = Yes and its effect when selfdiag = No. Note that the effect of nature (conditional on some specified value of selfdiag) is itself a difference: the expected outcome when nature = 1 minus the expected outcome when nature = 0. So in a linear model, this coefficient is a difference-in-differences. Analogous to that, in a logistic regression, given that odds ratios are multiplicative rather than additive effects, this result is a ratio of odds ratios (sometimes abbreviated ROR).

          Now, odds ratios are difficult for many people to understand (and many people who think they understand them actually don't.) Ratios of odds ratios can be mind-boggling. So, for this reason, among others, in logistic models with interactions we often don't report the ratio of odds ratios. Instead we might go back to the probability metric of the outcome and report a difference in differences. That cannot be gleaned from the -logistic- output itself: to get that one would have to use the -margins- command. You will see it done both ways.
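
          If you want to see that difference-in-differences on the probability scale for the model above, one possible sketch (assuming -margins- accepts contrast operators and -predict-'s outcome() option after -xtologit-; outcome level 5 is used purely for illustration) is:

          Code:
          * predicted probability of lonely == 5 in each exposure group (assumed syntax)
          margins selfdiag#nature, predict(outcome(5))
          * interaction contrast of those probabilities, i.e., the difference-in-differences
          margins r.selfdiag#r.nature, predict(outcome(5))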



          • #6
            Thanks so much Professor Schechter for explaining this to me.

            Regarding selfdiag#nature | Yes#1 | 1.301939: does it matter which term is placed first? It is the difference in the effect of nature between selfdiag = No and selfdiag = Yes. Would nature#selfdiag be the difference in the effect of selfdiag between nature = No and nature = Yes?

            How do we know the reference category is selfdiag = No and nature = No, and how does that situation relate to loneliness?
            Last edited by Jian Wang; 17 Aug 2023, 10:31.



            • #7
              Would nature#selfdiag be the difference in the effect of selfdiag between nature = No and nature = Yes?
              No, there is no substantive difference between nature#selfdiag and selfdiag#nature. And, written either way, it shows the ratio of odds ratios for nature when selfdiag = Yes vs selfdiag = No. It also shows the ratio of odds ratios for selfdiag when nature = 1 vs nature = 0. These two ratios of odds ratios are always the same.
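
              A quick way to see this for yourself, using the same model as above (nothing new is being estimated, only the display order changes):

              Code:
              xtologit lonely i.selfdiag##i.nature, or
              xtologit lonely i.nature##i.selfdiag, or
              * the interaction row (about 1.30) is identical in both outputs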



              • #8
                Professor Schechter, I greatly appreciate your answers above. This is very helpful for my current project. I have found a public dataset much like my research dataset, and I'd like to use it to ask a few further questions along the lines of the discussion above. I will post my questions here, but if you think I should start a new post, I will do that. Here are the questions. The code runs in Stata/MP 18.0, and my questions are labelled Q1, Q2, etc.


                *** Are antipsychotic drugs effective for patients with schizophrenia?

                use https://www.stata-press.com/data/mlmus4/schiz, clear

                dataex



                /*
                . dataex

                ----------------------- copy starting from the next line -----------------------
                [CODE]
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float(id imps week treatment sex)
                1103 5.5 0 1 1
                1103 3 1 1 1
                1103 2.5 3 1 1
                1103 4 6 1 1
                1104 6 0 1 1
                1104 3 1 1 1
                1104 1.5 3 1 1
                1104 2.5 6 1 1
                1105 4 0 1 1
                1105 3 1 1 1
                1105 1 3 1 1
                1106 3 0 1 1

                */


                set seed 123

                generate impso = imps
                recode impso -9=. 1/2.4=1 2.5/4.4=2 4.5/5.4=3 5.5/7=4
                recode week 0=0 1=24 2=36 3=52 4=64 5=76 6=90


                xtset id week
                xtdescribe if !missing(impso)
                table week treatment, statistic(count impso)

                ********************************************************************************************
                **** Step 1: random-intercept ordinal longitudinal model, week as a continuous variable
                ********************************************************************************************

                meologit impso c.week##i.treatment || id: , ///
                or covariance(unstructured)
                estimate store m1

                /*

                                                                Wald chi2(3)      =     468.14
                Log likelihood = -1718.1417                     Prob > chi2       =     0.0000
                ----------------------------------------------------------------------------------
                           impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -----------------+----------------------------------------------------------------
                            week |   .979565   .0034883    -5.80   0.000     .9727518    .9864259
                     1.treatment |  .7116673   .2084615    -1.16   0.246     .4008143    1.263604
                                 |
                treatment#c.week |
                              1  |  .9696156   .0040077    -7.47   0.000     .9617925    .9775024
                -----------------+----------------------------------------------------------------
                           /cut1 | -5.647717   .3144359                        -6.264   -5.031434
                           /cut2 | -2.620962   .2710405                     -3.152191   -2.089732
                           /cut3 |  -.580448   .2575332                     -1.085204   -.0756922
                -----------------+----------------------------------------------------------------
                id               |
                       var(_cons)|   3.56976   .4449255                      2.796066     4.55754
                ----------------------------------------------------------------------------------
                Note: Estimates are transformed only in the first equation to odds ratios.
                LR test vs. ologit model: chibar2(01) = 336.44        Prob >= chibar2 = 0.0000


                */


                ** week as a continuous variable; the unit is one week
                // I care about the odds ratios at 52, 72, and 90 weeks, using lincom
                lincom 52* 1.treatment#c.week, or

                /*


                ------------------------------------------------------------------------------
                       impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         (1) |  .2009923    .043199     -7.47   0.000      .131896     .306286
                ------------------------------------------------------------------------------

                */

                // Q1 Is this the correct calculation of the odds ratio for week 52 above?
                // Q2 Is this interpretation correct? For treated patients at week 52, the odds of a higher severity score vs a lower score are 0.2 times those of placebo patients at week 52

                lincom week + 1.treatment + 52* 1.treatment#c.week, or

                /*

                ------------------------------------------------------------------------------
                       impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         (1) |  .1401166   .0368463    -7.47   0.000     .0836855    .2346006
                ------------------------------------------------------------------------------

                */

                // Q3 Is this interpretation correct? For treated patients at week 52, the odds of a higher severity score vs a lower score are 0.14 times those of placebo patients at week 0

                *************************************************************************************
                *** Step 2: random-intercept ordinal longitudinal model, week as a categorical variable
                *************************************************************************************

                ** week as a categorical variable because the weeks are not evenly spaced, e.g. 36 and 52, 72 and 90

                meologit impso i.week##i.treatment || id: , ///
                or covariance(unstructured)
                estimate store m2

                /*
                Log likelihood = -1697.1182                     Prob > chi2       =     0.0000
                --------------------------------------------------------------------------------
                         impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                ---------------+----------------------------------------------------------------
                          week |
                           24  |  .4390883   .1339682    -2.70   0.007     .2414603     .798469
                           36  |  1.556353   2.171234     0.32   0.751     .1010685    23.96628
                           52  |  .2893764   .0916327    -3.92   0.000     .1555697     .538271
                           64  |  .2330608   .3900584    -0.87   0.384     .0087674    6.195394
                           76  |  .1053475   .1789698    -1.32   0.185     .0037719    2.942319
                           90  |  .1410499    .048347    -5.71   0.000     .0720462    .2761435
                               |
                   1.treatment |  1.004031   .3394933     0.01   0.991     .5175238    1.947889
                               |
                week#treatment |
                         24 1  |   .269669   .0945759    -3.74   0.000     .1356143    .5362369
                         36 1  |  .0152256   .0244867    -2.60   0.009     .0006511    .3560601
                         52 1  |  .1228219   .0450767    -5.71   0.000     .0598243    .2521586
                         64 1  |  .0205406    .038285    -2.08   0.037     .0005322    .7927546
                         76 1  |  .1436847   .2762295    -1.01   0.313     .0033189    6.220427
                         90 1  |  .0529062   .0210895    -7.37   0.000     .0242215    .1155613
                ---------------+----------------------------------------------------------------
                         /cut1 | -5.864446   .3498227                     -6.550086   -5.178806
                         /cut2 | -2.821536   .3098026                     -3.428738   -2.214334
                         /cut3 | -.6987083   .2947749                     -1.276456   -.1209601
                ---------------+----------------------------------------------------------------
                id             |
                     var(_cons)|  3.748761   .4645762                      2.940357    4.779422
                --------------------------------------------------------------------------------

                */

                // Q4 The week#treatment odds ratio at week 52 (.1228219) is very different from the value I calculated in Q1 of Step 1 (0.2); it is closer to the Q3 calculation, which seems very strange
                // Q5 Interpretation: for treated patients at week 52, the odds of a higher severity score vs a lower score are 0.12 times those of placebo patients at week 52. But the different values from the continuous-variable model are troubling

                lincom 1.treatment + 52.week#1.treatment, or
                /*
                ------------------------------------------------------------------------------
                       impso | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         (1) |   .123317   .0424487    -6.08   0.000     .0628086    .2421181
                ------------------------------------------------------------------------------

                */

                // Q6 Is this interpretation correct? For treated patients at week 52, the odds of a higher severity score vs a lower score are 0.12 times those of placebo patients at week 0

                *********************************************************************
                *** Step 3: calculate probabilities using margins and use marginsplot
                *********************************************************************

                margins treatment, at(week=(24 36 52 64 76 90))

                // Q7 The plots are very busy; I'd like to put each impso level on a separate subgraph
                marginsplot, xdimension(at(week)) allxlabels recast(line) ciopt(lcolor(gray)) ///
                plot1opts(lcolor(green) lpattern(dash)) ///
                plot2opts(lcolor(blue) lpattern(dash)) ///
                plot3opts(lcolor(red) lpattern(dash)) ///
                plot4opts(lcolor(green) lpattern(solid)) ///
                plot5opts(lcolor(blue) lpattern(solid)) ///
                plot6opts(lcolor(red) lpattern(solid))

                *************************************************************************
                *** Step 4: plot odds ratios
                *************************************************************************

                // Q8 I see people plot odds ratios: essentially two lines, with the odds ratio on the y axis and week on the x axis,
                // one line for treatment 0 and another for treatment 1.
                // Assuming proportional odds holds, this would give a rough general idea of the treatment 0 and 1 trajectories over the weeks (0, 24, 36, etc.). I know probabilities are the correct way to approach this, but my boss wants to see the two odds-ratio lines.
                // Is there an easier way to extract these values from m1 or m2, besides running lincom for each of the 7 time points individually and constructing such a figure?

                // Q9 I am not sure whether I should use computations like Q2 (from m1) and Q4 (from m2) for such a plot, or instead Q3 and Q6

                ***************************************************************************
                *** Step 5: random-intercept and random-slope ordinal longitudinal model
                ***************************************************************************

                // I really want to use a random intercept and a random slope, but
                // I get the error message: initial values not feasible
                // r(1400);
                // Q10 How do I find better initial values?

                meologit impso i.week##i.treatment || id: week, ///
                or covariance(unstructured)
                estimate store m3


                // In order to model a random intercept and slope, I tried sqrt(week),
                // but I still get: initial values not feasible
                // r(1400);
                // How do I find better initial values?

                generate weeksqrt = sqrt(week)

                meologit impso c.weeksqrt##i.treatment || id: weeksqrt, ///
                or covariance(unstructured)
                estimate store m4




                Your advice and guidance are greatly appreciated!



                • #9
                  This is a lot of material to go through, and is probably more appropriately discussed with your advisor/supervisor than on this Forum. Having reviewed it lightly, I want to make just one important comment. It is clear from the results you show that the model using week as a continuous variable is not appropriate for your data. If it were, the coefficients of the i.week and i.week#treatment variables in the discrete week model would look like they also vary linearly with week--but clearly they do not. So I would not waste more time and effort interpreting the model with c.week. It's just a poor fit to your data. Focus on the i.week model.
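
                  If you want a formal check on that impression: the c.week specification is nested in the i.week specification, so, assuming both models converged on the same estimation sample (as your output suggests), a likelihood-ratio test of the linearity restriction can be run from the estimates you already stored:

                  Code:
                  * m1 (c.week##i.treatment) is the restricted model, m2 (i.week##i.treatment) the unrestricted one
                  lrtest m1 m2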

                  With that in mind, I suggest that, after eliminating the c.week model, you break this lengthy post down into smaller, more digestible, pieces and post them in new threads. You might start one thread that's about interpreting the odds ratios in the i.week model, another on how best to graph the -margins- results, and yet another on the initial values not feasible problem with the random-slopes model.





                  • #10
                    Thanks so much, Professor Schechter. That's a great suggestion!

