Graphing Odds Ratios Across Time

Ryan LeCount

Join Date: May 2015

Posts: 23
#1

Graphing Odds Ratios Across Time

25 Jun 2017, 22:46

Hello All and Thanks In Advance.

In spite of the fact that I gather that this is a relatively common practice, I've been unable to sort out the syntax for this procedure. Basically, I'd like to graphically represent change in odds ratios over time on a single graph for multiple DVs.

More specifically, I'm wondering, for example, how do various measures of happiness (binary variables) associate with a specific predictor of happiness (x1)- net of other relevant factors- over time. For what it is worth, I have ~ 20 years in the data set. Ideally, the odds ratios displayed would have notation indicating P values as well.

In this case, we might use three logit models:

logit happy1 x1 (key indicator) x2 x3 x4, or
logit happy2 x1 (key indicator) x2 x3 x4, or
logit happy3 x1 (key indicator) x2 x3 x4, or

So, I want to know what the odds are of a unit change in each (happy1 happy2 happy3) predicting a unit change in x1 (x1 and all happy DVs are binary), net of other IVs, over that 20 year span- with three lines (one for each DV) on the same graph.

Apologies if the above is less than clear- but I do very much appreciate any help sorting this out.
Tags: None

Maarten Buis

Join Date: Mar 2014
Posts: 3426

26 Jun 2017, 01:41

Code:

sysuse nlsw88, clear
logit married i.collgrad##c.age i.south i.race, or
predictnl ormarried = exp(_b[1.collgrad] + _b[1.collgrad#c.age]*age),        ///
    ci(lb_m ub_m)

logit never_married i.collgrad##c.age i.south i.race, or
predictnl ornever_married = exp(_b[1.collgrad] + _b[1.collgrad#c.age]*age) , ///
    ci(lb_n ub_n)

twoway rarea lb_m ub_m age , sort ||       ///
       rarea lb_n ub_n age , sort ||       ///
       line ormarried ornever_married age, ///
            sort legend(off) yline(1)

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------

Comment

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#3

26 Jun 2017, 03:57

In general, when we select a logistic regression model, we want to get the effect size overall, not exactly across time. For example, mortality within 30 days after hospitalization.

Considering you wish to have ORs across time, I'm just wondering why not selecting a survival analysis model, where you'd get the HRs.

There, you surely get a "well-oiled machinery" to dovetail effect size across time.

Best regards,

Marcos
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1119
#4

26 Jun 2017, 09:23

Originally posted by Ryan LeCount View Post

More specifically, I'm wondering, for example, how do various measures of happiness (binary variables) associate with a specific predictor of happiness (x1)- net of other relevant factors- over time. For what it is worth, I have ~ 20 years in the data set. Ideally, the odds ratios displayed would have notation indicating P values as well.

In this case, we might use three logit models:

logit happy1 x1 (key indicator) x2 x3 x4, or
logit happy2 x1 (key indicator) x2 x3 x4, or
logit happy3 x1 (key indicator) x2 x3 x4, or

Hi Ryan. If I follow, happy1-happy3 are three distinct dichotomous measures of happiness, and the three logit models you show above are all estimated using the same annual sample of individuals. Is that right?

What's not clear to me is whether you have a completely new (and presumably independent) set of observations for each year, or whether some individuals appear in more than one of those annual samples. I think you need to clarify that before anyone can offer good advice about how to compute and plot the ORs.

HTH.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)
Comment
Ryan LeCount

Join Date: May 2015

Posts: 23
#5

26 Jun 2017, 11:27

Thanks all for your input and feedback and apologies for not having been clearer. I'll try to address the issues raised:

...the three logit models you show above are all estimated using the same annual sample of individuals. Is that right?

What's not clear to me is whether you have a completely new (and presumably independent) set of observations for each year, or whether some individuals appear in more than one of those annual samples. I think you need to clarify that before anyone can offer good advice about how to compute and plot the ORs.

These data are cross-sectional with new respondents for each year. The data are rigorously sampled and very much representative, so the objective is to make sense of how strongly these measures are associated with my key IV across time, net of other factors.

Considering you wish to have ORs across time, I'm just wondering why not selecting a survival analysis model, where you'd get the HRs.

Thanks for this suggestion. To be honest, I am not familiar with this technique, but my sense is that it is used to evaluate the duration/scope of a single event/treatment on a stable sample across time. My data are different and so is my question- these are all cross-sectionally derived attitudinal/social contextual variables and year to year variation is often directionally variable. I'm just interested in how each of these different measures - over the course of the whole period- are associated with my key IV net of all other factors.

Again thanks very much and I'd be happy to clarify further if it would be helpful. I'm deeply grateful for each of you who takes the time to address these questions.

Last edited by Ryan LeCount; 26 Jun 2017, 11:33.
Comment

Bruce Weaver

Join Date: May 2014
Posts: 1119

27 Jun 2017, 08:59

Hi Ryan. Getting a plot of ln(OR) with 95% CI across the years is quite straightforward via margins and marginsplot. E.g., for variable happy1, this should do it:

Code:

* Let variable yr = Year, and treat it as categorical
logit happy1 x1##yr x2 x3 x4, or
* Use r. contrast operator to get ln(OR) for x1.
margins r.x1@yr, vsquish predict(xb) contrast(nowald effects)
marginsplot

I think that in order to get a plot of the ORs with 95% CIs using this approach, you would have to write the contrast results from margins to a tempfile and compute the ORs and CIs yourself. The following is based on an example Clyde Schechter posted here. It also borrows from post #2 in this thread (by Maarten Buis).

Code:

* Create some fake data to illustrate
clear
set obs 1000
generate yr = mod(_n,20)
replace yr = 20 if yr==0
*sort yr
generate id = yr*1000+_n
generate happy1 = rbinomial(1,.3)
generate x1 = rbinomial(1,.7) // Ryan called it an "indicator"
generate x2 = rnormal(50, 10)
generate x3 = rnormal(50, 10)
generate x4 = rnormal(50, 10)
preserve // Preserve the original data set

*** The actual code begins here ***
logit happy1 x1##yr x2 x3 x4, or
margins yr#x1, vsquish predict(xb) // Get log-odds
marginsplot
graph export "C:\Temp\Happy1_logOdds.png", as(png) replace

* Use r. contrast operator to get ln(OR) for x1.
* Save -margins- table to a temporary file, and then
* compute and list the ORs with 95% CIs.

tempfile temp1
margins r.x1@yr, vsquish predict(xb) contrast(nowald effects) saving(`temp1')
marginsplot
graph export "C:\Temp\Happy1_logOR.png", ///
as(png) replace

use `temp1', clear  // Use the temp file
generate OR = exp(_margin)
generate OR_lb = exp(_ci_lb)
generate OR_ub = exp(_ci_ub)
generate Year = _m2 + 1994 // Replace 1994 with appropriate value
* Table shown below shows ORs with 95% CIs.
list Year _margin _ci_lb _ci_ub OR OR_lb OR_ub

twoway rcap OR_lb OR_ub Year , sort ||       ///
       connected OR Year, ///
            sort legend(off) yline(1)

restore // revert to original data set

I hope this helps.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)

Comment

Bruce Weaver

Join Date: May 2014
Posts: 1119

06 Jul 2017, 13:50

I was just reviewing some things in Michael Mitchell's nice book Interpreting and Visualizing Regression Models Using Stata, and was reminded of another way you could tackle this problem. Mitchell's approach does not require a temporary file--instead, he uses matrix commands to get access to the r() matrices generated by margins. Here is a variation on the approach he shows in his book.

Code:

* Create some fake data to illustrate
clear
set obs 1000
generate yr = mod(_n,20)
replace yr = 20 if yr==0
*sort yr
generate id = yr*1000+_n
generate happy1 = rbinomial(1,.3)
generate x1 = rbinomial(1,.7) // Ryan called it an "indicator"
generate x2 = rnormal(50, 10)
generate x3 = rnormal(50, 10)
generate x4 = rnormal(50, 10)
* End of fake data generation

* The real code begins here
logit happy1 x1##yr x2 x3 x4, or
margins r.x1@yr, vsquish predict(xb) contrast(nowald effects)

return list // Comment out or remove this line if you like
matrix table1 = r(table) // Transpose of original margins table
matrix table2 = table1'  // Transpose to get desired layout
matrix list table2 // Comment out or remove this line if you wish
* Now extract columnns 1, 5 & 6 from that table (B, Lower, Upper)
matrix table3 = ///
table2[1..rowsof(table2), 1], ///
table2[1..rowsof(table2), 5..6]
matrix list table3 // This shows B with 95% CI -- remove if you like
* Use mata to exponentiate the table3 macro to get ORs with CIs
mata : st_matrix("ORs", exp(st_matrix("table3")))
matrix colnames ORs = "OR" "Lower" "Upper"
matrix list ORs  // list the matrix of ORs with CIs

svmat ORs // Save ORs matrix to working dataset
rename (ORs1 ORs2 ORs3) (OR Lower Upper)
generate Year = _n + 1994 if !missing(OR) // change 1994 to correct value
list Year OR Lower Upper if !missing(Year)

twoway rcap Lower Upper Year || connected OR Year,  ///
       sort legend(off) yline(1) ytitle(Odds Ratio & 95% CI) ///
       xlabel(1995(5)2015) title(Odds Ratios Over Time)

HTH.

Attached Files

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 18.5 (Windows)

Comment

Ryan LeCount

Join Date: May 2015

Posts: 23
#8

08 Jul 2017, 00:37

Many, many thanks for taking the time to lay all of this out. This is extraordinarily helpful!
Comment
Daniel Bogale Odo

Join Date: Sep 2020

Posts: 5
#9

20 Jan 2022, 21:26

Hi all,
I wanted to plot the odds ratio from logistic regression model (on the y-axis) over the continuous predictor variable (on the x-axis). Or, odds ratio of health outcome (presence/absence) over the level of air pollution (main predictor var). My data is cross-sectional, and I attached the types of graphs I want to make. I tried Marginsplot but that gives me the probability of the event rather than the odds ratio on the y-axis. My question could be elementary, but I have been struggling for more than 2 weeks. Examples of the graph I wanted to make found in this article (Fig 4 & 5): https://academic.oup.com/ije/article/48/4/1125/5487745
I appreciate any advice in advance.
Comment

Maarten Buis

Join Date: Mar 2014
Posts: 3426

#10

21 Jan 2022, 01:39

It looks like you are mixing up the odds and the odds ratio. The odds is a measure of how likely an event is; it is the expected number of successes per failure. The odds ratio is a measure of how this likelihood of an event differs across groups; it is the ratio of two odds. I am pretty sure you want the odds and not the odds ratio, so that is what I will give you. The graphs in the article you linked to look wrong to me, e.g. the curve starting at exactly 1 looks very suspicious to me.

Code:

// select the graph scheme
set scheme s1mono

// open example data
sysuse nlsw88, clear
// prepare the data

gen byte occat = cond(occupation < 3, 1,                    ///
                 cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                 if !missing(occupation)
label variable occat "occupational category"
label define occat 1 "white collar" ///
                   2 "skilled"      ///
                   3 "unskilled"
label value occat occat
gen byte urban = c_city + smsa
label define urban 2 "central city" ///
                   1 "suburban"     ///
                   0 "rural"
label value urban urban
label variable urban "urbanicity"

// estimate the model
logit union grade i.race i.south i.urban i.occat, or base

// get the predicted odds
margins, at(race=1 south=0 urban=1 occat=2 grade=(0/18)) expression(exp(xb()))

// make the graph
marginsplot, recastci(rarea) ciopts(astyle(ci)) plotopts(msymbol(i)) ///
    ytitle(predicted odds) xlab(0(5)15) ///
    title("The predicted odds of being in a union") ///
    subtitle("for white women, not from the south," ///
             "who live in a suburban area, and do skilled labor")

Click image for larger version

Name: Graph.png
Views: 1
Size: 71.1 KB
ID: 1646111

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------

Comment

Daniel Bogale Odo

Join Date: Sep 2020

Posts: 5
#11

23 Jan 2022, 17:15

Many thanks, Maarten Buis for the explanation with a very relevant example. This is very helpful!
Comment

Announcement