Inverse odds and inverse odds ratio weighting for causal mediation analysis

Kristoffer Johansen

Join Date: Nov 2022

Posts: 1
#1


16 Nov 2022, 03:14

Hello all.

I am currently working on a paper for my PhD and am attempting a causal mediation analysis using inverse odds ratio weighting (IORW) and inverse odds weighting (IOW), based on the work of Tchetgen Tchetgen (2013), "Inverse odds ratio-weighted estimation for causal mediation analysis", and following the guidance of Nguyen et al. (2015), "Practical guidance for conducting mediation analysis with multiple mediators using inverse odds ratio weighting". In short, the procedure fits a regression model (any type of GLM or time-to-event model can be applied) for the exposure on the mediators and covariates (i.e. X = M1, M2, Z1, Z2, etc.) and from it calculates either the inverse odds or the inverse odds ratio. These are then used as weights in subsequent regression models for the outcome to compute the total, direct and indirect effects; standard errors and confidence intervals are obtained by bootstrapping. At least this is how I have interpreted the procedure.
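For the binary-exposure case covered in the Nguyen et al. example, the weights described above can be written out explicitly. This is only a sketch, assuming a logistic model for the exposure X given mediators M and covariates Z; the symbols \beta_0, \beta_M, \beta_Z, \hat{p}, w and \theta are introduced here for illustration and do not appear in the post.

```latex
% Exposure model (binary X), assumed logistic:
\operatorname{logit} P(X = 1 \mid M, Z) = \beta_0 + \beta_M^{\top} M + \beta_Z^{\top} Z

% Inverse odds ratio weight (IORW): only the mediator part of the linear predictor
w^{\mathrm{IORW}} =
\begin{cases}
  \exp\!\left(-\hat{\beta}_M^{\top} M\right) & \text{if } X = 1 \\
  1 & \text{if } X = 0
\end{cases}

% Inverse odds weight (IOW): the full linear predictor, with \hat{p} = \hat{P}(X = 1 \mid M, Z)
w^{\mathrm{IOW}} =
\begin{cases}
  \dfrac{1 - \hat{p}}{\hat{p}}
    = \exp\!\left(-\left(\hat{\beta}_0 + \hat{\beta}_M^{\top} M + \hat{\beta}_Z^{\top} Z\right)\right) & \text{if } X = 1 \\
  1 & \text{if } X = 0
\end{cases}

% Effects on the coefficient scale of the outcome model:
% \hat{\theta}_{\mathrm{TE}}  from the unweighted outcome regression on X and Z,
% \hat{\theta}_{\mathrm{NDE}} from the same regression weighted by w,
\hat{\theta}_{\mathrm{NIE}} = \hat{\theta}_{\mathrm{TE}} - \hat{\theta}_{\mathrm{NDE}}
```

Under these assumptions the two weights differ only by the factor \exp(-(\hat{\beta}_0 + \hat{\beta}_Z^{\top} Z)), which does not involve the mediators.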

The problem I have encountered is that my exposure of interest is a polytomous variable with 4 levels, whereas the example in the paper by Nguyen et al. is based on a binary exposure, so there the inverse odds and inverse odds ratio weights are calculated using logistic regression. I therefore need to modify the code to use polytomous (multinomial) logistic regression to compute the weights, which the paper explicitly states is doable. After several weeks of frustration I was able to modify the code to get output for all exposure levels of interest. However, the inverse odds weighting procedure gives a wildly different result from the inverse odds ratio weighting procedure. As my statistical background is quite light and my programming experience is even lighter, I have not been able to work out which procedure is correct (if either), so I thought I would post my code on this forum to see if anyone can spot potential mistakes. I have made a reproducible example for the two procedures, given below:

***************sample dataset**************
webuse cancer

***************IOW***************

****Step 1 mediation - Prepare data (start program)****
capture program drop IOW
program IOW, rclass
capture drop predprob1 predprob2 predprob3 inverseodds2 inverseodds3 wt_iow

****Step 2 - Regress exposure on mediators and covariates****
mlogit drug c.age, base(1)

****Step 3 - Extract predicted probabilities for all 3 exposure levels****
predict predprob1 predprob2 predprob3, pr

****Step 4 - Calculate the inverse of the odds for categories 2 and 3****
gen inverseodds2 = ((1-predprob2)/predprob2)
gen inverseodds3 = ((1-predprob3)/predprob3)

****Step 5 - Generate the weights, first set value of reference group to 1, for the other groups: Use inverse of the odds for each respective exposure level****
gen wt_iow = 1 if drug==1
replace wt_iow = inverseodds2 if drug==2
replace wt_iow = inverseodds3 if drug==3

****Step 6 - estimate Total effect***
stset studytime, failure(died) scale(365.25)
stcox i.drug c.age

****Step 7 - Calculate the total effect: the matrix command stores the coefficient vector e(b); the scalar commands extract the coefficient for each non-reference exposure level (for example, [1,2] is row 1, column 2 of e(b), i.e. the coefficient for drug level 2); return scalar stores the value in r() so that bootstrap can collect it****
matrix bb_total = e(b)
scalar b_total1 =bb_total[1,2]
return scalar b_total1=bb_total[1,2]
scalar b_total2 =bb_total[1,3]
return scalar b_total2=bb_total[1,3]

****Step 8 - Estimate the natural direct effect and natural indirect effect: the weights (wt_iow) are used in the outcome regression model; for survival analysis, the weight needs to be entered with the stset command****
stset studytime [pweight=wt_iow], failure(died) scale(365.25)
stcox i.drug c.age

matrix bb_direct = e(b)
scalar b_direct1 =bb_direct[1,2]
return scalar b_direct1 = bb_direct[1,2]
return scalar b_indirect1 = b_total1-b_direct1
scalar b_direct2 =bb_direct[1,3]
return scalar b_direct2 = bb_direct[1,3]
return scalar b_indirect2 = b_total2-b_direct2

end

****Step 9 - Estimate standard errors: Bootstrapping is used to calculate SE and subsequent confidence intervals --> 1000 resamplings is standard****
bootstrap r(b_indirect1) r(b_direct1) r(b_total1) r(b_indirect2) r(b_direct2) r(b_total2), seed (32222) reps(10):IOW

****Step 10 - Generate output (eform exponentiates coefficients)****
estat bootstrap, all eform

drop predprob1 predprob2 predprob3 inverseodds2 inverseodds3 wt_iow _st _d _t _t0

clear

***************IORW***************
webuse cancer

****Step 1 mediation - Prepare data (start program)****
capture program drop IORW
program IORW, rclass
capture drop inverseoddsratio2 inverseoddsratio3 wt_iorw

****Step 2 - Regress exposure on mediators and covariates****
mlogit drug c.age, base(1)

*****Step 3 - Generate inverse odds ratios based on output from mlogit********************
gen inverseoddsratio2 = 1/(exp((0.0276505*age)))
gen inverseoddsratio3 = 1/(exp((-0.0498276*age)))

****Step 4 - Generate the weights: set the weight of the reference group to 1; for the other groups, use the inverse odds ratio for the respective exposure level****
gen wt_iorw = 1 if drug==1
replace wt_iorw = inverseoddsratio2 if drug==2
replace wt_iorw = inverseoddsratio3 if drug==3

****Step 5 - estimate Total effect***
stset studytime, failure(died) scale(365.25)
stcox i.drug c.age

****Step 6 - Calculate the total effect: the matrix command stores the coefficient vector e(b); the scalar commands extract the coefficient for each non-reference exposure level (for example, [1,2] is row 1, column 2 of e(b), i.e. the coefficient for drug level 2); return scalar stores the value in r() so that bootstrap can collect it****
matrix bb_total = e(b)
scalar b_total1 =bb_total[1,2]
return scalar b_total1=bb_total[1,2]
scalar b_total2 =bb_total[1,3]
return scalar b_total2=bb_total[1,3]

****Step 7 - Estimate the natural direct effect and natural indirect effect: the weights (wt_iorw) are used in the outcome regression model; for survival analysis, the weight needs to be entered with the stset command****
stset studytime [pweight=wt_iorw], failure(died) scale(365.25)
stcox i.drug c.age

matrix bb_direct = e(b)
scalar b_direct1 =bb_direct[1,2]
return scalar b_direct1 = bb_direct[1,2]
return scalar b_indirect1 = b_total1-b_direct1
scalar b_direct2 =bb_direct[1,3]
return scalar b_direct2 = bb_direct[1,3]
return scalar b_indirect2 = b_total2-b_direct2

end

****Step 8 - Estimate standard errors: Bootstrapping is used to calculate SE and subsequent confidence intervals --> 1000 resamplings is standard****
bootstrap r(b_indirect1) r(b_direct1) r(b_total1) r(b_indirect2) r(b_direct2) r(b_total2), seed (32222) reps(10):IORW

****Step 9 - Generate output (eform exponentiates coefficients)****
estat bootstrap, all eform

drop inverseoddsratio2 inverseoddsratio3 wt_iorw _st _d _t _t0
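One place where the two programs could diverge is in how the weights are built for a three-level exposure. The sketch below is not a verified answer. It assumes that the inverse odds should be taken relative to the base category (P(drug=1)/P(drug=k)) rather than against all other categories combined. It also reads the mlogit coefficients from _b[] instead of typing them in, so that they would be re-estimated in every bootstrap replicate if the same lines were placed inside the program. The variable names p1, p2, p3, wt_iow_ref and wt_iorw_b are hypothetical, and age plays the role of the mediator only because it does so in the example above.

```stata
* Sketch only: weights defined relative to the base category (drug == 1)
webuse cancer, clear

* Exposure model, as in Step 2 of both programs
mlogit drug c.age, baseoutcome(1)

* Inverse odds relative to the base category: P(drug=1 | age) / P(drug=k | age)
predict double p1 p2 p3, pr
generate double wt_iow_ref = 1 if drug == 1
replace wt_iow_ref = p1/p2 if drug == 2
replace wt_iow_ref = p1/p3 if drug == 3

* Inverse odds ratio: mediator coefficient taken from the fitted model.
* Equations are referenced by position (#2, #3); this assumes the base
* outcome occupies the first equation of e(b). Check with: matrix list e(b)
generate double wt_iorw_b = 1 if drug == 1
replace wt_iorw_b = exp(-_b[#2:age]*age) if drug == 2
replace wt_iorw_b = exp(-_b[#3:age]*age) if drug == 3

* With age as the only predictor, the two weights should differ within each
* exposure level only by the constant factor exp(-_cons) of that equation
assert reldif(wt_iow_ref, exp(-_b[#2:_cons])*wt_iorw_b) < 1e-6 if drug == 2
assert reldif(wt_iow_ref, exp(-_b[#3:_cons])*wt_iorw_b) < 1e-6 if drug == 3

summarize wt_iow_ref wt_iorw_b
```

If the assertions hold, any remaining gap between the two sets of results would come from that level-specific constant (and, with covariates in the exposure model, from the covariate terms) rather than from the mediator part of the weight.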

Hopefully someone is able to provide insight into the matter at hand. Thank you all in advance.
Kind regards.
Kristoffer Johansen
Tags: None
Darina Peycheva

Join Date: Jul 2022

Posts: 20
#2

18 Mar 2023, 05:52

Hi Kristoffer,
Just wondered if you used multiple imputation with this approach. I'm using it and have an issue with running survival analysis using a weight that is affected by missingness (also discussed here: Survival analysis with inverse probability weighting after multiple imputation - Statalist)

Thanks,
Darina