Difference in proportion adjusted for baseline characteristics (Mantel Haenszel method)

Bryony Simmons

Join Date: Jan 2018

Posts: 37
#1

23 Apr 2019, 16:23

Hi,

I am analysing the results of a trial & would like to compare the proportion of individuals successfully treated between arms, adjusting for baseline differences (gender & age category). An example of my data is below.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(group outcome gender age agecat) byte t
0 0 0 21 1 1
0 0 0 26 1 1
0 0 1 27 1 1
0 1 0 18 0 1
0 1 1 21 1 1
1 0 1 20 0 1
1 1 0 18 0 1
1 1 0 30 1 1
1 1 1 23 1 1
1 1 1 24 1 1
end
label var outcome "=1 if success"
label var gender "=1 if female"
label var agecat "=1 if age >=21"

A number of papers I have found in my field do this using Mantel-Haenszel (M-H) proportions adjusted for the baseline strata.

I used the cs command to obtain the risk ratio applying the M-H weight using the following code, but have struggled to get the M-H adjusted risk difference.

Code:

cs outcome group, by(gender agecat)

If I use a user-specified weight, I could take the M-H weights generated by the risk ratio command & apply them to the risk difference formula. However, it strikes me that if this were the appropriate method, Stata would make it easier to run. Is the internal or external weights method potentially more appropriate?
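For reference, the Mantel-Haenszel risk difference is usually written as a weighted average of the stratum-specific differences. The notation here is introduced only for illustration: in stratum k, a_k of the n_{1k} individuals in group 1 and c_k of the n_{0k} individuals in group 0 are successes.

```latex
\[
\widehat{RD}_{MH}
  = \frac{\sum_k w_k \left( \dfrac{a_k}{n_{1k}} - \dfrac{c_k}{n_{0k}} \right)}{\sum_k w_k},
\qquad
w_k = \frac{n_{1k}\, n_{0k}}{N_k},
\qquad
N_k = n_{1k} + n_{0k}
\]
```

These weights depend only on the arm sizes within each stratum. The weights used in the M-H risk ratio, of the form c_k n_{1k} / N_k, also depend on the outcome counts, so reusing them for the difference would not give this estimator.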

The code I used to generate M-H weights & run the cs risk difference command is as follows:

Code:

gen mhwgt=.
forvalues i=0/1 {
    forvalues x=0/1 {
        tab outcome group if agecat==`i' & gender==`x', matcell(A)
        replace mhwgt = (A[2,1] * (A[1,2]+A[2,2])) / ( A[1,1]+A[1,2]+A[2,1]+A[2,2]) if agecat==`i' & gender==`x'
    }
}
cs outcome group, by(agecat gender) rd standard(mhwgt)

I am not very familiar with matrices, so this may not be the most straightforward way to generate the weights (if this is even an appropriate method).
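A possible way to avoid the matrix step is sketched below. It assumes the weight wanted for the risk difference is n1*n0/N within each stratum (arm sizes only), which differs from the risk-ratio weight computed in the loop above. The variable names n1, n0 and mhrdwgt are illustrative and do not come from the original post.

```stata
* Sketch only: Mantel-Haenszel-type weights for the risk difference.
* n1, n0 and mhrdwgt are hypothetical names chosen for this example.
bysort agecat gender: egen n1 = total(group == 1)   // size of arm 1 in the stratum
bysort agecat gender: egen n0 = total(group == 0)   // size of arm 0 in the stratum
generate double mhrdwgt = n1 * n0 / (n1 + n0)       // zero if one arm is empty in a stratum
cs outcome group, by(agecat gender) rd standard(mhrdwgt)
```

The point estimate is then a weighted average of the stratum-specific differences. The confidence interval that cs reports treats the values in mhrdwgt as fixed standardization weights, so it may not match the variance estimators used in published M-H analyses.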

Alternatively, I have considered using a different methodology, such as glm to fit a risk-difference model as in this post.

Code:

glm outcome group agecat gender, family(binomial) link(identity)
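As a rough sketch of how this model might be run in practice: with an identity link the coefficient on group is read directly as the adjusted risk difference, but binomial models with an identity link can fail to converge. The fallback shown here, a linear regression with robust standard errors, is a suggestion and not something proposed in the thread.

```stata
* Adjusted risk difference from a binomial model with identity link
glm outcome i.group i.agecat i.gender, family(binomial) link(identity)

* Possible fallback if the model above does not converge (an assumption of
* this sketch): linear probability model with robust standard errors
regress outcome i.group i.agecat i.gender, vce(robust)
```

In both cases the row for 1.group in the output is the estimated difference in proportions between arms, holding the baseline categories constant.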

Or a logit model followed by margins.

Code:

logit outcome i.group agecat gender
margins group, pwcompare
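A small variation on the same idea, offered as a sketch: declaring the covariates as factor variables and asking margins for the contrast between arms gives the adjusted difference with a delta-method standard error and confidence interval in one table.

```stata
logit outcome i.group i.agecat i.gender
margins group        // adjusted proportion of successes in each arm
margins r.group      // difference between arms relative to the base level
```

Here r. is the reference-category contrast operator; margins, dydx(group) should report the same difference for a binary factor.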

Any advice on how to estimate the difference in response rate using M-H proportions and adjusting for baseline strata or whether an alternative methodology may be better would be very warmly received!

Best wishes,
Bryony
Tags: GLM, mantel haenszel, proportions, risk difference
Clyde Schechter

Join Date: Apr 2014

Posts: 29948
#2

23 Apr 2019, 17:36

-help mcc-
Joseph Coveney

Join Date: Apr 2014

Posts: 4373
#3

24 Apr 2019, 06:18

I believe that you are looking for something that the attached ADO file does. It's a little ditty that I whipped up last year as an exercise when I ran across an article that briefly surveys the use of Mantel-Haenszel weights in estimating stratified risk differences. I intended to convert the Stata code to Mata code as a further exercise, but alas my attention drifted away to other diversions.

Output from a test DO-file is shown below, which you can refer to for the command's syntax. (Its syntax is similar to analogous commands for stratified odds ratios in official Stata.) The test file uses the dataset from the literature article where I saw the approach described.

Also shown below is an alternative that is brought up in the same literature article. It involves a common logistic regression model, followed by a contrast that the article's authors spent some time and effort implementing in SAS, but which is utterly trivial in Stata with its margins postestimation command.

.
. version 15.1

.
. clear *

.
. /* Dataset from R. M. Gulick et al., Maraviroc for previously
>    treated patients with R5 HIV-1 infection. N Engl J Med 359:1429-41, 2008
>    as reported in M-M. Ge, L. K. Durham, R. D. Meyer, W-A. Xie and N. Thomas,
>    Covariate-adjusted difference in proportions from clinical trials using
>    logistic regression and weighted risk differences. Drug Info J 45:481-93,
>    2011 */
.
. input byte(rna enf trt) double prp int tot

          rna       enf       trt         prp       tot
  1. 0 0 0 0.259  27
  2. 0 0 1 0.589  56
  3. 0 1 0 0.222  45
  4. 0 1 1 0.566  83
  5. 1 0 0 0.045  22
  6. 1 0 1 0.255  51
  7. 1 1 0 0.042  24
  8. 1 1 1 0.356  45
  9. end

.
. label define RNA 0 Low 1 High

. label values rna RNA

. label variable rna "Baseline RNA"

.
. label define NY 0 No 1 Yes

. label values enf NY

. label variable enf Enfuvirtide

.
. label define Groups 0 Control 1 Maraviroc

. label values trt Groups

. label variable trt "Treatment Group"

.
. generate byte count1 = round(prp * tot)

. generate byte count0 = round((1 - prp) * tot)

.
. assert tot == count1 + count0

. drop tot prp

.
. quietly reshape long count, i(rna enf trt) j(rsp)

. label variable count Count

. label values rsp NY

. label variable rsp Responder

.
. list, noobs sepby(rna enf)

  +--------------------------------------+
  |  rna   enf         trt   rsp   count |
  |--------------------------------------|
  |  Low    No     Control    No      20 |
  |  Low    No     Control   Yes       7 |
  |  Low    No   Maraviroc    No      23 |
  |  Low    No   Maraviroc   Yes      33 |
  |--------------------------------------|
  |  Low   Yes     Control    No      35 |
  |  Low   Yes     Control   Yes      10 |
  |  Low   Yes   Maraviroc    No      36 |
  |  Low   Yes   Maraviroc   Yes      47 |
  |--------------------------------------|
  | High    No     Control    No      21 |
  | High    No     Control   Yes       1 |
  | High    No   Maraviroc    No      38 |
  | High    No   Maraviroc   Yes      13 |
  |--------------------------------------|
  | High   Yes     Control    No      23 |
  | High   Yes     Control   Yes       1 |
  | High   Yes   Maraviroc    No      29 |
  | High   Yes   Maraviroc   Yes      16 |
  +--------------------------------------+

.
. *
. * Begin here
. *
. egen byte strata = group(rna enf)

.
. // "CMH" in Ge et al. Table 3
. cmhrd rsp trt [fweight=count], strata(strata)
RD = 0.308
95% CI: [0.220, 0.397]
z = 6.817
Prob > |z| =0.0000

. display in smcl as text "Estimate ± Standard Error: " %05.3f r(d) " ± " %05.3f sqrt(r(v))
Estimate ± Standard Error: 0.308 ± 0.045

.
. // "LR-Discrete" in Ge et al. Table 3
. logit rsp i.trt i.strata [fweight=count], nolog

Logistic regression                             Number of obs     =        353
                                                LR chi2(4)        =      60.15
                                                Prob > chi2       =     0.0000
Log likelihood = -201.10468                     Pseudo R2         =     0.1301

------------------------------------------------------------------------------
         rsp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trt |
  Maraviroc  |   1.633945   .2927252     5.58   0.000     1.060215    2.207676
             |
      strata |
          2  |  -.1234235   .3018771    -0.41   0.683    -.7150917    .4682448
          3  |  -1.528496   .3869158    -3.95   0.000    -2.286837   -.7701545
          4  |  -1.125569   .3741997    -3.01   0.003    -1.858986   -.3921507
             |
       _cons |  -1.212743   .3188119    -3.80   0.000    -1.837603   -.5878835
------------------------------------------------------------------------------

. margins , dydx(trt)

Average marginal effects                        Number of obs     =        353
Model VCE    : OIM

Expression   : Pr(rsp), predict()
dy/dx w.r.t. : 1.trt

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trt |
  Maraviroc  |    .306998   .0452609     6.78   0.000     .2182882    .3957078
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

.
. exit

end of do-file

.

I never intended to send it up to SSC, and so the attached ADO file does not have an accompanying help file, but let me know if you have trouble with the syntax. (I recommend forgoing it in favor of the logistic regression approach, by the way.)
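Because the ADO file itself is not reproduced in the thread, here is a minimal sketch of how a Mantel-Haenszel-weighted risk difference could be computed by hand on the test dataset above. It is not the contents of cmhrd.ado. It assumes weights of n1*n0/N per stratum and a variance that treats those weights as fixed; the variable names ev, n, w, d, v, sw, swd and sv are made up for this example.

```stata
* Hand computation on the long-form test data: rsp (0/1), trt (0/1), strata, count
preserve
generate long ev = count * (rsp == 1)          // responders in each cell
collapse (sum) ev n = count, by(strata trt)    // responders and totals per stratum-by-arm
reshape wide ev n, i(strata) j(trt)            // ev0 n0 = control, ev1 n1 = maraviroc
generate double w = n1 * n0 / (n1 + n0)        // assumed M-H weight for the difference
generate double d = ev1 / n1 - ev0 / n0        // stratum-specific risk difference
* binomial variance of w*d with w treated as fixed (an assumption of this sketch)
generate double v = w^2 * (ev1 * (n1 - ev1) / n1^3 + ev0 * (n0 - ev0) / n0^3)
egen double sw  = total(w)
egen double swd = total(w * d)
egen double sv  = total(v)
display "RD = " %5.3f (swd[1] / sw[1]) "   SE = " %5.3f (sqrt(sv[1]) / sw[1])
restore
```

Worked through by hand, this gives a difference of roughly 0.308 with a standard error near 0.045, in line with the output above. That agreement does not establish that cmhrd uses the same variance formula, since other estimators (Sato's, for example) can give similar values on data like these.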
Attached Files

cmhrd.ado (3.5 KB, 1 view)
Bryony Simmons

Join Date: Jan 2018

Posts: 37
#4

24 Apr 2019, 09:35

This is incredibly helpful - thank you so much for such a detailed response & the ADO file. I hope to convince my team that the logistic regression is preferable!