  • Difference in proportion adjusted for baseline characteristics (Mantel Haenszel method)

    Hi,

    I am analysing the results of a trial & would like to compare the proportion of individuals successfully treated between arms, adjusting for baseline differences (gender & age category). An example of my data is below.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(group outcome gender age agecat) byte t
    0 0 0 21 1 1
    0 0 0 26 1 1
    0 0 1 27 1 1
    0 1 0 18 0 1
    0 1 1 21 1 1
    1 0 1 20 0 1
    1 1 0 18 0 1
    1 1 0 30 1 1
    1 1 1 23 1 1
    1 1 1 24 1 1
    end
    label var outcome "=1 if success" 
    label var gender "=1 if female" 
    label var agecat "=1 if age >=21"
    A number of papers I have found in my field do this using Mantel Haenszel (M-H) proportion adjusted for the baseline stratum.

    I used the cs command to obtain the risk ratio applying the M-H weight using the following code, but have struggled to get the M-H adjusted risk difference.
    Code:
     cs outcome group, by(gender agecat)
    If I use a user-specified weight, I could take the M-H weights generated by the risk ratio command & apply these to the risk difference formula. However, it strikes me that if this were the appropriate method, Stata would make it easier to run. Is the internal or external weights method potentially more appropriate?
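
    For reference, the two M-H estimators are usually written as below. The notation is shorthand introduced here, not taken from the papers mentioned above: in stratum k, n_{1k} and n_{0k} are the arm sizes, n_k their sum, a_k and b_k the numbers of successes in group 1 and group 0, and \hat p_{1k} = a_k / n_{1k}, \hat p_{0k} = b_k / n_{0k}.

    ```latex
    \[
    \widehat{RD}_{MH} = \frac{\sum_k w_k \,(\hat p_{1k} - \hat p_{0k})}{\sum_k w_k},
    \qquad w_k = \frac{n_{1k}\, n_{0k}}{n_k}
    \]
    \[
    \widehat{RR}_{MH} = \frac{\sum_k a_k\, n_{0k} / n_k}{\sum_k b_k\, n_{1k} / n_k}
    \]
    ```

    Written as a weighted average of stratum-specific risk ratios, the second estimator has weights b_k n_{1k} / n_k, which are not the same as the risk difference weights w_k.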

    The code I used to generate M-H weights & run the cs risk difference command is as follows:
    Code:
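    gen mhwgt=.
    forvalues i=0/1 {
        forvalues x=0/1 {
            * 2 x 2 table of counts: rows = outcome (0/1), columns = group (0/1)
            * (A is 2 x 2 only if the stratum contains both outcomes and both arms)
            tab outcome group if agecat==`i' & gender==`x', matcell(A)
            * M-H risk ratio weight: successes in group 0 * total in group 1 / stratum total
            replace mhwgt = (A[2,1] * (A[1,2]+A[2,2])) / ( A[1,1]+A[1,2]+A[2,1]+A[2,2]) if agecat==`i' & gender==`x'
        }
    }
    cs outcome group, by(agecat gender) rd standard(mhwgt)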
    I am not very familiar with matrices, so this may not be the most straightforward way to generate the weights (if this is even an appropriate method).
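
    A sketch of one way to build the stratum weights without matrices, assuming the weight wanted is the M-H risk difference weight n1*n0/(n1+n0) rather than the risk ratio weight computed above. The variable names stratum, n1, n0 and mhrdwgt are made up for illustration. The point estimate should then be the weighted average of the stratum differences, but the confidence interval comes from the standardization formula in cs, which may not match the M-H variance estimators used in published papers.

    ```stata
    * Illustrative names; assumes no missing values in outcome, group, agecat, gender
    egen long stratum = group(agecat gender)

    * Arm sizes within each stratum
    bysort stratum: egen long n1 = total(group == 1)
    bysort stratum: egen long n0 = total(group == 0)

    * M-H risk difference weight, constant within stratum
    generate double mhrdwgt = n1 * n0 / (n1 + n0)

    * Strata containing only one arm get weight 0; check how cs treats them
    * (or drop them first) before relying on the result
    cs outcome group, by(stratum) rd standard(mhrdwgt)
    ```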

    Alternatively, I have considered using a different methodology, such as glm to fit a risk-difference model as in this post.
    Code:
    glm outcome group agecat gender, family(binomial) link(identity)
    Or a logit model followed by margins.
    Code:
    logit outcome i.group agecat gender
    margins group, pwcompare
    Any advice on how to estimate the difference in response rate using M-H proportions and adjusting for baseline strata or whether an alternative methodology may be better would be very warmly received!

    Best wishes,
    Bryony

  • #2
    -help mcc-


    • #3
      I believe that you are looking for something that the attached ADO file does. It's a little ditty that I whipped up last year as an exercise when I ran across an article that briefly surveys the use of Mantel-Haenszel weights in estimating stratified risk differences. I intended to convert the Stata code to Mata code as a further exercise, but alas my attention drifted away to other diversions.

      Output from a test DO-file is shown below, which you can refer to for the command's syntax. (Its syntax is similar to analogous commands for stratified odds ratios in official Stata.) The test file uses the dataset from the literature article where I saw the approach described.

      Also shown below is an alternative that is brought up in the same literature article. It involves a common logistic regression model, followed by a contrast that the article's authors spent some time and effort implementing in SAS, but which is utterly trivial in Stata with its margins postestimation command.

      .
      . version 15.1

      .
      . clear *

      .
      . /* Dataset from R. M. Gulick et al., Maraviroc for previously
      >    treated patients with R5 HIV-1 infection. N Engl J Med 359:1429-41, 2008
      >    as reported in M-M. Ge, L. K. Durham, R. D. Meyer, W-A. Xie and N. Thomas,
      >    Covariate-adjusted difference in proportions from clinical trials using
      >    logistic regression and weighted risk differences. Drug Info J 45:481-93,
      >    2011 */
      .
      . input byte(rna enf trt) double prp int tot

                rna       enf       trt         prp       tot
        1. 0 0 0 0.259  27
        2. 0 0 1 0.589  56
        3. 0 1 0 0.222  45
        4. 0 1 1 0.566  83
        5. 1 0 0 0.045  22
        6. 1 0 1 0.255  51
        7. 1 1 0 0.042  24
        8. 1 1 1 0.356  45
        9. end

      .
      . label define RNA 0 Low 1 High

      . label values rna RNA

      . label variable rna "Baseline RNA"

      .
      . label define NY 0 No 1 Yes

      . label values enf NY

      . label variable enf Enfuvirtide

      .
      . label define Groups 0 Control 1 Maraviroc

      . label values trt Groups

      . label variable trt "Treatment Group"

      .
      . generate byte count1 = round(prp * tot)

      . generate byte count0 = round((1 - prp) * tot)

      .
      . assert tot == count1 + count0

      . drop tot prp

      .
      . quietly reshape long count, i(rna enf trt) j(rsp)

      . label variable count Count

      . label values rsp NY

      . label variable rsp Responder

      .
      . list, noobs sepby(rna enf)

        +--------------------------------------+
        |  rna   enf         trt   rsp   count |
        |--------------------------------------|
        |  Low    No     Control    No      20 |
        |  Low    No     Control   Yes       7 |
        |  Low    No   Maraviroc    No      23 |
        |  Low    No   Maraviroc   Yes      33 |
        |--------------------------------------|
        |  Low   Yes     Control    No      35 |
        |  Low   Yes     Control   Yes      10 |
        |  Low   Yes   Maraviroc    No      36 |
        |  Low   Yes   Maraviroc   Yes      47 |
        |--------------------------------------|
        | High    No     Control    No      21 |
        | High    No     Control   Yes       1 |
        | High    No   Maraviroc    No      38 |
        | High    No   Maraviroc   Yes      13 |
        |--------------------------------------|
        | High   Yes     Control    No      23 |
        | High   Yes     Control   Yes       1 |
        | High   Yes   Maraviroc    No      29 |
        | High   Yes   Maraviroc   Yes      16 |
        +--------------------------------------+

      .
      . *
      . * Begin here
      . *
      . egen byte strata = group(rna enf)

      .
      . // "CMH" in Ge et al. Table 3
      . cmhrd rsp trt [fweight=count], strata(strata)
      RD = 0.308
      95% CI: [0.220, 0.397]
      z = 6.817
      Prob > |z| =0.0000

      . display in smcl as text "Estimate ± Standard Error: " %05.3f r(d) " ± " %05.3f sqrt(r(v))
      Estimate ± Standard Error: 0.308 ± 0.045

      .
      . // "LR-Discrete" in Ge et al. Table 3
      . logit rsp i.trt i.strata [fweight=count], nolog

      Logistic regression                             Number of obs     =        353
                                                      LR chi2(4)        =      60.15
                                                      Prob > chi2       =     0.0000
      Log likelihood = -201.10468                     Pseudo R2         =     0.1301

      ------------------------------------------------------------------------------
               rsp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               trt |
        Maraviroc  |   1.633945   .2927252     5.58   0.000     1.060215    2.207676
                   |
            strata |
                2  |  -.1234235   .3018771    -0.41   0.683    -.7150917    .4682448
                3  |  -1.528496   .3869158    -3.95   0.000    -2.286837   -.7701545
                4  |  -1.125569   .3741997    -3.01   0.003    -1.858986   -.3921507
                   |
             _cons |  -1.212743   .3188119    -3.80   0.000    -1.837603   -.5878835
      ------------------------------------------------------------------------------

      . margins , dydx(trt)

      Average marginal effects                        Number of obs     =        353
      Model VCE    : OIM

      Expression   : Pr(rsp), predict()
      dy/dx w.r.t. : 1.trt

      ------------------------------------------------------------------------------
                   |            Delta-method
                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               trt |
        Maraviroc  |    .306998   .0452609     6.78   0.000     .2182882    .3957078
      ------------------------------------------------------------------------------
      Note: dy/dx for factor levels is the discrete change from the base level.

      .
      . exit

      end of do-file


      .


      I never intended to send it up to SSC, and so the attached ADO file does not have an accompanying help file, but let me know if you have trouble with the syntax. (I recommend forgoing it in favor of the logistic regression approach, by the way.)
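
      For anyone who wants to check the point estimate without the ADO file, here is a rough sketch that computes the weighted average of stratum-specific risk differences by hand, using weights n1*n0/(n1+n0). It assumes the long-form dataset from the log above is in memory (variables strata, trt, rsp and count). The names n, p, w, wrd and the scalar num are made up for the example, and the sketch gives no standard error. By hand arithmetic on the listed counts, the result should land near the 0.308 reported by cmhrd.

      ```stata
      * Assumes the long-form data from the log are in memory
      preserve

      * One row per stratum-by-arm, with non-responder and responder counts
      collapse (sum) count, by(strata trt rsp)
      reshape wide count, i(strata trt) j(rsp)
      generate long n = count0 + count1
      generate double p = count1 / n
      drop count0 count1

      * One row per stratum: n0 p0 (Control) and n1 p1 (Maraviroc)
      reshape wide n p, i(strata) j(trt)
      generate double w = n1 * n0 / (n1 + n0)
      generate double wrd = w * (p1 - p0)

      * Weighted average of the stratum-specific risk differences
      quietly summarize wrd
      scalar num = r(sum)
      quietly summarize w
      display %5.3f (num / r(sum))

      restore
      ```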


      • #4
        This is incredibly helpful - thank you so much for such a detailed response & the ADO file. I hope to convince my team the logistic regression is preferable!
