
  • Propensity Score Matching prior to survival analysis in a cohort with an uncommon treatment and rare outcomes

    I am seeking advice on which propensity score matching technique to use when preparing a matched dataset to compare survival in a setting where failure events are rare. I am working with data from a cancer registry and comparing cancer-specific survival between two treatment types (hormone vs. surgery) using a Cox proportional hazards model.

    In my unmatched dataset, the groups have unbalanced covariate distributions for age, year of diagnosis, marital status and tumour grade. Cohort size is n=164 (2.5%) in the hormone group and n=6178 (97.5%) in the surgery group. There are 5 failure events in the hormone group and 46 failure events in the surgery group. Cancer-specific death is a very rare outcome.

    To adjust for the selection bias in treatment allocation to hormone therapy, I would like to use propensity score matching. I have tried doing this using 'psmatch2' with various matching options. Ideally, I would like to use a fixed matching technique so that each treated patient has a set number of controls and the balance of characteristics in the unmatched and matched cohorts can be displayed clearly in a "table 1" format along with a statistical test of difference (t-test or chi-square as appropriate).
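
    (For the balance display, what I have in mind is something along the lines of pstest, which ships with the psmatch2 package and reports covariate means with t-tests for the unmatched and matched samples; the covariates below are the ones from my matching model, run after one of the attempts that follow.)

    code: pstest age year gr race, both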

    Please see below for my attempts to date. A big thanks in advance to those who have insight on how to approach this puzzle!


    ---Attempt #1: 1-to-1 nearest neighbour matching ---------------------
    code: psmatch2 tx age year gr race, neighbor(1) noreplacement

    1:1 nearest neighbour matching creates a well-balanced cohort, but there are no failure events in the matched comparison (surgery) group, so I cannot use the matched cohort to model survival probabilities or hazard ratios.
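
    (A cross-tabulation along these lines shows the problem; "event" stands in for my cancer-specific death indicator, and _weight is the matching weight created by psmatch2.)

    code: tab event tx if _weight < .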


    --- Attempt #2: 1-to-n nearest neighbour matching with a caliper -------------------
    code: psmatch2 tx age year gr race, neighbor(5) caliper(0.01)

    1-to-many nearest neighbour matching is the ideal scenario: I would like at least 5 comparison controls per treated patient and to use all of my treated group. However, I need to go to 5 controls per treated patient just to have at least 1 failure event in the control group, and at that point the covariates are no longer well-balanced between groups.


    --- Attempt #3: Radius matching ----------------------------------
    code: psmatch2 tx age year gr race, radius caliper(0.001)

    Radius matching uses nearly the full sample of the comparison group, so it isn't appealing: it loses transparency because I cannot neatly display the balance of covariates across the unmatched and matched cohorts for my Table 1.



  • #2
    As far as I can see, propensity score matching of any kind is not a workable option. I recommend that you abandon the matching design and instead do a case-control study nested in the main study (Google "nested case-control study" or "risk set sampling"; Langholz and Clayton, 1994; Langholz and Goldstein, 1996). This will allow you to utilize every failure. The follow-up analysis will be conditional logistic regression. You can use the propensity score in that analysis as a covariate or in the form of IPW weights.

    To do the risk-set sampling, I would ordinarily recommend the Stata command sttocc ("survival-time data to case-control data"). The "cases" are failures; the "controls" are non-failures. However, sttocc will make things worse here. In a Cox analysis, a risk set contributes statistical information in proportion to the variation in the predictors among its members. Since surgical patients are 97% of your sample, they will constitute about 97% of the controls. In most risk sets, all controls will be surgical patients. When the failure is also a surgical patient, the risk set will drop out of the analysis.

    The solution is to form the risk sets by what is known as counter-matching (see the references). In the simplest case of 1-1 sampling, if the failure is in one group, the non-failure is randomly selected from the other, hence the name. This strategy ensures maximal variability between treatments in each risk set. You correct for the biased sampling with suitable weights.

    Below is a do file for counter-matching with an arbitrary number of non-failures per risk set. One point to note: if m controls opposite in group to the failure are sampled, then m-1 controls in the same group as the failure are added, so that there are m subjects from each group.

    The do file will produce a data set with the sampled risk sets for the conditional logistic regression. It will also create the probability weight needed for the analysis.

    Some other points:

    1. The do file breaks tied failure times at random.

    2. stsplit is used to form the risk sets for sampling, and the resulting data set will be huge. In your case, n = 6342 and you have 51 risk sets, so the first will have about 6300 members, the second about 6300, and so on, for a possible total of more than 300,000 observations. (In the example below, with 337 patients and 80 failure times, the risk set population is over 14,000 observations.) Before splitting, therefore, you should keep only the variables needed to form the risk sets and compress the data. After sampling, merge the other information back in (see the sketch after this list).

    3. You can use the propensity score as a covariate in the conditional logistic regression model. Or you can form inverse-probability-of-treatment weights for an ATT or an ATE analysis. Multiply the IPW weights by the counter-matching weights to get the final weights for the logistic model. See the TEFFECTS manual for more information.
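
    A sketch of the merge-back step mentioned in point 2, assuming the propensity score and other covariates sit in a separate file keyed on id (covars.dta is just a placeholder name):

    Code:
    use cm01, clear
    merge m:1 id using covars, keep(master match) nogenerate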


    References:

    Langholz, Bryan, and David Clayton. 1994. Sampling strategies in nested case-control studies. Environmental Health Perspectives 102 (Suppl 8): 47.
    http://www.ncbi.nlm.nih.gov/pmc/arti...00404-0049.pdf


    Langholz, Bryan, and Larry Goldstein. 1996. Risk set sampling in epidemiologic cohort studies. Statistical Science 11(1): 35-53.
    http://projecteuclid.org/euclid.ss/1032209663

    Code:
    ***********************************
    * Counter-matched risk set sampling
    ***********************************
    
    * Set number of non-failures per risk set
    local m = 1
    
    webuse diet, clear
    * Get 0-1 treatment group
    recode job (2 = 1), gen(group)   // 0-1 treatment group
    gen fevent = fail > 0 & fail<.
    
    * stset the data
    stset dox, failure(fevent)  ///
    enter(time doe) id(id) origin(time dob) scale(365.25)
    
    * Break Tied failure times randomly
    bys _t: gen dup = _n if _d
    replace _t = _t + runiform()/100 if dup > 1 & dup < .  // dup is missing for censored records
    drop dup
    
    * Reduce data set size
    keep id group _st _d _origin _t  _t0
    
    * Generate risk set identifier
    stsplit, at(failures) riskset(riskset)
    drop if riskset==.   // alive at end
    
    * Get job totals: needed for weights
    egen n0 = total(group==0), by(riskset)
    egen n1 = total(group==1), by(riskset)
    
    * Identify case's group
    egen casegrp = total(group*_d), by(riskset)
    compress
    tempfile t0 t1 t2 t3
    save `t0'
    
    * separate cases
    keep if _d == 1
    gen cmweight = 1
    save `t1'
    
    /* Sample Non-failures */
    
    use `t0', clear
    
    /* countermatched non-case group */
    keep if _d==0 & group!=casegrp
    sample `m'  , count by(riskset)
    gen cmweight = cond(group==0, n0/`m' , n1/`m')
    save `t2'
    
    /* Select m-1 non-failures with same
    group as case */
    if `m'>1 {
        use `t0', clear
        keep if _d==0 & group == casegrp
        local mm = `m' -1
        sample `mm', count by(riskset)
        gen cmweight = cond(group==0, (n0-1)/(`m'-1), (n1-1)/(`m'-1))
        save `t3'
    }
    
    * Final data
    use `t1', clear
    append using `t2'
    if `m'>1 {
        append using `t3'
    }
    save cm01, replace
    Code:
    tab _d group
    
          1 if |
    failure; 0 |     RECODE of job
            if |     (Occupation)
      censored |         0          1 |     Total
    -----------+----------------------+----------
             0 |        57         23 |        80
             1 |        23         57 |        80
    -----------+----------------------+----------
         Total |        80         80 |       160
    Last edited by Steve Samuels; 06 Feb 2016, 23:13.
    Steve Samuels
    Statistical Consulting
    [email protected]

    Stata 14.2



    • #3
      Correction: I was incorrect in suggesting that counter-matching weights be used as probability weights for conditional logistic regression. Probability weights in conditional logistic regression must be constant within a risk set, which the counter-matching weights are not. The Langholz articles recommend that the weights be used as offsets in the conditional logistic model (option offset() in clogit).
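
      A minimal sketch of that analysis, using the counter-matched data set cm01 from the do file above, the propensity score as a covariate (ps is a placeholder name), and the log of the counter-matching weight as the offset (as I read the Langholz papers, it is the log of the weight that enters the linear predictor):

      Code:
      use cm01, clear
      * merge in the propensity score and other covariates by id (not shown)
      gen ln_cmw = ln(cmweight)
      clogit _d i.group ps, group(riskset) offset(ln_cmw)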
      Steve Samuels
      Statistical Consulting
      [email protected]

      Stata 14.2



      • #4
        Let's step back for a moment. I don't think that the nested case-control design is going to be helpful here. The number of failures in the hormone group (\(n=5\)) is simply too small. For a two-sample analysis of survival, the "effective" sample size per group is, roughly, the harmonic mean of the numbers of failures. Here the harmonic mean is \( \dfrac{2 \times 5 \times 46}{5+46} \approx 9\), so the study is equivalent to one with about 18 failures, not 51. Therefore the power of any test of difference will be very small.

        The best that I can suggest is to use Gary King's cem command (coarsened exact matching; findit cem) on a very small number of matching factors. This, like any good matching method, will drop observations outside the area of common support, so expect to lose surgery patients and failures. Then you can look at the two Kaplan-Meier curves and make any other comparisons you wish.
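
        A rough sketch of what that might look like, using the variable names from the original post and cem's automatic coarsening (cem_matched is the matched-sample indicator the command creates; the data are assumed to be stset already):

        Code:
        * install via findit cem
        cem age year gr, treatment(tx)
        sts graph if cem_matched == 1, by(tx)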

        As an aside, King says that matching on propensity scores is a bad idea in general: http://gking.harvard.edu/publication...ed-formatching
        Last edited by Steve Samuels; 07 Feb 2016, 09:19.
        Steve Samuels
        Statistical Consulting
        [email protected]

        Stata 14.2



        • #5
          Another suggestion is the -ReLogit- package (rare event logistic regression):

          http://gking.harvard.edu/scholar_sof...sion/1-1-stata

          I'd then see whether the conclusions from the matched analysis (whatever flavor works) and -relogit- are consistent. If so, you can feel more confident in the analysis; if not, the problem might require further thought. Five failures is going to be difficult no matter what; there is some great discussion from Richard Williams here:

          https://www3.nd.edu/~rwilliam/stats3/RareEvents.pdf
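
          A sketch of the relogit check, with "event" standing in for the cancer-specific death indicator and the treatment and covariates taken from the original post (relogit installs from Gary King's site linked above):

          Code:
          relogit event tx age year gr race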
          ____________________________________________________
          Assistant Professor, Department of Biostatistics and Epidemiology
          School of Public Health and Health Sciences
          University of Massachusetts- Amherst
