I am seeking advice on which propensity score matching technique to use when preparing a matched dataset to compare survival when failure events are rare. I am working with data from a cancer registry and comparing cancer-specific survival between two treatment types (hormone vs. surgery) using a Cox proportional hazards model.
In my unmatched dataset, the groups have unbalanced covariate distributions for age, year of diagnosis, marital status and tumour grade. Cohort size is n=164 (2.5%) in the hormone group and n=6178 (97.5%) in the surgery group. There are 5 failure events in the hormone group and 46 failure events in the surgery group. Cancer-specific death is a very rare outcome.
To adjust for selection bias in treatment allocation to hormone therapy, I would like to use propensity score matching. I have tried doing this using the 'psmatch2' command with various matching methods. Ideally, I would like to use a fixed-ratio matching technique so that each treated patient has a set number of controls and the balance of characteristics in the unmatched and matched cohorts can be displayed clearly in a "Table 1" format along with a statistical test of difference (t-test or chi-square as appropriate).
Please see below for my attempts to date. A big thanks in advance to those who have insight on how to approach this puzzle!
---Attempt #1: 1-to-1 nearest neighbour matching ---------------------
code: psmatch2 tx age year gr race, neighbor(1) noreplacement
1:1 nearest neighbour matching creates a well-balanced cohort, but there are no failure events in my comparison (surgery) group, so I cannot use the matched cohort to model survival probabilities or hazard ratios.
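A minimal sketch of how balance and event counts in the matched sample could be checked after this attempt. It assumes a hypothetical 0/1 variable named died marking cancer-specific death (that name is not from my dataset description above), and it relies on pstest, which is installed with the psmatch2 package, and on the _weight variable that psmatch2 creates.

```stata
* Sketch only: died is a hypothetical 0/1 indicator of cancer-specific death
psmatch2 tx age year gr race, neighbor(1) noreplacement

* Covariate balance in the unmatched and matched samples
pstest age year gr race, both

* Failure events by treatment group, restricted to matched observations
* (unmatched observations have a missing _weight)
tabulate tx died if _weight < .
```

The tabulate line is what shows whether the matched surgery group contains any failure events at all.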
--- Attempt #2: 1-to-n nearest neighbour matching with a caliper -------------------
code: psmatch2 tx age year gr race, neighbor(5) caliper(0.01)
1-to-many nearest neighbour matching is the ideal scenario. I would like to have at least 5 comparison controls per treated patient and to use all of my treated group. However, I need to increase the number of controls to 5 in order to have at least 1 failure event in the control group, and at that point the covariates are no longer well balanced between groups.
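For context, this is roughly how I would expect to fit the Cox model on a 1-to-5 matched sample. It is a sketch under assumptions, not a settled analysis: time and died are hypothetical names for follow-up time and the 0/1 cancer-specific death indicator, and using the psmatch2 _weight variable as a probability weight is one possible choice rather than an established recommendation.

```stata
* Sketch only: time and died are hypothetical variable names
psmatch2 tx age year gr race, neighbor(5) caliper(0.01)

* Declare the survival data, carrying the matching weights from psmatch2;
* observations that were not matched have a missing _weight and are excluded
stset time if _weight < . [pweight = _weight], failure(died)

* Hazard ratio for hormone vs. surgery in the matched sample
stcox tx
```

With only a handful of failure events in either group, the hazard ratio from such a model would be expected to have a very wide confidence interval.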
--- Attempt #3: Radius matching ----------------------------------
code: psmatch2 tx age year gr race, radius caliper(0.001)
Radius matching uses nearly the full sample of the comparison group, so it isn't appealing: it leads to a loss of transparency, because I can't neatly display the balance of covariates across the unmatched and matched cohorts in my Table 1.