Cox Proportional Hazards regression for recurrent events

Chris Williams

Join Date: Oct 2015

Posts: 3
#1


07 Oct 2015, 06:28

Currently I am analyzing factors associated with a certain adverse event (A) in patients who are treated with drug (B). The data is from a longitudinal observational cohort (10 years of follow-up, patients were assessed annually, 11 assessments per patient in total). The data is sorted by patient ID and time (in years, range 0-10). The data is shaped long: each assessment is a single observation (row) in the database (so for 100 patients I have 1100 observations).

Some patients used drug B from the start of follow-up, while others started using this drug at a later time, so the moment they become at risk can vary on an individual level. In some patients the event of interest can occur multiple times over follow-up, so I want to do a Cox regression for recurrent events to take into account all events.

Variables:

Code:

'ID' = patient ID
'time' = time in years of follow-up
'drug_b' = using drug B (scored 1/0)
'event_a' = adverse event of interest (scored 1/0)
'event_b' = other events, such as death (scored 1/0) (I want to censor patients from the moment event_b occurs)

I have come to the following code to set my data as survival data, using the Stata manual:

Code:

stset time, id(ID) failure(event_a==1) enter(time==0) origin(drug_b==1) exit(event_b==1 time .)

I have 2 questions regarding this analysis:
1. Would this be the correct code to set up the data as survival data for a recurrent event analysis? Having inspected the data and the survival variables (_st _t _t0 _d), I think it is.

2. And a second question: based on the nature of my event of interest, I would think that the risk of this event occurring is higher when patients have already had this event before. Once I have identified my 'id' variable (as I did in the stset command above), does Stata take this into account? For example: if I have a patient with 1 event and another patient with 3 events over time, does Stata treat this as 4 totally different events (as if 4 patients all had 1 event, with of course different time-to-event), or does it take into account that 3 events actually occurred in 1 patient and are possibly correlated? Does the 'cluster' option when doing the Cox regression help with this (cluster patients by ID, using cluster(ID))?

Thanks in advance!
Andrew Lover

Join Date: Apr 2014

Posts: 182
#2

07 Oct 2015, 20:58

FAQ here:

http://www.stata.com/support/faqs/st...ure-time-data/

The higher risk after a first event can be handled as 'frailty':

http://www.stata.com/statalist/archi.../msg00497.html

____________________________________________________
Assistant Professor, Department of Biostatistics and Epidemiology
School of Public Health and Health Sciences
University of Massachusetts- Amherst
John Moran

Join Date: Oct 2015

Posts: 17
#3

08 Oct 2015, 00:11

Further to Andrew Lover's comments, the issue of "frailty" and recurrent events has been handled formally in Xu Y, Cheung YB. Frailty models and frailty-mixture models for recurrent event times. Stata Journal. 2015;15(1):135-154 (see also: Xu Y, Cheung YB, Lam KF, Milligan P. Estimation of summary protective efficacy using a frailty mixture model for recurrent event time data. Statistics in Medicine. Dec 20 2012;31(29):4023-4039).

john moran
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#4

08 Oct 2015, 10:14

We can't tell from the information you've provided so far, but if your data are heavily grouped, use a discrete/grouped data method. See this 2010 presentation by Fiona Steele. The analyses can be done with, e.g., Stata's mixed-effects logistic command meqrlogit.

Your stset would not have been correct in any case. The origin() option would have excluded from the analysis all exposure without drug_b. Also, patients' experience before drug_b started should be included in the analysis. If not, you would have a biased sample of experience without drug_b. The way to analyze drug_b is as a time-dependent binary covariate. All of this can be handled in a grouped data setup.
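A minimal sketch of the grouped-data setup described above, assuming the long-format variables from post #1 (ID, time, drug_b, event_a, event_b) and entirely made-up toy values. It keeps each patient's rows up to and including the first event_b, then fits a discrete-time logit in which drug_b changes from row to row and so acts as a time-dependent covariate. The cluster-robust logit is a simpler stand-in chosen for illustration, not the random-effects model (meqrlogit) mentioned above, and the helper variable past_b is hypothetical.

```stata
* Toy person-period data (hypothetical values), one row per patient-year
clear
input ID time drug_b event_a event_b
1 0 1 0 0
1 1 1 1 0
1 2 1 0 0
1 3 1 1 0
2 0 0 0 0
2 1 1 0 0
2 2 1 1 0
2 3 1 0 1
3 0 0 0 0
3 1 0 1 0
3 2 1 0 0
3 3 1 1 0
4 0 1 1 0
4 1 1 0 1
4 2 1 0 0
4 3 1 0 0
5 0 0 0 0
5 1 0 0 0
5 2 1 0 0
5 3 1 0 0
end

* Flag rows strictly after the first event_b (assumption: the row where
* event_b occurs is kept), then drop them to censor the patient
bysort ID (time): gen byte past_b = sum(event_b) - event_b
drop if past_b > 0

* Discrete-time hazard model: drug_b varies by row, so it is time-dependent;
* standard errors are clustered on patient
logit event_a drug_b time, vce(cluster ID)
```

A linear term in time is the crudest possible baseline hazard; with real data, i.time or a spline would usually be a better starting point.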

Steve Samuels
Statistical Consulting

Stata 14.2
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#5

08 Oct 2015, 11:36

Steve gives great (and generous) advice as ever. My small remark is that Fiona has some related materials, dated 2013, available too: see "Course on multilevel discrete-time event history analysis (1.8MB). Materials include lectures slides, Stata practicals, datasets and syntax." at http://http://stats.lse.ac.uk/steele/
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#6

09 Oct 2015, 06:27

Your original question was how Stata keeps track of prior events. The answer is that it won't unless you explicitly tell it how to. I have not yet read the references John, Andrew, and Stephen gave, but I'm sure these will have examples. See also Sections 3.23 and 3.24 of this 2009 FAQ by Mario Cleves on multiple failure time data.
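One way to tell Stata about prior events explicitly can be sketched as follows, assuming long-format data with the variable names from post #1 and made-up toy values. The running count n_prior is a hypothetical helper variable: it holds the number of events a patient had before the current record, so it can enter the model as a time-varying covariate.

```stata
* Toy long-format data (hypothetical values)
clear
input ID time event_a
1 1 0
1 2 1
1 3 1
1 4 0
2 1 1
2 2 0
2 3 0
2 4 1
end

* Running count of events strictly before the current record
bysort ID (time): gen n_prior = sum(event_a) - event_a

* Multiple-failure data: exit(time .) keeps a patient at risk after a failure
stset time, id(ID) failure(event_a==1) exit(time .)

* Inspect the result: each record covers the interval (_t0, _t]
list ID time event_a n_prior _t0 _t _d, sepby(ID)

* Model call to try on real data (not run on this toy example):
* stcox n_prior, vce(cluster ID)
```

Clustering on ID adjusts the standard errors for within-patient correlation, while n_prior lets the hazard itself depend on event history; these are separate choices.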

Last edited by Steve Samuels; 09 Oct 2015, 06:36.

Steve Samuels
Statistical Consulting

Stata 14.2
Chris Williams

Join Date: Oct 2015

Posts: 3
#7

16 Oct 2015, 09:02

Thanks Steve, Stephen, John and Andrew for your helpful comments. I have been looking into both the guide by Mario Cleves and the frailty model.

Originally posted by Steve Samuels:

Your stset would not have been correct in any case. The origin() option would have excluded from the analysis all exposure without drug_b. Also, patients' experience before drug_b started should be included in the analysis. If not, you would have a biased sample of experience without drug_b. The way to analyze drug_b is as a time-dependent binary covariate. All of this can be handled in a grouped data setup.

Should I also include drug_b as a time-dependent binary covariate if patients cannot experience the failure event while they are not exposed to drug_b? In other words, biologically it would be impossible for them to experience the failure event before they started drug_b, and my hypothesis concerns certain predictors explaining the occurrence of the failure event in drug_b users.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#8

20 Oct 2015, 11:45

Hmm. I apologize. Your original post clearly stated that you were interested only in events of patients on drug B. I misread it to mean that you were also interested in comparing experience on drug B with experience without it. You are not, so you can analyze a data set of patients on drug B only. One question: does the duration of illness prior to the start of drug B have any impact on the likelihood of the failure event? If so, you can set the enter() option to the date that drug B started and origin() to the start of illness. This would compare people on drug B who had been ill for equal lengths of time. But perhaps time on drug B is more important. Then set the origin to be the start of drug B, but include duration of illness as either a fixed predictor (duration when drug B started) or a time-varying predictor. Note that, for the recurrent event analysis, you still have to decide whether or not to restart the clock after each failure event.
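The two time scales described above can be sketched with stset's enter() and origin() options. This is only an illustration on made-up single-record data: drug_start and t_end are hypothetical variables, both measured in years since illness onset, and ill_dur is a hypothetical name for the fixed predictor.

```stata
* Toy single-record data (hypothetical values), times in years since illness onset
clear
input ID drug_start t_end event_a
1 2   5 1
2 0.5 4 0
3 3   6 1
end

* Option 1: analysis time = time since illness onset, delayed entry at drug start
stset t_end, id(ID) failure(event_a==1) enter(time drug_start)
list ID _t0 _t _d

* Option 2: analysis time = time on drug B, illness duration as a fixed covariate
gen ill_dur = drug_start
stset t_end, id(ID) failure(event_a==1) origin(time drug_start)
list ID ill_dur _t0 _t _d
```

Under option 1 the risk sets compare patients at the same illness duration; under option 2 they compare patients at the same time on drug B, and ill_dur would go into the model as a covariate.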

Steve Samuels
Statistical Consulting

Stata 14.2
Chris Williams

Join Date: Oct 2015

Posts: 3
#9

21 Oct 2015, 08:30

Originally posted by Steve Samuels:

Hmm. I apologize. Your original post clearly stated that you were interested only in events of patients on drug B. I misread it to mean that you were also interested in comparing experience on drug B with experience without it. You are not, so you can analyze a data set of patients on drug B only. One question: does the duration of illness prior to the start of drug B have any impact on the likelihood of the failure event? If so, you can set the enter() option to the date that drug B started and origin() to the start of illness. This would compare people on drug B who had been ill for equal lengths of time. But perhaps time on drug B is more important. Then set the origin to be the start of drug B, but include duration of illness as either a fixed predictor (duration when drug B started) or a time-varying predictor. Note that, for the recurrent event analysis, you still have to decide whether or not to restart the clock after each failure event.

Thank you for the clarification. For now I have chosen the second approach (origin set to the start of drug B, with illness duration included as a fixed predictor). With regard to restarting the clock: at the moment I am restarting the clock (by generating new IDs per failure event, as described in the Stata stcox manual). We plan on doing a subanalysis using 'failure event in first x [to be determined] years of follow-up' as a predictor for later (re)occurrence of failure events.
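For readers following along, restarting the clock can be sketched roughly as below. This is not necessarily the exact procedure from the manual: it assumes made-up interval data in which each row runs from t0 to t1 and ends in either a failure or censoring, and episode, epi_id and gap are hypothetical variable names.

```stata
* Toy interval data (hypothetical values): each row is one at-risk episode
clear
input ID t0 t1 event_a
1 0 3  1
1 3 7  1
1 7 10 0
2 0 6  1
2 6 10 0
end

* Episode counter within patient and a new ID per episode
bysort ID (t0): gen episode = _n
egen long epi_id = group(ID episode)

* Gap time: the clock restarts at 0 at the beginning of each episode
gen gap = t1 - t0
stset gap, id(epi_id) failure(event_a==1)
list ID episode epi_id _t0 _t _d, sepby(ID)

* Model call to try on real data (x is a placeholder covariate):
* stcox x, vce(cluster ID)
```

Because the new IDs hide the fact that several episodes belong to one patient, clustering on the original ID is one way to keep the within-patient correlation in the standard errors.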

I have two final, more basic, questions with regard to Cox regression and fixed/time-varying predictors. Let's assume I am just doing a time-to-first-failure analysis (not a recurrent event analysis) using Cox regression.

To give an example on my data, let's say this is my dataset:
Patient_ID  Time  Age  Drug B  Failure event  Pain (0-10)
1           0     35   1       0              3.5
1           1     36   1       0              4.2
1           2     37   1       0              4.3
1           3     38   1       1              5.1
2           0     24   0       0              6.1
2           1     25   1       0              5.4
2           2     26   1       0              5.1
2           3     27   1       0              6.0

1. As I have panel data (multiple but equal number of observations per patient, at each observation the follow-up time is specified), is specifying the 'id'-option ('id(Patient_ID)') in the stset command sufficient for Stata to take into account these multiple observations per patient?

2. Related to the first question: as my data is panel data, do predictors that vary over time (for example, 'Age' or 'Pain' in the data above) need to be specified as time-varying just because they vary during follow-up? Or is the time-varying option best used if you expect the effect of a predictor to vary over time? Having read http://www.stata.com/statalist/archi.../msg00687.html, I think I don't have to specify them as tvc. (I see 'stsplit' mentioned a lot and have read the Stata manual on stsplit, but I don't see the benefit of splitting my data.)
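As a reference point for both questions, here is a minimal sketch of how stset reads the example rows above, with the column names shortened to valid Stata names (patient_id, time, age, drug_b, failure, pain are assumptions, not the original variable names). With id() specified, each row is treated as covering the interval (_t0, _t], and the covariate values on that row apply to that interval, so predictors that simply change between rows need no extra option. stcox's tvc() option is a different tool: it interacts a covariate with a function of analysis time.

```stata
* The example table, entered with shortened (assumed) variable names
clear
input patient_id time age drug_b failure pain
1 0 35 1 0 3.5
1 1 36 1 0 4.2
1 2 37 1 0 4.3
1 3 38 1 1 5.1
2 0 24 0 0 6.1
2 1 25 1 0 5.4
2 2 26 1 0 5.1
2 3 27 1 0 6.0
end

* Multiple records per patient: id() links them into one history
stset time, id(patient_id) failure(failure==1)

* _t0 and _t show the interval each row covers; _st flags rows that are used
list patient_id time age pain _t0 _t _d _st, sepby(patient_id)
```

One thing to check in the listing: the rows at time 0 cover a zero-length interval and should be flagged _st==0, which means the baseline values on those rows are not used as they stand. It also means the values measured at the end of each year are applied to the year just ended, which may or may not be the intended alignment.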

Last edited by Chris Williams; 21 Oct 2015, 08:33.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#10

21 Oct 2015, 10:56

You apparently missed my October 8 comment about a grouped data analysis, Chris. As your new questions have nothing to do with recurrent events, I suggest that you start a new topic.

Steve Samuels
Statistical Consulting

Stata 14.2
LarsFolkestad

Join Date: Sep 2014

Posts: 165
#11

22 Oct 2015, 01:10

Originally posted by Stephen Jenkins:

Steve gives great (and generous) advice as ever. My small remark is that Fiona has some related materials, dated 2013, available too: see "Course on multilevel discrete-time event history analysis (1.8MB). Materials include lectures slides, Stata practicals, datasets and syntax." at http://http://stats.lse.ac.uk/steele/

It seems that the link is dead. Can you provide a new link?

lars
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#12

22 Oct 2015, 03:25

Sorry, a gremlin introduced an extra "http://". Try this URL (easily found via Google too): http://stats.lse.ac.uk/steele/