Creating start and stop time-variables for continuous time dependent exposure and running a Cox model

David Allen

Join Date: Nov 2019

Posts: 3
#1

09 Nov 2019, 04:32

Hi All

I hope you can help me with this query. I have attached a simplified dataset with all the variables needed (entrydate and exitdate are the start and end of the follow-up period, and time is the time at risk). Three individuals had the outcome and three didn't. Exposure is a time-varying covariate (a repeated measure), and I want to create sequential start and stop time variables corresponding to the exposures for each individual before running a Cox model. However, I am completely stuck and need help with the code to create these additional variables and then run the Cox model. I have searched the forum and other sites but didn't find what I needed. The data look like this:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id float(entrydate exitdate exposure exposuredate time outcome outcomedate)
1 20490 21185 51 20643 695 0     .
1 20490 21185 53 20951 695 0     .
2 20402 21185 50 20426 783 0     .
2 20402 21185 52 20646 783 0     .
2 20402 21185 47 20667 783 0     .
3 20573 21185 47 20612 612 0     .
3 20573 21185 48 20955 612 0     .
3 20573 21185 47 21062 612 0     .
4 20430 21111 50 20579 681 1 21111
4 20430 21111 49 20726 681 1 21111
4 20430 21111 47 20950 681 1 21111
5 19640 20473 42 19950 833 1 20473
5 19640 20473 39 20048 833 1 20473
5 19640 20473 43 20431 833 1 20473
6 18441 18483 79 18445  42 1 18483
6 18441 18483 78 18445  42 1 18483
end
format %tdDD/NN/CCYY entrydate
format %tdDD/NN/CCYY exitdate
format %tdDD/NN/CCYY exposuredate
format %tdDD/NN/CCYY outcomedate

Please help
Thanks

David Allen

Join Date: Nov 2019
Posts: 3

09 Nov 2019, 15:36

Hi again

I realised I would need a baseline exposure reading to be carried forward as a starting point before the first repeated exposure. I have added this baseline measure (exposure_0) taken on the entrydate in the new dataset below. So the question remains as above but with the added baseline exposure (perhaps an extra row per individual will be needed to incorporate this when creating start, stop variables?)

New data looks like this:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id float(exposure_0 entrydate exitdate exposure exposuredate time outcome outcomedate)
1 43 20490 21185 51 20643 695 0     .
1 43 20490 21185 53 20951 695 0     .
2 66 20402 21185 50 20426 783 0     .
2 66 20402 21185 52 20646 783 0     .
2 66 20402 21185 47 20667 783 0     .
3 77 20573 21185 47 20612 612 0     .
3 77 20573 21185 48 20955 612 0     .
3 77 20573 21185 47 21062 612 0     .
4 60 20430 21111 50 20579 681 1 21111
4 60 20430 21111 49 20726 681 1 21111
4 60 20430 21111 47 20950 681 1 21111
5 56 19640 20473 42 19950 833 1 20473
5 56 19640 20473 39 20048 833 1 20473
5 56 19640 20473 43 20431 833 1 20473
6 45 18441 18483 79 18445  42 1 18483
6 45 18441 18483 78 18445  42 1 18483
end
format %tdDD/NN/CCYY entrydate
format %tdDD/NN/CCYY exitdate
format %tdDD/NN/CCYY exposuredate
format %tdDD/NN/CCYY outcomedate

Am grateful for any help.

Last edited by David Allen; 09 Nov 2019, 15:44.

Clyde Schechter

Join Date: Apr 2014
Posts: 30100

17 Nov 2019, 14:05

So this is a complicated organization of this data that does not lend itself readily to use in Stata's survival analysis programs. It requires some major revisions.

First, let me call your attention to an error in your data, which the code I show does not attempt to fix. For id 6, there are two observations with exposuredate = 18445, and, on top of that, the exposure values are different. If this person actually had exposure measured twice on the same date, you need to combine the results in some way. It is more likely, I guess, that one of the two exposure dates is an error, however.
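
Same-day readings like these can be found mechanically. The following is a minimal sketch on ids 5 and 6 only: -duplicates tag- flags repeated id/exposuredate pairs, and averaging them with -collapse- is just one possible way to combine the results. The averaging is an assumption for illustration (if one of the dates is a typo, correcting the date is the better fix), and dup is a variable name made up here.

```stata
* Sketch: find and (optionally) combine repeated exposure dates within a person.
clear
input byte id float(exposure exposuredate)
5 42 19950
5 39 20048
5 43 20431
6 79 18445
6 78 18445
end
format %tdDD/NN/CCYY exposuredate

* dup = number of surplus copies of each id/exposuredate pair (0 means unique)
duplicates tag id exposuredate, generate(dup)
list id exposure exposuredate if dup > 0, sepby(id)

* ASSUMPTION: same-day readings are averaged into one observation.
* Note that -collapse- keeps only the variables named here.
collapse (mean) exposure, by(id exposuredate)

* Confirm that id and exposuredate now identify observations uniquely
isid id exposuredate
```

In the full dataset the other variables (entrydate, exitdate, outcome and so on) would need to be carried through the -collapse- or merged back afterwards.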

You have multiple events for each person. So the layout that Stata needs is one observation for each time interval during which everything remains constant, including a separate observation for entering and exiting the study. In this layout, the outcome should be designated as occurring only in the final observation. For anyone who never experienced the outcome, the outcome variable should be coded 0 in all of that person's observations. I think the code below does what you want:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id float(exposure_0 entrydate exitdate exposure exposuredate time outcome outcomedate)
1 43 20490 21185 51 20643 695 0     .
1 43 20490 21185 53 20951 695 0     .
2 66 20402 21185 50 20426 783 0     .
2 66 20402 21185 52 20646 783 0     .
2 66 20402 21185 47 20667 783 0     .
3 77 20573 21185 47 20612 612 0     .
3 77 20573 21185 48 20955 612 0     .
3 77 20573 21185 47 21062 612 0     .
4 60 20430 21111 50 20579 681 1 21111
4 60 20430 21111 49 20726 681 1 21111
4 60 20430 21111 47 20950 681 1 21111
5 56 19640 20473 42 19950 833 1 20473
5 56 19640 20473 39 20048 833 1 20473
5 56 19640 20473 43 20431 833 1 20473
6 45 18441 18483 79 18445  42 1 18483
6 45 18441 18483 78 18445  42 1 18483
end
format %tdDD/NN/CCYY entrydate
format %tdDD/NN/CCYY exitdate
format %tdDD/NN/CCYY exposuredate
format %tdDD/NN/CCYY outcomedate

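* Sanity checks: an outcome, if any, falls on the exit date
assert outcomedate == exitdate if outcome
assert missing(outcomedate) if !outcome
drop outcomedate

* Duplicate each person's first and last observations to make rows for entry and exit
by id (exposuredate), sort: gen expander = cond(_n == 1 | _n == _N, 2, 1)
expand expander
drop expander
* The first row per person carries the baseline exposure
by id (exposuredate), sort: replace exposure = exposure_0 if _n == 1
* date: entry date on the first row, exit date on the last, exposure date otherwise
gen date = exposuredate
format date %tdDD/NN/CCYY
by id (exposuredate): replace date = entrydate if _n == 1
by id (exposuredate): replace date = exitdate if _n == _N
* Only the final row per person may record the outcome
by id (exposuredate): replace outcome = 0 if _n < _N
* Analysis time: days since study entry
by id (exposuredate): gen elapsed_days = date - date[1]

stset elapsed_days, id(id) fail(outcome==1)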
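
One thing that may be worth checking after -stset-: with id(), Stata treats each record as the span (_t0, _t] that ends at that record's time, and takes the covariate values on the record to hold throughout that span. Under that convention a record with elapsed_days = 0 has zero length and is excluded, and each reading is attached to the span that ends on its measurement date. If a reading is instead meant to apply from its measurement date until the next reading, with exposure_0 covering entry up to the first reading, the exposure has to be shifted down one record. The following is a minimal sketch of that variant on ids 1 and 4 only. The names last, copy, stop and current_exposure are made up here, and the carry-forward rule is an assumption of this sketch, not something stated in the thread.

```stata
* Sketch: one record per span, carrying the exposure in force during that span.
clear
input byte id float(exposure_0 entrydate exitdate exposure exposuredate) byte outcome
1 43 20490 21185 51 20643 0
1 43 20490 21185 53 20951 0
4 60 20430 21111 50 20579 1
4 60 20430 21111 49 20726 1
4 60 20430 21111 47 20950 1
end

* Add one extra row per person for the final span (last reading to exit);
* copy = 1 marks the added row
by id (exposuredate), sort: gen byte last = (_n == _N)
expand 2 if last, generate(copy)
sort id exposuredate copy

* Each span ends at a reading date, or at exit for the added row (days since entry)
by id (exposuredate copy): gen stop = cond(copy, exitdate, exposuredate) - entrydate

* ASSUMPTION: a reading applies from its date until the next reading,
* and exposure_0 applies from entry until the first reading
by id (exposuredate copy): gen current_exposure = cond(_n == 1, exposure_0, exposure[_n-1])

* Only the final span per person may end in the outcome
by id (exposuredate copy): replace outcome = 0 if _n < _N

stset stop, id(id) failure(outcome == 1)
list id _t0 _t current_exposure _d, sepby(id)
```

Listing _t0, _t and _d alongside the covariate is a quick way to confirm that every span carries the value intended for it, whichever alignment is wanted.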

David Allen

Join Date: Nov 2019

Posts: 3
#4

17 Nov 2019, 15:46

Thank you so much, Clyde. This definitely seems to do the trick, and the start and stop times follow on as they should (I need to go through each line carefully to make sure that I understand what it does!). I am assuming that after stset, the stcox specification doesn't change at all, i.e., stcox independent_vars (age, gender, deprivation, etc.).

Best regards
David
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#5

17 Nov 2019, 16:12

Yes, just use -stcox- in the usual way.
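
For concreteness, here is a minimal sketch of the whole sequence on data already laid out as one record per interval. The rows are ids 1 to 5 from this thread (id 6 is left out because of its unresolved duplicate reading), typed in under the assumption that each record ends stop days after entry and carries the exposure in force during that span. The variable name stop is made up here. With five subjects the estimates mean nothing; the point is only the syntax.

```stata
* Sketch: -stset- multiple-record data, then -stcox- in the usual way.
clear
input byte id float(stop exposure) byte outcome
1 153 43 0
1 461 51 0
1 695 53 0
2  24 66 0
2 244 50 0
2 265 52 0
2 783 47 0
3  39 77 0
3 382 47 0
3 489 48 0
3 612 47 0
4 149 60 0
4 296 50 0
4 520 49 0
4 681 47 1
5 310 56 0
5 408 42 0
5 791 39 0
5 833 43 1
end

* With id(), each record's start time is the previous record's stop time
stset stop, id(id) failure(outcome == 1)

* The time-varying exposure enters like any other covariate
stcox exposure

* Baseline covariates would simply be appended, for example
*   stcox exposure age i.gender
* (age and gender are hypothetical here; they are not in the example data)
```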