Constructing a Time-to-Event Variable (days) with Multiple Potential Start Dates and Two Potential End Dates

Sean Hardiman

Join Date: Sep 2016

Posts: 53
#1

22 Jan 2022, 16:53

Hi everyone,

I want to create a variable that counts the days between two events. In my study, there are two potential starting events (Procedure A and Procedure B) and two potential ending dates (Outcome or Study End). For Procedure A, there is one index date from which I want to start counting time. For Procedure B, there is an index date, but this event may be followed by another event of the same type ('last procedure date'). For Procedure B, I want to start counting time from the later of the index date and the last procedure date. For both procedures, I want to count time to the date of either the Outcome or the Study End. Once I have this count in days, I want to convert it to weeks, months, and years (but that's a second-step problem).

My thinking so far is that I need to code the logic for the start date and then follow that up with code for the end date. The example code I have for this is in SAS and uses the INTCK function, which offers limited value in Stata, and I'm struggling to figure out how to get started. I'd welcome any suggestions of resources to read or code to look at to help me construct this variable.

Thank you!
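
As a rough sketch of the two-step idea in the question: Stata stores daily dates as integers, so once the start and end dates are numeric, the day count is a plain subtraction and the conversions are divisions. The variable names (start_str, end_str, start_d, end_d) and the two example rows are hypothetical, and the month and year divisors are average lengths, so the results are approximations rather than calendar-exact counts.

```stata
* Sketch only: hypothetical variable names and made-up dates.
clear
input str9(start_str end_str)
"01Jan2015" "31Dec2015"
"15Mar2016" "15Mar2019"
end

* Convert the strings to Stata daily dates (days since 01jan1960)
gen start_d = date(start_str, "DMY")
gen end_d   = date(end_str,   "DMY")
format start_d end_d %td

* Day count is a subtraction of the two daily dates
gen days = end_d - start_d

* Conversions use average lengths, so they are approximations
gen weeks  = days / 7
gen months = days / 30.4375
gen years  = days / 365.25

list, clean noobs
```

Whether average-length months and years are acceptable depends on the analysis; a calendar-exact count would need different logic.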
Clyde Schechter

Join Date: Apr 2014

Posts: 29795
#2

22 Jan 2022, 17:08

Without even trying hard, I can think of at least four differently organized data sets that would match what you have described about your data. Each of those would require an entirely different approach. I don't think anybody can help you without seeing example data. Please use the -dataex- command to do that. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
Comment
Sean Hardiman

Join Date: Sep 2016

Posts: 53
#3

23 Jan 2022, 11:48

Hi Clyde,

Thanks very much for your reply. Unfortunately, I'm prevented from sharing any of my data per terms of my data sharing agreement. Could you suggest alternatives that might be able to provide the information someone might need to offer help? Thinking I could share the relevant field names and their variable formatting?
Comment
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#4

23 Jan 2022, 12:26

I'm hung up on "potential" and what it means here, but my answer to my own question here may help.

You basically subtract the intervention date from each date, so time is measured relative to the intervention. If the intervention is in 2001, then 2000 = -1, 2001 = 0, 2002 = 1, and so on.
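
One reading of this relative-time idea, as a minimal sketch: with a calendar year and an intervention year on each row, the relative time is their difference. The variable names year, treat_year, and rel_time are hypothetical, as is the toy data.

```stata
* Sketch only: hypothetical variables, intervention assumed to be in 2001.
clear
input int(year treat_year)
2000 2001
2001 2001
2002 2001
end

* Time relative to the intervention: negative before, zero at, positive after
gen rel_time = year - treat_year

list, clean noobs
```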
Comment
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#5

23 Jan 2022, 12:33

Just as sort of a note, you can pick and choose which variables are in dataex, so the "confidential data" argument really isn't good, since you can anonymize it if needed. Even if you just gave us the intervention variables and time ids, that would be a giant help.

Trust me, showing what your data look like is invaluable to people who could help you. Without seeing what the dataset looks like, I can't really give any comments.
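
To illustrate picking variables, here is a minimal sketch that uses the auto dataset shipped with Stata as a stand-in (it is not the poster's data). -dataex- accepts a variable list and an -in- range, so only the listed columns and rows end up in the exported example.

```stata
* Sketch only: the auto dataset stands in for the real data.
sysuse auto, clear

* Export just three variables and the first five observations
dataex make price mpg in 1/5
```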
Comment
Sean Hardiman

Join Date: Sep 2016

Posts: 53
#6

25 Jan 2022, 22:57

Hi Jared, thanks. I appreciate that the 'confidential data' argument presents difficulties. I can only share with you the direction I have from the data stewards, and my supervisor, which is pretty explicit in its language and the consequences for even a hint of violation. Sorry about that. That said, I do have variables and values I can share:

PROC_TYPE: 1 or 2

Possible start date variables:

INDEX_PROC_DATE, ddMMMyyyy (may apply to either PROC_TYPE=1 or PROC_TYPE=2)
LAST_PROC_DATE, ddMMMyyyy (only possible for patients who have PROC_TYPE=2)

Outcome date variable:

OUTCOME, ddMMMyyyy

Study end for each patient is the earliest of: the occurrence of the outcome; three years from the INDEX_PROC_DATE (or from the LAST_PROC_DATE, if it exists for patients with PROC_TYPE=2); or the end of the study period (December 31, 2019).

Does that help? Failing this, I could dummy up some data based on the fields if that helps.

Last edited by Sean Hardiman; 25 Jan 2022, 23:56.
Comment
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#7

26 Jan 2022, 00:12

Sean Hardiman yeah, even a synthetic dataset is beyond helpful, so long as it accurately represents whatever your issue happens to be. In fact I do this all the time: if I can't give a good example with dataex (maybe it would be too many observations?), I literally just make up an example with, say, 5 units and 6 time periods, not of my real data but just something I've made from whole cloth.

So yeah, so long as the toy dataset is a good faith replication of the issue, dummying it up works great too.
Comment

Sean Hardiman

Join Date: Sep 2016

Posts: 53
#8

28 Jan 2022, 17:30

Thanks Jared, here's some made up data. Two procedure types. All patients have an index date. Some have a last procedure date but only if they are type 2 procedures. Death date occurs in some but not all patients. Follow-up is three years from index date, or study end date at 31Dec2019.

ID  PROC_TYPE  INDEX_PROC_DATE  LAST_PROC_DATE  DEATH_DATE
1   1          01Apr2013                        06Dec2015
2   2          04Mar2010        12May2010       03Jan2014
3   2          03Nov2009
4   1          16Dec2014
5   1          17Apr2009
6   1          30Oct2013
6   2          07Dec2006        03Jan2007
8   2          11Jun2015                        19Aug2015
9   1          14Apr2009
10  2          04Jun2009

Thanks!

Comment

Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#9

28 Jan 2022, 17:35

You need to use dataex so I can work with this.
1 like
Comment
Sean Hardiman

Join Date: Sep 2016

Posts: 53
#10

28 Jan 2022, 18:55

OK, thanks. Will do that!
Comment

Sean Hardiman

Join Date: Sep 2016
Posts: 53

#11

28 Jan 2022, 22:57

Thanks Jared, how's this?

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id proc_type) str9(index_proc_date last_proc_date death_date)
 1 1 "01Apr2013" ""          "06Dec2015"
 2 2 "04Mar2010" "12May2010" "03Jan2014"
 3 2 "03Nov2009" ""          ""         
 4 1 "16Dec2014" ""          ""         
 5 1 "17Apr2009" ""          ""         
 6 1 "30Oct2013" ""          ""         
 6 2 "07Dec2006" "03Jan2007" ""         
 8 2 "11Jun2015" ""          "19Aug2015"
 9 1 "14Apr2009" ""          ""         
10 2 "04Jun2009" ""          ""         
end
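
A possible way to go from this example to a day count, offered as a sketch rather than a settled answer. It re-enters a subset of the rows above so that it runs on its own, and it rests on assumptions not confirmed in the thread: the clock starts at the later of the index and last procedure dates for proc_type 2, follow-up is capped three calendar years after that start date, and death is the outcome. The new variable names (index_d, last_d, death_d, start_d, cap_d, end_d, event, days) are hypothetical.

```stata
* Sketch only: subset of the -dataex- rows above; rules are assumptions.
clear
input byte(id proc_type) str9(index_proc_date last_proc_date death_date)
 1 1 "01Apr2013" ""          "06Dec2015"
 2 2 "04Mar2010" "12May2010" "03Jan2014"
 3 2 "03Nov2009" ""          ""
 4 1 "16Dec2014" ""          ""
 8 2 "11Jun2015" ""          "19Aug2015"
end

* Convert string dates to numeric Stata daily dates
gen index_d = date(index_proc_date, "DMY")
gen last_d  = date(last_proc_date,  "DMY")
gen death_d = date(death_date,      "DMY")
format index_d last_d death_d %td

* Start: index date, or the later of index/last procedure for type 2.
* max() ignores missing arguments, so a missing last_d leaves index_d.
gen start_d = index_d
replace start_d = max(index_d, last_d) if proc_type == 2

* Assumed cap: three calendar years after the start date.
* mdy() returns missing for 29 Feb in a non-leap year, so fall back to 28 Feb.
gen cap_d = mdy(month(start_d), day(start_d), year(start_d) + 3)
replace cap_d = mdy(2, 28, year(start_d) + 3) if missing(cap_d) & !missing(start_d)

* End: earliest of death, the cap, and study end; min() ignores missing values
gen end_d = min(death_d, cap_d, mdy(12, 31, 2019))
format start_d cap_d end_d %td

* Event indicator: death observed on or before the end of follow-up
gen byte event = !missing(death_d) & death_d <= end_d

* Time to event or censoring, in days
gen days = end_d - start_d

list id proc_type start_d end_d event days, clean noobs
```

If the goal is a survival analysis, -stset- with its origin() option can also take the start and end dates directly, which may make a separate days variable unnecessary.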

Comment

Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#12

28 Jan 2022, 23:24

This is much better. I'll look at it when I get up tomorrow morning, I'm tired.
Comment
Sean Hardiman

Join Date: Sep 2016

Posts: 53
#13

29 Jan 2022, 13:14

Thank you! Appreciate you taking the time to help me!
Comment
