  • Constructing a Time-to-Event Variable (days) with Multiple Potential Start Dates and Two Potential End Dates

    Hi everyone,

    I want to create a variable that is a count of days between two events. In my study, there are two potential starting events (Procedure A and Procedure B) and two potential ending dates (Outcome or Study End). For Procedure A, there is one index date from which I want to start counting time. For Procedure B, there is an index date, but this event may be followed by another event of the same type ('last procedure date'). For Procedure B, I want to start counting time from the later of the index date and the last procedure date. For both procedures, I want to count time to the date of either the Outcome or the Study End. Once I have this count in days, I want to convert it to weeks, months, and years (but that's a second-step problem).

    My thinking so far is that I need to code the logic for the start date and then follow that up with code for the end date. The example code I have for this is in SAS and uses the INTCK function, which is of limited use to me in Stata, and I'm struggling to figure out how to get started. I'd welcome any suggestions of resources to read or code to look at to help me construct this variable.

    Thank you!
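For readers arriving from SAS, a minimal sketch of the day count and the later unit conversion may help. Stata stores daily dates as integer day counts, so plain subtraction of two daily dates gives the number of days between them. The variable names (start_str, end_str) and the two rows are hypothetical, and the divisors for months and years are averaging assumptions, not calendar-exact conversions.

```stata
* Hypothetical two-row example: dates typed as ddMMMyyyy strings
clear
input str9(start_str end_str)
"01Apr2013" "06Dec2015"
"04Mar2010" "03Jan2014"
end

* Convert the strings to Stata daily dates (integers) and display them as dates
gen start_date = date(start_str, "DMY")
gen end_date   = date(end_str, "DMY")
format start_date end_date %td

* Elapsed days is a simple difference of the two integer dates
gen days = end_date - start_date

* Second step: rescale days (months and years use average lengths, an assumption)
gen double weeks  = days / 7
gen double months = days / 30.4375
gen double years  = days / 365.25

list start_date end_date days weeks months years
```

If calendar-exact months or years are needed instead of averages, that would call for a different rule and should be decided before rescaling.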

  • #2
    Without even trying hard, I can think of at least four differently organized data sets that would match what you have described about your data. Each of those would require an entirely different approach. I don't think anybody can help you without seeing example data. Please use the -dataex- command to do that. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.


    • #3
      Hi Clyde,

      Thanks very much for your reply. Unfortunately, I'm prevented from sharing any of my data per terms of my data sharing agreement. Could you suggest alternatives that might be able to provide the information someone might need to offer help? Thinking I could share the relevant field names and their variable formatting?


      • #4
        I'm hung up on "potential" and what it means here, but my answer to my own question here may help.

        You basically subtract the intervention date from the date variable, so the intervention itself sits at 0. If it's 2001, then 2000 becomes -1, 2002 becomes 1, and so on.


        • #5
          Just as sort of a note, you can pick and choose which variables are in dataex and which are not, so the "confidential data" argument really isn't good, since you can anonymize it if needed. Even if you just gave us the intervention variables and time ids, that would be a giant help.

          Trust me, showing what your data look like is invaluable to people who could help you. Without seeing what the dataset looks like, I can't really give any comments.


          • #6
            Hi Jared, thanks. I appreciate that the 'confidential data' argument presents difficulties. I can only share with you the direction I have from the data stewards, and my supervisor, which is pretty explicit in its language and the consequences for even a hint of violation. Sorry about that. That said, I do have variables and values I can share:

            PROC_TYPE: 1 or 2

            Possible start date variables:

            INDEX_PROC_DATE, ddMMMyyyy (may apply to either PROC_TYPE=1 or PROC_TYPE=2)
            LAST_PROC_DATE, ddMMMyyyy (only possible for patients who have PROC_TYPE=2)
            OUTCOME, ddMMMyyyy

            Study end for each patient is the occurrence of the outcome; three years from the INDEX_PROC_DATE (or from the LAST_PROC_DATE, if it exists, for patients with PROC_TYPE=2); or the end of the study period (December 31, 2019).

            Does that help? Failing this, I could dummy up some data based on the fields if that helps.
            Last edited by Sean Hardiman; 25 Jan 2022, 23:56.
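The end-of-follow-up rule described in this post can be sketched as the earliest of three candidate dates. This is only an illustration under stated assumptions: the three rows are made up, start_date is taken as already chosen, the three-year window is counted from that start date, and "whichever comes first" is assumed for the end date. The names start_date, outcome_date, three_years and end_date are hypothetical.

```stata
* Hypothetical three-row example with dates built directly as Stata daily dates
clear
set obs 3
gen start_date = mdy(4, 1, 2013)
replace start_date = mdy(5, 12, 2010) in 2
replace start_date = mdy(12, 16, 2018) in 3

* Outcome observed only for the first patient; missing means no outcome
gen outcome_date = mdy(12, 6, 2015) in 1

* Same calendar day three years later
gen three_years = mdy(month(start_date), day(start_date), year(start_date) + 3)
* Assumption: a 29 Feb start with no 29 Feb three years on falls back to 28 Feb
replace three_years = mdy(2, 28, year(start_date) + 3) if missing(three_years)

* min() skips missing arguments, so patients without an outcome are handled
gen end_date = min(outcome_date, three_years, td(31dec2019))
format start_date outcome_date three_years end_date %td

gen days = end_date - start_date
list
```

If the three-year window should instead run from the index date regardless of any later procedure, only the line that builds three_years would change.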


            • #7
              Sean Hardiman yeah, even a synthetic dataset is beyond helpful, so long as it accurately represents whatever your issue happens to be. In fact, I do this all the time: if I can't give a good example with dataex (maybe it would be too many observations?), I literally just make up an example with, say, 5 units and 6 time periods, not of my real data but just something I've made from whole cloth.

              So yeah, so long as the toy dataset is a good faith replication of the issue, dummying it up works great too.


              • #8
                Thanks Jared, here's some made up data. Two procedure types. All patients have an index date. Some have a last procedure date but only if they are type 2 procedures. Death date occurs in some but not all patients. Follow-up is three years from index date, or study end date at 31Dec2019.

                ID  PROC_TYPE  INDEX_PROC_DATE  LAST_PROC_DATE  DEATH_DATE
                1   1          01Apr2013                        06Dec2015
                2   2          04Mar2010        12May2010       03Jan2014
                3   2          03Nov2009
                4   1          16Dec2014
                5   1          17Apr2009
                6   1          30Oct2013
                6   2          07Dec2006        03Jan2007
                8   2          11Jun2015                        19Aug2015
                9   1          14Apr2009
                10  2          04Jun2009

                Thanks!


                • #9
                  You need to use dataex so I can work with this.


                  • #10
                    OK, thanks. Will do that!


                    • #11
                      Thanks Jared, how's this?

                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input byte(id proc_type) str9(index_proc_date last_proc_date death_date)
                       1 1 "01Apr2013" ""          "06Dec2015"
                       2 2 "04Mar2010" "12May2010" "03Jan2014"
                       3 2 "03Nov2009" ""          ""         
                       4 1 "16Dec2014" ""          ""         
                       5 1 "17Apr2009" ""          ""         
                       6 1 "30Oct2013" ""          ""         
                       6 2 "07Dec2006" "03Jan2007" ""         
                       8 2 "11Jun2015" ""          "19Aug2015"
                       9 1 "14Apr2009" ""          ""         
                      10 2 "04Jun2009" ""          ""         
                      end
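As posted, the three date variables in this example are strings (str9), so they need converting before any date arithmetic. The following is a minimal sketch, not a full answer: it repeats the posted data so it runs on its own, converts each string to a Stata daily date, and picks the later of the index and last procedure dates as the start. The new names (the _d suffix, start_date, dup_id) are hypothetical. Note that id 6 appears twice in the posted data, which the sketch flags and leaves unchanged.

```stata
* Example data as posted via -dataex-
clear
input byte(id proc_type) str9(index_proc_date last_proc_date death_date)
 1 1 "01Apr2013" ""          "06Dec2015"
 2 2 "04Mar2010" "12May2010" "03Jan2014"
 3 2 "03Nov2009" ""          ""
 4 1 "16Dec2014" ""          ""
 5 1 "17Apr2009" ""          ""
 6 1 "30Oct2013" ""          ""
 6 2 "07Dec2006" "03Jan2007" ""
 8 2 "11Jun2015" ""          "19Aug2015"
 9 1 "14Apr2009" ""          ""
10 2 "04Jun2009" ""          ""
end

* Convert each ddMMMyyyy string to a numeric daily date; "" becomes missing
foreach v of varlist index_proc_date last_proc_date death_date {
    gen `v'_d = date(`v', "DMY")
    format `v'_d %td
}

* max() ignores missing values, so patients without a last procedure date
* simply start from their index date
gen start_date = max(index_proc_date_d, last_proc_date_d)
format start_date %td

* Flag ids that occur more than once (dup_id counts the extra copies)
duplicates tag id, generate(dup_id)

list id proc_type start_date death_date_d dup_id
```

Whether the repeated id 6 is a typo in the made-up data or two procedures for one patient affects how rows should be treated, so it is worth settling before computing follow-up time.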


                      • #12
                        This is much better. I'll look at it when I get up tomorrow morning, I'm tired.


                        • #13
                          Thank you! Appreciate you taking the time to help me!
