Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Overlaps of dates_LOOP

    Dear Stata,


    I have been using the following command to deal with overlaps:

    bysort pat_id (admindate3 disdate5): gen adm_tag=_n

    gen curadm=admindate3
    format curadm %d

    // carry admission date forward if last discharge if after current admission

    bys pat_id (admindate3 disdate5) : replace curadm = curadm[_n-1] if admindate3<=disdate5[_n-1] & adm_tag!=1
    // discharge is largest discharge date with current admission.

    bys pat_id curadm (disdate5) : gen curdis=disdate5[_N]
    format curdis %d


    However, this code doesnt seem to work for a dataset that looks like the data below:
    Patient Start date End date
    A 03/06/2010 29/06/2010
    A 04/06/2010 04/06/2010
    A 05/06/2010 05/06/2010
    A 06/06/2010 06/06/2010
    A 07/06/2010 07/06/2010
    A 08/06/2010 08/06/2010
    A 09/06/2010 09/06/2010
    A 10/06/2010 10/06/2010
    A 11/06/2010 11/06/2010
    A 12/06/2010 12/06/2010
    A 13/06/2010 13/06/2010
    I wanted the datset to replace all the dates that overlap and give the dataset below:
    Patient Start date End date Start date2 End date 2
    A 03/06/2010 29/06/2010 03/06/2010 29/06/2010
    A 04/06/2010 04/06/2010 03/06/2010 29/06/2010
    A 05/06/2010 05/06/2010 03/06/2010 29/06/2010
    A 06/06/2010 06/06/2010 03/06/2010 29/06/2010
    A 07/06/2010 07/06/2010 03/06/2010 29/06/2010
    A 08/06/2010 08/06/2010 03/06/2010 29/06/2010
    A 09/06/2010 09/06/2010 03/06/2010 29/06/2010
    A 10/06/2010 10/06/2010 03/06/2010 29/06/2010
    A 11/06/2010 11/06/2010 03/06/2010 29/06/2010
    A 12/06/2010 12/06/2010 03/06/2010 29/06/2010
    A 13/06/2010 13/06/2010 03/06/2010 29/06/2010
    If I wanted the above results then I would have to create a new variable each time and replace the dates SEE BELOW.
    Patient Start date End date Start date_1 End date_1 Start date_1 End date_1
    A 03/06/2010 29/06/2010 03/06/2010 29/06/2010 03/06/2010 29/06/2010
    A 04/06/2010 04/06/2010 03/06/2010 29/06/2010 03/06/2010 29/06/2010
    A 05/06/2010 05/06/2010 05/06/2010 05/06/2010 03/06/2010 29/06/2010
    A 06/06/2010 06/06/2010 06/06/2010 06/06/2010 06/06/2010 06/06/2010
    A 07/06/2010 07/06/2010 07/06/2010 07/06/2010 07/06/2010 07/06/2010
    A 08/06/2010 08/06/2010 08/06/2010 08/06/2010 08/06/2010 08/06/2010
    A 09/06/2010 09/06/2010 09/06/2010 09/06/2010 09/06/2010 09/06/2010
    A 10/06/2010 10/06/2010 10/06/2010 10/06/2010 10/06/2010 10/06/2010
    A 11/06/2010 11/06/2010 11/06/2010 11/06/2010 11/06/2010 11/06/2010
    A 12/06/2010 12/06/2010 12/06/2010 12/06/2010 12/06/2010 12/06/2010
    A 13/06/2010 13/06/2010 13/06/2010 13/06/2010 13/06/2010 13/06/2010

    I would be greatful for any advice or help.

  • #2
    I'm not sure if this is what you are looking for but the following example shows a technique to track overlapping stays.

    Code:
    clear
    input str1 pat_id str20 admission str20 discharge
    A "03/06/2010" "11/06/2010"
    A "04/06/2010" "04/06/2010"
    A "05/06/2010" "05/06/2010"
    A "06/06/2010" "06/06/2010"
    A "07/06/2010" "07/06/2010"
    A "08/06/2010" "08/06/2010"
    A "09/06/2010" "09/06/2010"
    A "10/06/2010" "10/06/2010"
    A "11/06/2010" "11/06/2010"
    A "12/06/2010" "12/06/2010"
    A "13/06/2010" "13/06/2010"
    B "10/06/2010" "12/06/2010"
    B "11/06/2010" "11/06/2010"
    B "12/06/2010" "12/06/2010"
    B "14/06/2010" "14/06/2010"
    end
    
    gen adate = date(admission,"DMY")
    gen ddate = date(discharge,"DMY")
    format %d *date
    
    * duplicate each observation and copy dates
    bysort pat_id: gen subid = _n
    expand 2
    bys pat_id subid: gen date2 = cond(_n==1,adate[1],ddate[2])
    format %d date2
    
    * identify admission and discharge events
    by pat_id subid: gen event = cond(_n==1,1,-1)
    
    * track stays by using a running sum of events; on any given 
    * day, put the admit events first
    gsort pat_id date2 -event
    by pat_id: gen eventsum = sum(event)
    by pat_id: gen stay = sum(_n == 1 | eventsum[_n-1] == 0)
    
    * record each stay's admission and discharge dates
    sort pat_id stay date2
    by pat_id stay: gen adateall = date2[1]
    by pat_id stay: gen ddateall = date2[_N]
    format %d *dateall
    
    * restore data to original format
    bys pat_id subid: keep if _n == 1
    list pat_id adate ddate adateall ddateall, noobs sepby(pat_id stay)

    Comment


    • #3
      Thank you very much for your help I this this would work. Could you explain why you expand it and why you chose this particular method (as its not what I would have thought of)?

      Comment


      • #4
        The question is addressed to Robert Picard, but

        1. Robert's comments are instructive and his whole strategy is very much based on getting a data structure in which dates of various kinds can all be put in the (natural) date order.

        2. What I take to be the same underlying idea was discussed independently in

        SJ-13-1 dm0068 . . . . . Stata tip 114: Expand paired dates to pairs of dates
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q1/13 SJ 13(1):217--219 (no commands)
        tip on using expand to deal with paired dates

        http://www.stata-journal.com/article...article=dm0068

        Comment


        • #5
          The expand 2 strategy makes it possible to treat the admission date and discharge date as separate observations. When ordered by date, you can then track admissions (+1) and discharge (-1) events using a running sum. Since you can have overlapping stays, when the running sum reaches 0 the patient is not in the hospital anymore. I proposed a similar approach here as a solution to how to track the number of patients in an hospital emergency room when a new patient is admitted.

          Comment


          • #6
            Thank you that Robet it was helpful. Your comments with the stata code were very instructive.

            Comment


            • #7
              Thank you for your comment Nick will have a read.

              Comment


              • #8
                But you have to purchase the item.

                Comment


                • #9
                  Correct. but see e.g. thread starting at http://www.stata.com/statalist/archi.../msg01163.html for discussion.

                  Comment


                  • #10
                    Thank you Nick.

                    Comment


                    • #11
                      Dear Robert, if I wanted to count the number of overlaps, or identify the records which overlap,what would be the best way to do this?

                      Comment


                      • #12
                        *Or Nick

                        Comment


                        • #13
                          In my sample code, the variable stay identifies overlapping stays. You can add the following to count the number of overlapping observations per stay:

                          Code:
                          * identify the number of overlapping observations per stay
                          bys pat_id stay (adate subid): gen stayN = _N
                          list pat_id adate ddate adateall ddateall stay*, noobs sepby(pat_id stay)

                          Comment


                          • #14
                            Thank you that is helpful.

                            Comment


                            • #15
                              Hey, I was wondering if you could help me. I was trying to adapt the above code so that it only applies to patients that have a true overlap and not for patients who have a readmission on the same day.
                              Below is an example:

                              Original Using the above method What I want
                              Patients Start End Start1 End2 Starts2 Ends2
                              A 01/02/2010 02/03/2010 01/02/2010 05/03/2010 01/02/2010 02/03/2010
                              A 02/03/2010 05/03/2010 01/02/2010 05/03/2010 02/03/2010 05/03/2010
                              A 05/03/2010 05/03/2010 01/02/2010 05/03/2010 05/03/2010 05/03/2010
                              B 04/02/2009 28/02/2009 04/02/2009 28/02/2009 04/02/2009 28/02/2009
                              B 06/02/2009 07/02/2009 04/02/2009 28/02/2009 04/02/2009 28/02/2009
                              B 08/02/2009 08/02/2009 04/02/2009 28/02/2009 04/02/2009 28/02/2009
                              C 03/03/2009 18/03/2009 03/03/2009 18/03/2009 03/03/2009 18/03/2009
                              D 05/05/2010 12/05/2012 05/05/2010 12/05/2012 05/05/2010 12/05/2012
                              D 02/05/2012 04/05/2012 05/05/2010 12/05/2012 05/05/2010 12/05/2012

                              Any help will be appreciated

                              Comment

                              Working...
                              X