Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating days admitted in hospital

    Hi,

    I need to calculate days admitted to hospital for the first 30 days after surgery. I do not want to include the actual day of surgery. Maximum stay is hence 30 days. The dataset comprises a little more than 1 million patients. I have provided a dataex with a few problematic example patients.

    ID is patient-id.
    SurgeryDate is date of surgery.
    InDate is date of admission to a hospital or ward within the same hospital
    OutDate is date of discharge from hospital or from a ward.

    Each ID could have several hospital/ward stays, so I need to sum these for each ID. My problem is that patients are often moved between wards (or hospitals) so the discharge date is often the same as the next observations admission date (for example ID 4, discharged 20apr2020, admitted 20apr2020). Another problem is that hospital stays are sometimes overlapping each other (ID 6 is a horrible example of this). Sometimes the day of surgery is not included in a hospital stay but is performed as day-care-surgery (ID 5).

    So for example, ID 7 should have a total of five days admitted: First observation 0 days - Surgery date, excluded. Second obs 1 day - 13 feb. Third obs 1 day - 14 feb. Fourth obs 3 days - 15-17 feb.

    I can sum the days of all hospital admissions, and subtract a day for the surgery date. But I am struggling to find a way that deals with combinations of overlapping stays and repeated admission dates. Any ideas?

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float ID int(SurgeryDate InDate OutDate)
    1 22261 22261 22261
    1 22261 22262 22263
    2 22597 22595 22601
    3 22806 22779 22806
    3 22806 22805 22812
    3 22806 22813 22827
    4 22025 22024 22025
    4 22025 22025 22028
    4 22025 22031 22040
    4 22025 22042 22042
    4 22025 22047 22047
    5 21671 21672 21672
    5 21671 21672 21674
    5 21671 21673 21673
    5 21671 21680 21687
    5 21671 21698 21701
    6 22151 22151 22152
    6 22151 22151 22153
    6 22151 22153 22154
    6 22151 22154 22159
    6 22151 22174 22175
    6 22151 22175 22178
    6 22151 22179 22181
    6 22151 22181 22181
    7 22688 22688 22688
    7 22688 22688 22689
    7 22688 22688 22690
    7 22688 22690 22693
    end
    format %td SurgeryDate
    format %td InDate
    format %td OutDate

  • #2
    Code:
    . // you only look till 30 days after surgery
    . gen CensoredOut = min(SurgeryDate+30, OutDate)
    
    .
    . // Find the last time within those 30 days that the patient left a ward
    . bysort ID (InDate OutDate) : egen FinalOut = max(CensoredOut)
    
    .
    . // the difference is the length of stay (los)
    . gen los = FinalOut - SurgeryDate
    
    .
    . // however some (e.g. patient 4) were sometimes not in a ward between
    . // surgery and the final discharge (presumably they were home)
    . bysort ID (InDate OutDate) : gen DaysHome = max(0,InDate[_n+1] - CensoredOut[_n]-1)
    
    .
    . // add up all the days home
    . bysort ID (InDate OutDate) : egen DaysHomeTot = total(DaysHome)
    
    .
    . // subtract the days home from los
    . replace los = los - DaysHomeTot
    (18 real changes made)
    
    .
    . // admire the result
    . format CensoredOut FinalOut %td
    
    . l, sepby(ID)
    
         +--------------------------------------------------------------------------------------------+
         | ID   Surgery~e      InDate     OutDate   Censore~t    FinalOut   los   DaysHome   DaysHo~t |
         |--------------------------------------------------------------------------------------------|
      1. |  1   12dec2020   12dec2020   12dec2020   12dec2020   14dec2020     2          0          0 |
      2. |  1   12dec2020   13dec2020   14dec2020   14dec2020   14dec2020     2          0          0 |
         |--------------------------------------------------------------------------------------------|
      3. |  2   13nov2021   11nov2021   17nov2021   17nov2021   17nov2021     4          0          0 |
         |--------------------------------------------------------------------------------------------|
      4. |  3   10jun2022   14may2022   10jun2022   10jun2022   01jul2022    21          0          0 |
      5. |  3   10jun2022   09jun2022   16jun2022   16jun2022   01jul2022    21          0          0 |
      6. |  3   10jun2022   17jun2022   01jul2022   01jul2022   01jul2022    21          0          0 |
         |--------------------------------------------------------------------------------------------|
      7. |  4   20apr2020   19apr2020   20apr2020   20apr2020   12may2020    15          0          7 |
      8. |  4   20apr2020   20apr2020   23apr2020   23apr2020   12may2020    15          2          7 |
      9. |  4   20apr2020   26apr2020   05may2020   05may2020   12may2020    15          1          7 |
     10. |  4   20apr2020   07may2020   07may2020   07may2020   12may2020    15          4          7 |
     11. |  4   20apr2020   12may2020   12may2020   12may2020   12may2020    15          0          7 |
         |--------------------------------------------------------------------------------------------|
     12. |  5   02may2019   03may2019   03may2019   03may2019   01jun2019    14          0         16 |
     13. |  5   02may2019   03may2019   05may2019   05may2019   01jun2019    14          0         16 |
     14. |  5   02may2019   04may2019   04may2019   04may2019   01jun2019    14          6         16 |
     15. |  5   02may2019   11may2019   18may2019   18may2019   01jun2019    14         10         16 |
     16. |  5   02may2019   29may2019   01jun2019   01jun2019   01jun2019    14          0         16 |
         |--------------------------------------------------------------------------------------------|
     17. |  6   24aug2020   24aug2020   25aug2020   25aug2020   23sep2020    16          0         14 |
     18. |  6   24aug2020   24aug2020   26aug2020   26aug2020   23sep2020    16          0         14 |
     19. |  6   24aug2020   26aug2020   27aug2020   27aug2020   23sep2020    16          0         14 |
     20. |  6   24aug2020   27aug2020   01sep2020   01sep2020   23sep2020    16         14         14 |
     21. |  6   24aug2020   16sep2020   17sep2020   17sep2020   23sep2020    16          0         14 |
     22. |  6   24aug2020   17sep2020   20sep2020   20sep2020   23sep2020    16          0         14 |
     23. |  6   24aug2020   21sep2020   23sep2020   23sep2020   23sep2020    16          0         14 |
     24. |  6   24aug2020   23sep2020   23sep2020   23sep2020   23sep2020    16          0         14 |
         |--------------------------------------------------------------------------------------------|
     25. |  7   12feb2022   12feb2022   12feb2022   12feb2022   17feb2022     5          0          0 |
     26. |  7   12feb2022   12feb2022   13feb2022   13feb2022   17feb2022     5          0          0 |
     27. |  7   12feb2022   12feb2022   14feb2022   14feb2022   17feb2022     5          0          0 |
     28. |  7   12feb2022   14feb2022   17feb2022   17feb2022   17feb2022     5          0          0 |
         +--------------------------------------------------------------------------------------------+
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    Comment


    • #3
      Originally posted by Maarten Buis View Post
      Code:
      . // you only look till 30 days after surgery
      . gen CensoredOut = min(SurgeryDate+30, OutDate)
      
      .
      . // Find the last time within those 30 days that the patient left a ward
      . bysort ID (InDate OutDate) : egen FinalOut = max(CensoredOut)
      
      .
      . // the difference is the length of stay (los)
      . gen los = FinalOut - SurgeryDate
      
      .
      . // however some (e.g. patient 4) were sometimes not in a ward between
      . // surgery and the final discharge (presumably they were home)
      . bysort ID (InDate OutDate) : gen DaysHome = max(0,InDate[_n+1] - CensoredOut[_n]-1)
      
      .
      . // add up all the days home
      . bysort ID (InDate OutDate) : egen DaysHomeTot = total(DaysHome)
      
      .
      . // subtract the days home from los
      . replace los = los - DaysHomeTot
      (18 real changes made)
      
      .
      . // admire the result
      . format CensoredOut FinalOut %td
      
      . l, sepby(ID)
      
      +--------------------------------------------------------------------------------------------+
      | ID Surgery~e InDate OutDate Censore~t FinalOut los DaysHome DaysHo~t |
      |--------------------------------------------------------------------------------------------|
      1. | 1 12dec2020 12dec2020 12dec2020 12dec2020 14dec2020 2 0 0 |
      2. | 1 12dec2020 13dec2020 14dec2020 14dec2020 14dec2020 2 0 0 |
      |--------------------------------------------------------------------------------------------|
      3. | 2 13nov2021 11nov2021 17nov2021 17nov2021 17nov2021 4 0 0 |
      |--------------------------------------------------------------------------------------------|
      4. | 3 10jun2022 14may2022 10jun2022 10jun2022 01jul2022 21 0 0 |
      5. | 3 10jun2022 09jun2022 16jun2022 16jun2022 01jul2022 21 0 0 |
      6. | 3 10jun2022 17jun2022 01jul2022 01jul2022 01jul2022 21 0 0 |
      |--------------------------------------------------------------------------------------------|
      7. | 4 20apr2020 19apr2020 20apr2020 20apr2020 12may2020 15 0 7 |
      8. | 4 20apr2020 20apr2020 23apr2020 23apr2020 12may2020 15 2 7 |
      9. | 4 20apr2020 26apr2020 05may2020 05may2020 12may2020 15 1 7 |
      10. | 4 20apr2020 07may2020 07may2020 07may2020 12may2020 15 4 7 |
      11. | 4 20apr2020 12may2020 12may2020 12may2020 12may2020 15 0 7 |
      |--------------------------------------------------------------------------------------------|
      12. | 5 02may2019 03may2019 03may2019 03may2019 01jun2019 14 0 16 |
      13. | 5 02may2019 03may2019 05may2019 05may2019 01jun2019 14 0 16 |
      14. | 5 02may2019 04may2019 04may2019 04may2019 01jun2019 14 6 16 |
      15. | 5 02may2019 11may2019 18may2019 18may2019 01jun2019 14 10 16 |
      16. | 5 02may2019 29may2019 01jun2019 01jun2019 01jun2019 14 0 16 |
      |--------------------------------------------------------------------------------------------|
      17. | 6 24aug2020 24aug2020 25aug2020 25aug2020 23sep2020 16 0 14 |
      18. | 6 24aug2020 24aug2020 26aug2020 26aug2020 23sep2020 16 0 14 |
      19. | 6 24aug2020 26aug2020 27aug2020 27aug2020 23sep2020 16 0 14 |
      20. | 6 24aug2020 27aug2020 01sep2020 01sep2020 23sep2020 16 14 14 |
      21. | 6 24aug2020 16sep2020 17sep2020 17sep2020 23sep2020 16 0 14 |
      22. | 6 24aug2020 17sep2020 20sep2020 20sep2020 23sep2020 16 0 14 |
      23. | 6 24aug2020 21sep2020 23sep2020 23sep2020 23sep2020 16 0 14 |
      24. | 6 24aug2020 23sep2020 23sep2020 23sep2020 23sep2020 16 0 14 |
      |--------------------------------------------------------------------------------------------|
      25. | 7 12feb2022 12feb2022 12feb2022 12feb2022 17feb2022 5 0 0 |
      26. | 7 12feb2022 12feb2022 13feb2022 13feb2022 17feb2022 5 0 0 |
      27. | 7 12feb2022 12feb2022 14feb2022 14feb2022 17feb2022 5 0 0 |
      28. | 7 12feb2022 14feb2022 17feb2022 17feb2022 17feb2022 5 0 0 |
      +--------------------------------------------------------------------------------------------+
      Thank you Maarten! However one problem still exists. Consider ID #5. The correct LOS should be 15 days, not 14 days.

      Observation #12 = 1 day at the hospital
      Observation #13 = 2 additional days at the hospital (3rd of may already accounted for in obs #12)
      Observation #14 = 0 additional days at the hospital (4th of may already accounted for in obs #13)
      Observation #15 = 8 additional days at the hospital
      Observation #16 = 4 additional days at the hospital
      Sum = 15 days



      ID #5 underwent surgery 2nd of may (presumably day-care surgery), he was not admitted to the hospital until the day after surgery. The problem is that observation #14 is already accounted for in the previous observation (#13). Why this phenomena exists, I do not know, perhaps double-registrations or the patient was admitted for a intervention at another hospital but kept his/hers bed at the primary hospital thus was registered at two places at the same time.

      Comment


      • #4
        Ok I think I have a working code. It uses loops and expand, so it is tedious and will take a while to complete in the large dataset, but it works. I´ll like you code better Maarten but the problem with overlapping dates. If anyone has a faster idea, you are very welcome.


        Code:
        ** CREATE 30 OBSERVATIONS FOR EACH ID, TO BE LATER FILLED IF PATIENT ADMITTED IN HOSPITAL
        bysort ID (InDate OutDate): gen last=_n==_N
        gen DATE =.
        gen copy =.
        levelsof ID, local(AllID)
        foreach id of local AllID {
            expand 31 if ID== `id' & last==1, gen(copy`id')
            replace copy = 1 if copy`id'==1
            replace InDate =. if copy`id'==1
            replace OutDate =. if copy`id'==1
            drop copy`id'
        }
        drop last
        
        ** SEQUENTIALLY ADD DATES FOR EXPANDED OBSERVATIONS
        replace DATE = SurgeryDate 
        bysort ID (InDate OutDate): replace DATE = (DATE[_n-1])+1 if copy==1
        format DATE %td
        sort ID copy DATE InDate OutDate
        bysort ID (DATE InDate OutDate ): gen obs_ =_n 
        
        ** ADD VARIABLES THAT CORRESPOND TO InDate and OutDate
        su obs_ 
        local MAX = r(max)
        forval i = 1/`MAX'{
            gen InDate_`i'=.
            gen OutDate_`i'=.
            format InDate_`i' %td
            format OutDate_`i'  %td
        }
        
        ** TRANSFER InDate AND OutDate TO THE NEW VARIABLES
        levelsof ID, local (AllID)
        foreach id of local AllID {
            su obs_ if ID == `id'
            local max=r(max)
            forval j = 1/ `max' {
                replace InDate_`j' = InDate if obs_==`j'
                replace OutDate_`j' = OutDate if obs_==`j'
            }
        }
        
        ** CARRYFORWARD 
        bysort ID (obs_): carryforward InDate_* OutDate_*, replace
        gsort ID -obs_
        by ID: carryforward InDate_* OutDate_*, replace
        
        
        ** CREATE INDICATOR-DUMMY IF PATIENT ADMITTED IN HOSPITAL RESPECTIVE DAY
        sort ID obs_ InDate OutDate
        gen InHospDay =.
        order ID obs_ SurgeryDate InDate OutDate DATE copy InHospDay
        
        
        ** 1 IF PATIENT ADMITTED, 0 OTHERWISE FOR THE RESPECTIVE DATES
        su obs_ 
        local MAX = r(max)
        forval i = 1/`MAX'{
            replace InHospDay = 1 if inrange(DATE, InDate_`i' , OutDate_`i' ) &  InDate_`i'!=. &  OutDate_`i'!=.
        }
        
        ** DATE OF SURGERY IS NOT COUNTED
        replace InHospDay =. if DATE == SurgeryDate
        
        
        ** SUM ALL IN-HOSPITAL DAYS
        bysort ID (InDate OutDate): egen DAYSINHOSP = total (InHospDay)
        
        
        ** CLEAN UP
        su obs_ 
        local MAX = r(max)
        forval i = 1/`MAX'{
            drop  InDate_`i'
            drop OutDate_`i'
        }
        bysort ID: keep if _n==_N
        keep ID DAYSINHOSP

        Comment


        • #5
          What about this?

          Code:
          * Generate a level-2 sequence within each ID:
          bysort ID: generate data_line = _n
          
          * Calcualte total length within each line:
          generate temp_days = OutDate - InDate + 1
          
          * Expand the data per line
          expand temp_days
          gsort ID data_line
          
          * Within each expanded line (now it's a day a line),
          * Compute a mini-time series
          bysort ID data_line: generate mini_series = InDate + (_n - 1)
          format mini_series %tdNN-DD-CCYY
          
          * Drop duplicated dates within each patient
          duplicates drop ID mini_series, force
          
          * Keep only days that are w/i 30 days of surgery
          keep if (mini_series - SurgeryDate > 0) &  (mini_series - SurgeryDate < 30)
          
          * Sum up total number of unique days
          collapse (count) total_stay = data_line, by(ID)
          Results:
          Code:
               +---------------+
               | ID   total_~y |
               |---------------|
            1. |  1          2 |
            2. |  2          4 |
            3. |  3         21 |
            4. |  4         15 |
            5. |  5         15 |
            6. |  6         16 |
            7. |  7          5 |
               +---------------+
          Last edited by Ken Chui; 28 Feb 2025, 10:57.

          Comment


          • #6
            Here's a way to do it:

            Code:
            forval i = 1/30 {
                gen byte this_day = inrange(SurgeryDate + `i', InDate, OutDate)
                egen byte day_`i' = max(this_day), by(ID SurgeryDate)
                drop this_day
            }
            
            egen byte wanted = rowtotal(day_*)
            
            drop day_*
            and then you can
            Code:
            bysort ID SurgeryDate: keep if _n == 1
            list ID SurgeryDate wanted, noobs sep(0) abbrev(11)
            
              +---------------------------+
              | ID   SurgeryDate   wanted |
              |---------------------------|
              |  1     12dec2020        2 |
              |  2     13nov2021        4 |
              |  3     10jun2022       21 |
              |  4     20apr2020       15 |
              |  5     02may2019       15 |
              |  6     24aug2020       16 |
              |  7     12feb2022        5 |
              +---------------------------+
            Last edited by Hemanshu Kumar; 28 Feb 2025, 10:42.

            Comment


            • #7
              Thanks all, both Kens and Hemansus codes are magnitues faster than mine and produces the same results. Once again saved by Statalist, many thanks!

              Comment

              Working...
              X