How can I keep the earliest admission and latest discharge date for a hospitalisation record?

Sophia Med

Join Date: May 2023

Posts: 8
#1

How can I keep the earliest admission and latest discharge date for a hospitalisation record?

11 Mar 2024, 12:17

Hi, I have various individuals with a unique identifier (id) in a dataset with hospitalisation records. Some individuals have multiple hospitalisations; however, because of the nature of the health system, some individuals have the same admission date (date_admit) and different discharge date (date_disch) due to billing periods/records.

There are multiple cases of this but using id 3261 as an example (dataex below), individual 3261 is hospitalised on 14jun2012 with 5 records of this with 5 different discharge dates. I would like to change this so that individual 3261’s 14jun2012 admission record is a single one with the admission date as 14june2012 and the discharge date as the latest date of 10oct2012. The aim is to do this so that the billing periods do not affect the true admission and discharge dates. I would still like to keep all the other hospital records for id 3261.

I’m attempting to use dataex for the first time in my question so I hope this is correct and helps explain my data.

I use stata 15 for reference.

Thank you in advance!

My data looks like this:

Code:

* Example generated by -dataex-. To install: ssc install dataex * dataex id date_admit date_disch diag_p if id==3261 clear input long id int(date_admit date_disch) str4 diag_p 3261 19145 19158 "F29" 3261 19158 19174 "F29" 3261 19158 19205 "F29" 3261 19158 19236 "F29" 3261 19158 19266 "F29" 3261 19158 19276 "F29" 3261 20022 20026 "F29" 3261 20026 20057 "F200" 3261 20026 20060 "F200" 3261 20256 20265 "F29" 3261 20265 20275 "F200" end format %td date_admit format %td date_disch
Tags: None

Bruce Weaver

Join Date: May 2014
Posts: 1132

11 Mar 2024, 12:54

Hello Sophia Med. I'm not sure I completely understand what you want to do. My first thought was that you wanted to flag the records with the 1st and last admissions for a given admission date (within each unique ID). Something like this:

Code:

. sort id date_admit date_disch

. by id date_admit: generate byte firstadm = _n==1

. by id date_admit: generate byte lastdisch = _n==_N

. list, clean noobs

      id   date_ad~t   date_di~h   diag_p   firstadm   lastdi~h  
    3261   01jun2012   14jun2012      F29          1          1  
    3261   14jun2012   30jun2012      F29          1          0  
    3261   14jun2012   31jul2012      F29          0          0  
    3261   14jun2012   31aug2012      F29          0          0  
    3261   14jun2012   30sep2012      F29          0          0  
    3261   14jun2012   10oct2012      F29          0          1  
    3261   26oct2014   30oct2014      F29          1          1  
    3261   30oct2014   30nov2014     F200          1          0  
    3261   30oct2014   03dec2014     F200          0          1  
    3261   17jun2015   26jun2015      F29          1          1  
    3261   26jun2015   06jul2015     F200          1          1

But then I wondered if you really wanted to use -collapse-, like this:

Code:

. generate byte recs = 1

. collapse (last) date_disch (first) diag_p (count) recs, ///
> by(id date_admit)

. list, clean noobs

      id   date_ad~t   date_di~h   diag_p   recs  
    3261   01jun2012   14jun2012      F29      1  
    3261   14jun2012   10oct2012      F29      5  
    3261   26oct2014   30oct2014      F29      1  
    3261   30oct2014   03dec2014     F200      2  
    3261   17jun2015   26jun2015      F29      1  
    3261   26jun2015   06jul2015     F200      1

I hope one of these approaches is what you are looking for, or at least that they will help moved things along.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#3

11 Mar 2024, 13:41

The question in #1 is a bit hard to understand. In one place it speaks of producing a single record, and elsewhere it speaks of retaining all the original records. Confusing! So, I interpreted it differently from either of the interpretations in #2. I think that O.P. wants to replace all of the admission and discharge dates by the earliest and latest dates, respectively, in each observation for any given patient. So that would be:

Code:

by id date_admit (date_disch), sort: replace date_disch = date_disch[_N]

Any of the three solutions proposed here or in #2 could be a sensible thing to do, depending on where O.P. is going with this. So it's up to her to choose which one suits her purpose.
Comment
George Hoffman

Join Date: Mar 2014

Posts: 73
#4

11 Mar 2024, 14:08

i think there's a potential additional level of complexity. if the unique identifier is a hospitalization-level identifier, then the above approaches witll work. if the unique identifier, is patient-level, then more extensive logic is required. i suspect the OP would consider hospitalizations to be separate if
1 - its start was after the latest previous start, and
2 - start was after the last end from the latest previous start.

the above condtions might be combined to create a unique-hospitalization identifier; then the previous solutions might work...?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#5

11 Mar 2024, 16:12

Yes, #4 proposes yet another possible interpretation, although O.P. does seem to characterize the observations of a single admission as all having the same start date. Nevertheless, in case that's not true and one can truly have admissions nested within admissions:

Code:

gen `c(obs_t)' obs_no = _n reshape long date_, i(obs_no) j(event) string gsort id date_ -event by id (date): gen depth = sum((event == "admit") - (event == "disch")) by id (date): gen admit_num = sum(depth == 1 & inlist(depth[_n-1], 0, .)) gsort id admit_num date_ -event by id admit_num: gen start_date = date[1] by id admit_num: gen end_date = date[_N] format *_date %td drop depth reshape wide

This code is flexible and accommodates admissions that are nested in other admissions, or overlap without necessarily being coterminous at either end. At that point, all of the calculations of start and end dates are done. But then it remains to re-organize the data along the lines of the responses in #2 or #3 as desired. That largely reiterates the code shown in #2 and #3, applied to this revised data set and modified to work by admit_num rather than by admit_date.

Note: There is one edge case that is ambiguous. If somebody is discharged on a given day and then re-admitted on the same day, without a timestamp it is impossible to know if this constitutes two separate admissions or if a single admission is continuing. Both of these situations can arise in real life. This code treats this situation as two separate admissions (as instantiated in observations 1 and 2 of the example data in #1.)
Comment
Sophia Med

Join Date: May 2023

Posts: 8
#6

12 Mar 2024, 05:38

Hi All,

Thank you so much for the responses. Apologies for the lack of clarity. #2a was the closest to what I was thinking, though all solutions offered some insight into how to go about this and have helped me with the code overall. Thank you again!
1 like
Comment

Announcement