Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to expand observations for a duration of time

    Hi all,

    Below is my code, there are a few problems with it. I would appreciate any advice! I am new to STATA (this is my first project!)

    Overall, I am trying to show hospital arrivals (num_edtc) and discharges (num_dcord) during each hour of a physician's shift. Each physician is represented by phys_name and the shift start/end date/time is num_shift_start1 and num_shift_end1, respectively. Here is what I am struggling with:
    1. The code is counting the number of hours within a shift and expanding the dataset by those hours, however the start/end time for their shift is not being broken down within that. Ex: A physician works for 9 hours, there are now 9 observations for an initial 1. The start and end times are the same as the initial observation, rather than counting by an hour each.
      1. I have tried adding num_shift_start1 + 3,600,000 * (admit_id - 1)
      2. I think I could create a loop where I add 3,600,000 each time and loop for the number of hours, I'm not sure how to update the start/end times in the dataset with this. Is there a less clunky way?
    2. Towards the end, it is showing the sum of all patients in the hospital, rather than the patients who were assigned to a particular physician. I tried to resolve this via bysort. If my problem above is solved, I think I can run a conditional to check for a patient and create a dummy variable.
    3. If there is a simpler way to approach this, I am open to suggestions!

    Code:
    import excel "/Users/amandabuechele/Desktop/RA Test 2022/#3.xls", sheet("#3") firstrow allstring
    
    ***Combine shift start/end date/time
    **# Bookmark #1
    generate shift_start = shiftstart_date + " " + shiftstart_time
    gen shiftend_date = shfitend_date
    generate shift_end = shiftend_date + " " + shiftend_time
    
    ***Convert string date/time variables to numeric
    generate double num_edtc = clock(edtc, "MDYhm")
    generate double num_dcord = clock(dcord_tc, "MDYhm")
    gen double num_shift_start1 = clock(shift_start, "DM19Yhms")
    gen double num_shift_end1 = clock(shift_end, "DM19Yhms")
    
    format %tcMonth_dd,_CCYY_HH:MM:SS num_edtc
    format %tcMonth_dd,_CCYY_HH:MM:SS num_dcord
    format %tcMonth_dd,_CCYY_HH:MM:SS num_shift_start1
    format %tcMonth_dd,_CCYY_HH:MM:SS num_shift_end1
    
    
    
    ***Create admission ID
    gen admit_id = _n
    
    ***Create hourly dataset PROBLEM IS COUNTING HOURS AND EXPANDING BY HOURS BUT NOT REPLACING START/END
    gen hours = cond(hh(num_shift_start1) < hh(num_shift_end1), hh(num_shift_end1) - hh(num_shift_start1), hh(24 - (num_shift_start1)) +  hh(num_shift_end1))
    expand hours
    sort phys_name num_shift_start1 num_shift_end1
    
    
    *** PROBLEM IS NOT BREAKING DOWN SHIFT HOUR BY HOUR
    bysort phys_name admit_id (num_shift_start1 num_shift_end1): gen double start = num_shift_start1  
    format %tcMonth_dd,_CCYY_HH:MM:SS start
    bysort phys_name admit_id (num_shift_start1 num_shift_end1): gen double end = num_shift_end1
    format %tcMonth_dd,_CCYY_HH:MM:SS end
    format start end %tc
    drop num_shift_start1 num_shift_end1 
    
    
    ***Check if patient is in hospital 
    bysort phys_name: gen in_hospital = (num_edtc <= start & num_dcord <= end) | (num_edtc >= start & num_edtc <= end) | (num_edtc <= start & num_dcord >= end)
    
    ***Sum patients per hour in hospital PROBLEM IS SUM OF ALL PATIENTS IN HOSPITAL NOT BY PHYSICIAN
    collapse (sum) in_hospital, by(phys_name start end)
    list, sepby(phys_name)

  • #2
    This is a complex post with no data examples to be able to replicate.

    For your problem # 1 this should give you an idea you can apply to your code:

    Code:
    clear
    set obs 1
    g id=1
    g initial=1
    g nperiods=5
    
    expand nperiods
    bys id: g period=initial+_n-1

    Comment

    Working...
    X