Hi all,
Below is my code, there are a few problems with it. I would appreciate any advice! I am new to STATA (this is my first project!)
Overall, I am trying to show hospital arrivals (num_edtc) and discharges (num_dcord) during each hour of a physician's shift. Each physician is represented by phys_name and the shift start/end date/time is num_shift_start1 and num_shift_end1, respectively. Here is what I am struggling with:
Below is my code, there are a few problems with it. I would appreciate any advice! I am new to STATA (this is my first project!)
Overall, I am trying to show hospital arrivals (num_edtc) and discharges (num_dcord) during each hour of a physician's shift. Each physician is represented by phys_name and the shift start/end date/time is num_shift_start1 and num_shift_end1, respectively. Here is what I am struggling with:
- The code is counting the number of hours within a shift and expanding the dataset by those hours, however the start/end time for their shift is not being broken down within that. Ex: A physician works for 9 hours, there are now 9 observations for an initial 1. The start and end times are the same as the initial observation, rather than counting by an hour each.
- I have tried adding num_shift_start1 + 3,600,000 * (admit_id - 1)
- I think I could create a loop where I add 3,600,000 each time and loop for the number of hours, I'm not sure how to update the start/end times in the dataset with this. Is there a less clunky way?
- Towards the end, it is showing the sum of all patients in the hospital, rather than the patients who were assigned to a particular physician. I tried to resolve this via bysort. If my problem above is solved, I think I can run a conditional to check for a patient and create a dummy variable.
- If there is a simpler way to approach this, I am open to suggestions!
Code:
import excel "/Users/amandabuechele/Desktop/RA Test 2022/#3.xls", sheet("#3") firstrow allstring ***Combine shift start/end date/time **# Bookmark #1 generate shift_start = shiftstart_date + " " + shiftstart_time gen shiftend_date = shfitend_date generate shift_end = shiftend_date + " " + shiftend_time ***Convert string date/time variables to numeric generate double num_edtc = clock(edtc, "MDYhm") generate double num_dcord = clock(dcord_tc, "MDYhm") gen double num_shift_start1 = clock(shift_start, "DM19Yhms") gen double num_shift_end1 = clock(shift_end, "DM19Yhms") format %tcMonth_dd,_CCYY_HH:MM:SS num_edtc format %tcMonth_dd,_CCYY_HH:MM:SS num_dcord format %tcMonth_dd,_CCYY_HH:MM:SS num_shift_start1 format %tcMonth_dd,_CCYY_HH:MM:SS num_shift_end1 ***Create admission ID gen admit_id = _n ***Create hourly dataset PROBLEM IS COUNTING HOURS AND EXPANDING BY HOURS BUT NOT REPLACING START/END gen hours = cond(hh(num_shift_start1) < hh(num_shift_end1), hh(num_shift_end1) - hh(num_shift_start1), hh(24 - (num_shift_start1)) + hh(num_shift_end1)) expand hours sort phys_name num_shift_start1 num_shift_end1 *** PROBLEM IS NOT BREAKING DOWN SHIFT HOUR BY HOUR bysort phys_name admit_id (num_shift_start1 num_shift_end1): gen double start = num_shift_start1 format %tcMonth_dd,_CCYY_HH:MM:SS start bysort phys_name admit_id (num_shift_start1 num_shift_end1): gen double end = num_shift_end1 format %tcMonth_dd,_CCYY_HH:MM:SS end format start end %tc drop num_shift_start1 num_shift_end1 ***Check if patient is in hospital bysort phys_name: gen in_hospital = (num_edtc <= start & num_dcord <= end) | (num_edtc >= start & num_edtc <= end) | (num_edtc <= start & num_dcord >= end) ***Sum patients per hour in hospital PROBLEM IS SUM OF ALL PATIENTS IN HOSPITAL NOT BY PHYSICIAN collapse (sum) in_hospital, by(phys_name start end) list, sepby(phys_name)
Comment