Hi. My code below creates one single dataset by appending sheets from four excel workbooks where each has 31 sheets. When I check the number of entries (tab hospital below) in the final dataset, I notice some anomalies. The KAR workbook has 29,117 entries, but the final dataset has only 29,099 observations. How can I:
1. identify what is happening here, i.e., why are some excel cells not carrying over as observations in stata?
2. which excel cells are not carrying over as observations in stata?
3. for the one observation that has no hospital listed against it, is there anyway for me to identify in which excel sheet this observation exists?
Any help will be very much appreciated
1. identify what is happening here, i.e., why are some excel cells not carrying over as observations in stata?
2. which excel cells are not carrying over as observations in stata?
3. for the one observation that has no hospital listed against it, is there anyway for me to identify in which excel sheet this observation exists?
Any help will be very much appreciated
Code:
tab hospital Hospital | Freq. Percent Cum. ------------+----------------------------------- | 1 0.00 0.00 BBI | 4,436 9.47 9.47 GMRV | 4,282 9.14 18.61 KAR | 29,099 62.11 80.72 KVC | 9,035 19.28 100.00 ------------+----------------------------------- Total | 46,853 100.00 _____________code clear tempfile building save `building', emptyok foreach f in "KAR" "GMRV" "MTC" "KVC" { import excel using "${db}\raw data\OC_Jan22_Jul24_`f'.xlsx", describe local n_sheets `r(N_worksheet)' *return list local n_sheets `r(N_worksheet)' forvalues j = 1/`n_sheets' { local sheet`j' `r(worksheet_`j')' } forvalues j = 1/`n_sheets' { import excel using "${db}\raw data\OC_Jan22_Jul24_`f'.xlsx", sheet(`"`sheet`j''"') firstrow case(lower) clear gen sheet = `j' append using `building' save `"`building'"', replace } sort sheet *drop DoctorWorkup Workup Dilation PostDilation ReadyforConsultant *drop if mrno=="KAR" *generating an age and sex var from the combined var rename age Age split Age, gen(age) parse("/") replace age1 = "1" if ustrregexm(age1, "[0-9]{1,2} Mo.") destring age1, replace rename age1 age rename age2 sex *drop Age *splitting combined date-time var gen double appt_dttm = clock(appttime, "DMYhm") *assert missing(appt_dttm) == missing(appttime) format appt_dttm %tc gen appt_date = dofc(appt_dttm) format appt_date %td gen double appt_time_of_day = appt_dttm - cofd(appt_date) format appt_time_of_day %tcHH:MM *month and year from date gen year = year(appt_date) gen month = month(appt_date) *replace missing values with . rename district District rename state State encode District, gen(district) encode State, gen(state) drop District State appt_dttm save "${db}\clean data\emr_`f'.dta", replace clear } ___________sample data input str4 hospital str16 mrno str55 location str4 department str9 age str1 sex float appt_date double appt_time_of_day float(year month) long(district state) "KAR" "P1331797" "" "OCU" "22" "F" 22660 28800000 2022 1 366 21 "KAR" "P983268" "" "UVA" "44" "M" 22669 29700000 2022 1 459 32 "KAR" "P1354760" "" "PED" "6" "M" 22672 35400000 2022 1 235 29 "KAR" "P1363931" "" "COR" "69" "M" 22648 47700000 2022 1 459 32 "BBI" "MTC-P324302" "Madanpur" "PED" "59" "F" 22665 52500000 2022 1 395 26 "BBI" "MTC-P306148" "Ghantua" "IVR" "3" "F" 22670 43200000 2022 1 44 26 "KVC" "KVC-P117086" "Modekurru" "RET" "49" "F" 22662 35100000 2022 1 170 2 "KVC" "CHP-NP29556" "Kunchanapalle" "OCU" "58" "F" 22646 37800000 2022 1 206 2 "KVC" "KVC-P203284" "Guntur" "OCU" "45" "M" 22667 41400000 2022 1 206 2 "GMRV" "GMRV-N44504" "Ravikamatham" "COR" "58" "F" 22674 36000000 2022 1 570 2 "KVC" "KVC-NP67821" "Mudinepalli" "RET" "68" "M" 22655 31500000 2022 1 321 2 "BBI" "MTC-N57088" "Alatara" "RET" "53" "M" 22669 33300000 2022 1 271 26 "KAR" "KVC-P202085" "" "COR" "52" "F" 22650 53100000 2022 1 321 2 "KAR" "N440996" "" "GLA" "9" "M" 22657 29700000 2022 1 509 21 "KAR" "P1007960" "" "GLA" "75" "M" 22672 38700000 2022 1 321 2 "KAR" "P1294321" "" "GLA" "77" "M" 22657 34200000 2022 1 573 32 "KAR" "P1389977" "" "COR" "48" "F" 22649 38100000 2022 1 . . "KAR" "N336487" "" "RET" "23" "M" 22646 36900000 2022 1 285 32 "KVC" "KVC-P202297" "Mangalagiri" "COR" "60" "M" 22670 43200000 2022 1 206 2 "KVC" "KVC-NP67609" "Krishnalanka" "OCU" "54" "F" 22660 39600000 2022 1 321 2 "KAR" "PN1376368" "" "COR" "21" "F" 22656 34200000 2022 1 573 32 "GMRV" "GMRV-P39106" "Rajahmundry" "PED" "21" "M" 22662 45600000 2022 1 170 2 "BBI" "MTC-P264925" "Raipur" "COR" "62" "F" 22662 55800000 2022 1 78 26 "KAR" "P1392029" "" "COR" "57" "M" 22666 35100000 2022 1 127 36 "KAR" "P1036367" "" "GLA" "52" "F" 22658 34200000 2022 1 170 2 "KAR" "P1305697" "" "PED" "8" "F" 22649 55200000 2022 1 321 2 "KAR" "P1391661" "" "RET" "45" "F" 22655 41400000 2022 1 . . "KAR" "P1364004" "" "OCU" "16" "F" 22646 31500000 2022 1 . . "KVC" "KVC-N68172" "rachakonda" "COR" "51" "M" 22673 32100000 2022 1 436 2 "KAR" "P1272562" "" "GLA" "57" "F" 22650 33300000 2022 1 559 36 "KVC" "KND-P27626" "Kanakanampadu" "OCU" "59" "M" 22663 48600000 2022 1 436 2 "KVC" "KND-P14774" "Gangapalem" "RET" "67" "M" 22653 35100000 2022 1 436 2 "KAR" "N357867" "" "COR" "53" "M" 22665 51000000 2022 1 383 32 "KAR" "P1389916" "" "RET" "19" "M" 22663 35100000 2022 1 38 18 "KAR" "P1050999" "" "GLA" "80" "M" 22676 33300000 2022 1 226 32 "KAR" "P1227482" "" "COR" "50" "M" 22663 32400000 2022 1 231 20 "BBI" "MTC-P289397" "Mancheswar" "IVR" "3" "M" 22674 42300000 2022 1 302 26 "BBI" "MTC-P323939" "Kolkata" "RET" "49" "F" 22651 42900000 2022 1 . . "KAR" "P1278372" "" "PED" "30" "M" 22672 33300000 2022 1 383 32 "BBI" "MTC-P318686" "Kaligadu" "OCU" "8" "M" 22651 36000000 2022 1 276 26 "KAR" "P1392119" "" "COR" "71" "F" 22657 52200000 2022 1 226 32
Comment