I inherited the following code for a project and am trying to make it more efficient. There are 8 schools so both chunks of code below are repeated for the data for each of the 8 schools in the original do file. The excel files all have the data in the same sheet (Sheet1) and have the same columns, data format, etc. The goal is to import the data from all 8 schools and join them into a single dta file.
Ideally, I would like to accomplish both of these tasks in a single for loop. I would settle for doing each in a separate loop. To start out, I have created loop just for the first task of importing from the Excel files and saving into dta files. (I've shortened the directory path here but in my actual code, I use the full correct path.) The following works as expected. (Note: I edited my original post, I was having issues with this loop but figured it out.)
I am not sure what to do next to adjust this so that I can simply import all of the data and append in a single loop. I am fine with skipping the step of saving each imported excel file into its own dta file, if that step can be skipped. (If you provide code in your answer and can also include an explanation of why/how it works, I would be very appreciative, because I find Stata a little confusing, as someone who previously only used R.)
Code:
import excel "Data/Raw Data/Academic/[school name1].xlsx", sheet("Sheet1") firstrow case(lower) clear drop if studentfirstname=="" & studentlastname=="" save "Data/Working Data/Academic/Reading/[school name1].dta", replace use "Data/Working Data/Academic/[school name1].dta", clear append using "Data/Working Data/Academic/[school name2].dta"
Code:
local path ".../Data/Raw Data/Academic" local files: dir "`path'" files "*.xlsx" foreach file in `files' { clear import excel using "`path'/`file'", sheet("Sheet1") firstrow case(lower) drop if studentfirstname=="" & studentlastname=="" save "`path'/`file'.dta", replace }
Comment