Hello,
I have an .xlsx file that I want to import into Stata. The file has 50 sheets, one per U.S. state. In every sheet, the first column lists the variables and the remaining columns hold their annual values (after dropping some rows, such as the current rows 2, 5, 6, and 7). Please see the attached screenshot.
[Screenshot of one sheet from the Excel file]
The years available vary by state: some states have data from 1949 onward, while others start in 1952. Some states also have missing years.
What I want to do:
- Import all 50 sheets into Stata, perhaps as tempfiles.
- Use the current row 4 as variable names, one per year, for example _1949, _1950, _1951, etc.
- Reshape from wide to long. Before that, I will edit the values in row # 1 so that they are valid variable names.
- Merge the 50 tempfiles.
- Is there a way to write a loop that imports all the sheets without typing the sheet names in the loop header, that is, to capture the sheet names automatically? I will be doing this for multiple Excel files, some of which may not include all the U.S. states, so capturing only the sheets that exist would be helpful.
- As mentioned, for each state/sheet, within the loop, I want to use the values in row 4 as variable names (_1949, _1950, _1951, etc.). For this, I found the following loop on this website. But since the years vary across states and some states have missing years, how can I pick up only the columns that exist and automate this without specifying the column names?
Code:
foreach j in B-BX {
    local try = strtoname(`j'[1])
    capture rename _var`j' `try'
}
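For reference, here is a rough sketch of the kind of loop I have in mind. It has not been tested on the actual file. It relies on import excel with the describe option, which returns the number of worksheets in r(N_worksheet) and their names in r(worksheet_#). The file name states.xlsx, the variable name state, and the exact rows dropped are assumptions for illustration. Reading everything as strings (allstring) is my choice, so that row 4 can be read as text and the sheets append without type conflicts.

```stata
* Hypothetical file name; replace with the actual workbook.
local xlsx "states.xlsx"

* Capture the sheet names without typing them.
import excel using "`xlsx'", describe
local nsheets = r(N_worksheet)
forvalues s = 1/`nsheets' {
    local sheet`s' `"`r(worksheet_`s')'"'
}

tempfile combined
forvalues s = 1/`nsheets' {
    * Without firstrow, the columns are named A, B, C, ...
    import excel using "`xlsx'", sheet(`"`sheet`s''"') allstring clear

    * Rename only the columns whose row 4 looks like a four-digit year,
    * so differing start years and gaps need no hard-coded column range.
    foreach v of varlist _all {
        local yr = strtrim(`v'[4])
        if regexm("`yr'", "^[0-9][0-9][0-9][0-9]$") {
            rename `v' _`yr'
        }
    }

    * Assumption: rows 2, 5, 6, 7 are unwanted, and row 4 is no longer
    * needed once it has been used for the names.
    drop if inlist(_n, 2, 4, 5, 6, 7)

    * Keep track of which sheet each row came from.
    generate state = `"`sheet`s''"'

    * Stack this sheet on top of the ones already processed.
    if `s' > 1 {
        append using `combined'
    }
    save `combined', replace
}
```

Since the year columns are matched by name when appending, a state that starts in 1952 would simply have missing values for _1949 to _1951. The numeric values would still need destring afterwards, and the reshape step is left out here.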