I am trying to import and append multiple .csv files - and add the individual file names as an ID variable. The code below was taken from various Google searches (apologies, I can't remember the exact source). The code runs and imports/appends the files correctly, and works for files with the first row as variable names. However, the first 13 rows of the .csv files I have are unwanted information, followed by the 14th row which is the variable names - and therefore the variable names do not get identified and all data is imported as string.
I have come unstuck trying to add in any additional commands to ignore the first X-number of rows of each file (13 rows in this case) - and also use the X-row as the variable names (14th row in this case).
It is possible to just drop these unwanted rows, rename the variables, and destring - but I'd like to see if there is a 'proper' way to do this.
Notes:
- The Excel .csv format I have uses a semi-colon (";") delimiter.
- I am using Stata v17.0.
Many thanks
I have come unstuck trying to add in any additional commands to ignore the first X-number of rows of each file (13 rows in this case) - and also use the X-row as the variable names (14th row in this case).
It is possible to just drop these unwanted rows, rename the variables, and destring - but I'd like to see if there is a 'proper' way to do this.
Notes:
- The Excel .csv format I have uses a semi-colon (";") delimiter.
- I am using Stata v17.0.
Code:
cd "C:\csv_files_to_append" local filepath = "`c(C:\csv_files_to_append)'" local files : dir "`filepath'" files "*.csv" tempfile name save `name', replace empty foreach x of local files { qui: import delimited "`x'", delimiter(";") case(preserve) clear qui: gen id = subinstr("`x'", ".csv", "", .) append using `name' save `name', replace }
Comment