I have been sent some data in an Excel spreadsheet that is formatted in an unfriendly way for importing to Stata. It uses merged cells to categorise repeating column names, so that a screenshot is, for once, probably the best way to show sample data:
[Screenshot: Screen Shot 2022-04-02 at 10.06.41 am.png]
The actual data has many more groups and 10 variables within each group. My problem is how to keep the grouping and column information when importing each variable into Stata. When I import the data using the options
Code:
cellrange(A2) firstrow
, Group B's variables become "F", "G" etc. so as not to duplicate the variable names from Group A.
Is there some clever way after the import to generate variable names A_Age, A_Sex, ..., B_Age, B_Sex, etc.? For reference, the .csv of the example data is:
Code:
Group A,,,,,Group B,,,,,Group C,,,,
Age,Sex,Fed,Score,Total,Age,Sex,Fed,Score,Total,Age,Sex,Fed,Score,Total
10,M,AUS,5,5,5,F,AUS,5,4,8,M,AUS,2,2