Dear statalists,
I have a number of Excel files in which the column headers contain the initials of the professor who gave each lesson (one column per lesson). The initials usually repeat because each professor gives several lessons.
The files have a structure such as this:
student_id | GH | JG | GH | JO | JG |
1 | 9 | 14 | 13 | 17 | 18 |
2 | 5 | 4 | 13 | 14 | 18 |
3 | 12 | 15 | 17 | 14 | 15 |
Each row is a student (unique identifier).
Each cell is the grade that the professor gave that student in that lesson.
So my aim is to end up with a dataset in a long format, such as this:
student_id | prof | grade |
1 | GH | 9 |
1 | JG | 14 |
1 | GH | 13 |
1 | JO | 17 |
1 | JG | 18 |
2 | GH | 5 |
(... and so on)
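To make the target concrete, here is a minimal sketch of the transformation in plain Python (not Stata). It assumes the sheet has already been read into a list of rows with the header row kept as ordinary data; reading the Excel file itself is not shown. The function name `wide_to_long` is hypothetical and only for illustration.

```python
def wide_to_long(rows):
    """Turn [header, row1, row2, ...] into (student_id, prof, grade) tuples."""
    header, *data = rows
    profs = header[1:]  # initials may repeat; column position identifies the lesson
    long_rows = []
    for row in data:
        student_id, grades = row[0], row[1:]
        # zip pairs header and grade by position, so duplicate initials are harmless
        for prof, grade in zip(profs, grades):
            long_rows.append((student_id, prof, grade))
    return long_rows


# The example table from above, with the header kept as the first data row
rows = [
    ["student_id", "GH", "JG", "GH", "JO", "JG"],
    [1, 9, 14, 13, 17, 18],
    [2, 5, 4, 13, 14, 18],
    [3, 12, 15, 17, 14, 15],
]

long_rows = wide_to_long(rows)
for record in long_rows[:6]:
    print(record)
```

The point of the sketch is that the initials are treated as values rather than as column names, which is why the repeats do not matter.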
I just can't figure out how to do this in Stata. The problem is that if I import the Excel file directly, it doesn't work because the professors' initials repeat across columns, so I end up with several unnamed variables.
Can someone please shed some light here? Do you think I should learn a different language for this kind of task?
Thank you in advance,