Hello everyone.
I need your help with a topic I am working on. I currently have a database that contains information about higher education in Ecuador. Specifically, this data contains the graduates in higher education. Some variables indicate the identifier of each person, the university where he/she studied, the career he/she studied and the graduation dates, among other variables about the person's characteristics or the university. I need to convert this data to a horizontal format for data processing purposes, where each observation is a single person. However, the original database has more than one observation for each person because there are cases of individuals with more than one university degree recorded. There are also cases where their postgraduate or doctorate degrees are recorded and are reflected as one more observation. My question is the following:
Is there any way to convert to a horizontal format all this data without losing information on each individual?
I have tried the "reshape wide" command, however it is only useful when I have a single observation per individual. Any suggestions would be really helpful and I would appreciate it. I am attaching an example of the database, as it is confidential information I am not allowed to sample the data. Regards
I need your help with a topic I am working on. I currently have a database that contains information about higher education in Ecuador. Specifically, this data contains the graduates in higher education. Some variables indicate the identifier of each person, the university where he/she studied, the career he/she studied and the graduation dates, among other variables about the person's characteristics or the university. I need to convert this data to a horizontal format for data processing purposes, where each observation is a single person. However, the original database has more than one observation for each person because there are cases of individuals with more than one university degree recorded. There are also cases where their postgraduate or doctorate degrees are recorded and are reflected as one more observation. My question is the following:
Is there any way to convert to a horizontal format all this data without losing information on each individual?
I have tried the "reshape wide" command, however it is only useful when I have a single observation per individual. Any suggestions would be really helpful and I would appreciate it. I am attaching an example of the database, as it is confidential information I am not allowed to sample the data. Regards
Code:
* Example generated by -dataex-. For more info, type help dataex clear input float(id_number num_observations) str8 career_name str12 university_name float(graduation_date degree_note) str7 level_education 1 5 "Career_2" "University_0" 744 3 "level_1" 1 5 "Career_9" "University_6" 751 3 "level_2" 1 5 "Career_9" "University_1" 753 1 "level_1" 1 5 "Career_8" "University_7" 728 3 "level_2" 1 5 "Career_8" "University_5" 750 2 "level_2" 2 5 "Career_7" "University_5" 755 1 "level_1" 2 5 "Career_8" "University_9" 726 3 "level_2" 2 5 "Career_6" "University_9" 730 1 "level_3" 2 5 "Career_0" "University_7" 746 4 "level_2" 2 5 "Career_0" "University_3" 751 5 "level_1" 3 5 "Career_2" "University_0" 753 4 "level_2" 3 5 "Career_7" "University_2" 739 3 "level_1" 3 5 "Career_0" "University_0" 724 1 "level_2" 3 5 "Career_2" "University_8" 727 4 "level_2" 3 5 "Career_1" "University_2" 734 0 "level_2" 4 3 "Career_9" "University_8" 730 1 "level_3" 4 3 "Career_5" "University_4" 735 4 "level_2" 4 3 "Career_6" "University_1" 732 3 "level_1" 5 3 "Career_0" "University_7" 753 5 "level_1" 5 3 "Career_9" "University_9" 730 0 "level_2" end
Comment