Hi all,
I understand this is a lengthy question, but any help or guidance would be much appreciated. I am happy to provide any clarity if needed.
Background
I am trying to run a for loop to automate the following process. I have panel data on university applicants, and I'm trying to determine the following information for each of the individual universities:
- Number of applicants to each individual university
- Number of applicants that received an offer to each individual university
- Each university's offer rate (number of applicants that received offers/number of applicants that applied)
- The acceptance rate of students that receive an offer to each university (number of applicants that attended said university/number of applicants that received offers to said university)
Data Explanation
Prospective students can apply to up to 10 schools. Their choices are stored in the variables UniversityChoice1 (first choice) through UniversityChoice10 (tenth choice). For each choice I also know whether they received an offer: OfferToChoice1 records whether they received an offer from their first choice, and so on through OfferToChoice10 for their tenth choice. Finally, I have data on which university's offer they accepted, stored in the variable RegisteredUniversity.
(UniversityACode) denotes the school code for University A in the UniversityChoiceX variables. So, if UniversityChoice1 == (UniversityACode), University A was the applicant's first choice of school.
I've run the code through with several schools and I am 100% sure that it returns the proper values. The code reads as follows:
Code:
*Tabulate Applicants To University A
gen AppliedToUniversityA = 1 if ///
    UniversityChoice1 == (UniversityACode) | ///
    UniversityChoice2 == (UniversityACode) | ///
    UniversityChoice3 == (UniversityACode) | ///
    UniversityChoice4 == (UniversityACode) | ///
    UniversityChoice5 == (UniversityACode) | ///
    UniversityChoice6 == (UniversityACode) | ///
    UniversityChoice7 == (UniversityACode) | ///
    UniversityChoice8 == (UniversityACode) | ///
    UniversityChoice9 == (UniversityACode) | ///
    UniversityChoice10 == (UniversityACode)
recode AppliedToUniversityA (.=0)
tab AppliedToUniversityA //we observe that 400,000 people applied to university A

*Tabulate Those That Received An Offer To University A
*(& binds tighter than |, so each choice/offer pair is evaluated together)
gen OfferToUniversityA = 1 if ///
    UniversityChoice1 == (UniversityACode) & OfferToChoice1 == 1 | ///
    UniversityChoice2 == (UniversityACode) & OfferToChoice2 == 1 | ///
    UniversityChoice3 == (UniversityACode) & OfferToChoice3 == 1 | ///
    UniversityChoice4 == (UniversityACode) & OfferToChoice4 == 1 | ///
    UniversityChoice5 == (UniversityACode) & OfferToChoice5 == 1 | ///
    UniversityChoice6 == (UniversityACode) & OfferToChoice6 == 1 | ///
    UniversityChoice7 == (UniversityACode) & OfferToChoice7 == 1 | ///
    UniversityChoice8 == (UniversityACode) & OfferToChoice8 == 1 | ///
    UniversityChoice9 == (UniversityACode) & OfferToChoice9 == 1 | ///
    UniversityChoice10 == (UniversityACode) & OfferToChoice10 == 1
recode OfferToUniversityA (.=0)
tab OfferToUniversityA //we observe that 100,000 people received an offer to university A

*Tabulate Offer Rate Of University A
*(non-applicants give 0/0 = missing, so tab only counts applicants)
gen UniversityAOfferRate = (OfferToUniversityA/AppliedToUniversityA)
tab UniversityAOfferRate //we observe an offer rate of 25%

*Tabulate Acceptance Rate Of Those Who Receive An Offer
gen WentToUniversityA = 1 if RegisteredUniversity == (UniversityACode)
recode WentToUniversityA (.=0)
gen UniversityAAcceptanceRate = (WentToUniversityA/OfferToUniversityA)
tab UniversityAAcceptanceRate //we observe an acceptance rate of x%
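The two long gen statements above could probably be collapsed with forvalues, since the choice and offer variables share the suffixes 1 to 10. This is only a minimal sketch meant to replace those two statements, not to run after them. The local macro `code` and its value 101 are hypothetical stand-ins for (UniversityACode), and I am assuming the OfferToChoiceX variables are coded 1 for an offer:

```stata
* Hypothetical school code standing in for (UniversityACode)
local code 101

* Start both flags at 0 so no recode step is needed afterwards
generate byte AppliedToUniversityA = 0
generate byte OfferToUniversityA   = 0

forvalues i = 1/10 {
    * Flag the applicant if any of the ten choices matches the code
    replace AppliedToUniversityA = 1 if UniversityChoice`i' == `code'
    * Flag an offer only when the matching choice also carries an offer
    replace OfferToUniversityA = 1 if UniversityChoice`i' == `code' & OfferToChoice`i' == 1
}

tab AppliedToUniversityA
tab OfferToUniversityA
```

Because the local macro only lives for one run, the whole block has to be executed together from a do-file rather than line by line.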
My Question
Given that there are a huge number of schools, how can I automate this process with a for loop?
I've tried to use levelsof with the RegisteredUniversity variable to create a local macro containing all of the different universities, but I've had no luck storing them by label instead of by value, or using foreach to replicate the first chunk of code (*Tabulate Applicants To University A) for each of them. My attempt went as follows:
Code:
levelsof RegisteredUniversity, local(university)
local lbe : value label RegisteredUniversity
foreach 1 of local university {
    `f1' : label `lbe' `1'
}
di `f1'
foreach i in 1-10 {
    gen AppliedTo`university'A = 1 if UniversityChoice`i' == 1
}
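For reference, here is roughly the shape of loop I think I am after, written as a hedged sketch rather than tested code. It assumes RegisteredUniversity and the UniversityChoiceX variables are numeric and use the same school codes, and that the OfferToChoiceX variables are coded 1 for an offer. The macro names (universities, name, n_applied, n_offered, n_went) and the temporary variables (applied, offered, went) are my own placeholders. Note that levelsof on RegisteredUniversity would only pick up schools that at least one applicant registered at:

```stata
* Collect every school code that appears in RegisteredUniversity
levelsof RegisteredUniversity, local(universities)

foreach u of local universities {
    * Label text for this code; returns the code itself if there is no value label
    local name : label (RegisteredUniversity) `u'

    generate byte applied = 0
    generate byte offered = 0
    forvalues i = 1/10 {
        replace applied = 1 if UniversityChoice`i' == `u'
        replace offered = 1 if UniversityChoice`i' == `u' & OfferToChoice`i' == 1
    }
    generate byte went = RegisteredUniversity == `u'

    * Counts behind the two rates
    quietly count if applied
    local n_applied = r(N)
    quietly count if offered
    local n_offered = r(N)
    quietly count if went & offered
    local n_went = r(N)

    display "`name': applicants = `n_applied', offers = `n_offered'"
    display "    offer rate = " %5.3f `n_offered'/`n_applied' ///
        ", acceptance rate = " %5.3f `n_went'/`n_offered'

    * Reuse the same temporary variable names on the next pass
    drop applied offered went
}
```

The sketch prints the results per school instead of creating one set of variables per university, which avoids having to turn labels into legal variable names.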
Cheers,
Kevin