Dear all,
I have a dataset of children and their schooling preferences from a school district, which I am using for discrete choice modeling of school choices. There are about 25,000 children spread across 200 neighborhoods. I have the budget to survey 600 children and want to use two-stage sampling: neighborhoods in stage 1 and children within neighborhoods in stage 2. I have arrived at a design of 60 neighborhoods with 10 children per neighborhood.

In stage 1 I want PPS sampling based on the number of applicants in each neighborhood. In stage 2 I want stratified sampling, with half poor and half non-poor applicants. In the end I want 600 sampled children plus a buffer list for each neighborhood, in case some of the sampled parents do not take part in the survey.

I tried the gsample and samplepps commands, but they do not give me the buffer lists or the stratification in stage 2.

An example of my dataset is below (id: child identifier; nei_code: neighborhood code; available_schools: number of schools available in the neighborhood; population: number of applicants in the neighborhood; poor: indicator for a poor applicant). I would really appreciate your help.
Many thanks,
Vijay
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(id nei_code) float(available_schools population poor)
  8805 292001046  4  63 0
 34582 292001046  4  63 1
  6759 292001046  4  63 1
204042 292001129 28 374 1
148069 292001129 28 374 1
213387 292001129 28 374 1
 28516 292001129 28 374 1
169220 292001129 28 374 0
 39276 292001129 28 374 1
282710 292001129 28 374 0
 26063 292001129 28 374 1
250315 292001129 28 374 0
277483 292001129 28 374 1
286421 292001129 28 374 1
214859 292001129 28 374 1
286817 292001129 28 374 0
 11116 292001130 30 316 1
259525 292001130 30 316 1
193110 292001130 30 316 1
 26748 292001130 30 316 1
end
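For concreteness, the design described above could be sketched with base Stata commands alone, roughly as follows. This is only an illustration under stated assumptions, not a tested solution. It uses systematic PPS for stage 1, which is one of several PPS methods. It assumes the full applicant file is in memory, that id is unique, and that population is constant within nei_code. The seed, the buffer size n_buf, and the names u, v, hits, rank, and status are all hypothetical choices.

```stata
* Sketch only: systematic PPS in stage 1, random ordering within
* neighborhood-by-poverty cells in stage 2. Run as one do-file so the
* local macros stay defined throughout.
set seed 12345                       // arbitrary seed, for reproducibility

local n_nei  60                      // neighborhoods to draw
local n_cell 5                       // children per poor/non-poor cell
local n_buf  3                       // hypothetical buffer size per cell

* ---- Stage 1: systematic PPS on neighborhood size ----
preserve
keep nei_code population
duplicates drop
isid nei_code                        // fails if population varies within nei_code
sort nei_code                        // fixed starting order before random draws
generate double u = runiform()
sort u                               // random order of neighborhoods
generate double cum_hi = sum(population)
generate double cum_lo = cum_hi - population
local step  = cum_hi[_N] / `n_nei'
local start = runiform() * `step'
* Count the selection points start + k*step that fall in (cum_lo, cum_hi]
generate int hits = floor((cum_hi - `start') / `step') ///
                  - floor((cum_lo - `start') / `step')
keep if hits > 0
keep nei_code hits
tempfile stage1
save `stage1'
restore

* ---- Stage 2: stratify by poor within the selected neighborhoods ----
merge m:1 nei_code using `stage1', keep(match) nogenerate
isid id
sort id                              // fixed order before random draws
generate double v = runiform()
bysort nei_code poor (v): generate long rank = _n

generate byte status = 0
replace status = 1 if rank <= `n_cell' * hits
replace status = 2 if rank >  `n_cell' * hits ///
                    & rank <= (`n_cell' + `n_buf') * hits
label define status 0 "not drawn" 1 "main sample" 2 "buffer"
label values status status

keep if status > 0
sort nei_code poor rank
```

A neighborhood whose population exceeds the sampling interval can be hit more than once. In that case hits is greater than 1 and the sketch scales its quota accordingly, so fewer than 60 distinct neighborhoods may appear. Because the buffer comes from the same random ordering as the main sample, replacements would be taken in rank order within each cell. Cells with fewer applicants than the quota will fall short and would need separate handling.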