Dear Statalisters,
I have a school admissions dataset with about 15,000 observations. I want to draw a sample of 2,000 children (1,000 treated and 1,000 controls) and conduct a primary survey of them. Children are admitted to school through a very complex lottery held at the neighborhood level. The only ways a parent can improve her chances of admission are to apply to a large number of schools within the neighborhood and to apply to the right schools (those where the probability of admission is high); the rest is random. Given the complexity of the lottery, I cannot identify groups whose ex-ante probability of admission is equal. I am therefore trying to use propensity score matching to create treatment and control groups that are comparable. I calculated the propensity score of admission (being treated) for each observation from the application profile (schools chosen and the preference order). About 60 percent of my observations are treated and 40 percent are controls. I want to do the following:
1. Randomly pick 1,000 treated children from all treated children.
2. Match each selected treated child, on the propensity score, with a control from the same neighborhood.
I would be grateful for any advice! Here is what my data look like:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double id float treatment str107 neighborhood int obs_in_neighborhood float pscore
20160075261 1 "Chandni ChowkSitaram BazarSitaram Bazar" 3 1.2193435
20160054725 1 "ChawlaChawlaChawla" 7 .9585301
20160036776 1 "ChawlaChawlaChawla" 7 1
20160084301 1 "ChawlaChawlaChawla" 7 1
20160009159 0 "ChawlaChawlaChawla" 7 .6421981
20160076470 0 "ChawlaChawlaChawla" 7 -2.9712344e-14
20160009039 0 "ChawlaChawlaChawla" 7 .4144993
20160054571 1 "ChawlaChawlaChawla" 7 1
20160025793 0 "ChhatarpurChattarpur ExtensionBlockC" 3 .6723786
20160062964 1 "ChhatarpurChattarpur ExtensionBlockC" 3 1.1135944
20160002471 1 "ChhatarpurChattarpur ExtensionBlockC" 3 .9245739
20160049683 1 "ChhatarpurChattarpur VillageBlockF" 2 1.058454
20160049329 0 "ChhatarpurChattarpur VillageBlockF" 2 1.058454
20160031993 0 "ChhatarpurChhatarpurChhatarpur" 11 .28676683
20160060143 0 "ChhatarpurChhatarpurChhatarpur" 11 .2207769
20160016135 0 "ChhatarpurChhatarpurChhatarpur" 11 .21608597
20160079698 1 "ChhatarpurChhatarpurChhatarpur" 11 .7343596
20160060092 1 "ChhatarpurChhatarpurChhatarpur" 11 .2854096
20160077995 1 "ChhatarpurChhatarpurChhatarpur" 11 .2854096
20160038198 0 "ChhatarpurChhatarpurChhatarpur" 11 .18566418
20160049244 0 "ChhatarpurChhatarpurChhatarpur" 11 .2300926
20160003892 1 "ChhatarpurChhatarpurChhatarpur" 11 1.0045972
20160038344 0 "ChhatarpurChhatarpurChhatarpur" 11 .18566418
20160016545 0 "ChhatarpurChhatarpurChhatarpur" 11 .21608597
20160086619 0 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony" 9 .2645294
20160085317 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony" 9 1.1441907
20160043172 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony" 9 .603401
20160085325 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony" 9 1.098295
20160088209 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony" 9 .4014449
20160051165 0 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony" 9 .2110851
20160085345 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony" 9 1.1617163
end
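For concreteness, the two steps could be sketched in Stata roughly as follows. This is only one possible approach, not a tested solution. It assumes the variables id, treatment, neighborhood and pscore from the extract above; the seed and the names u, sampled, id_t, id_c, ps_t, ps_c, dist and tie are hypothetical. It pairs each sampled treated child with the nearest still-unused control in the same neighborhood (greedy 1:1 matching without replacement). Treated children whose neighborhood has no remaining control are left unmatched, so fewer than 1,000 pairs may result.

```stata
* Step 1: draw 1,000 treated children at random
set seed 20160101                        // arbitrary seed, for reproducibility only
gen double u = runiform() if treatment == 1
sort u id                                // missing u (controls) sorts last
gen byte sampled = treatment == 1 & _n <= 1000

* Step 2: build every treated-control pair within a neighborhood
preserve
keep if sampled
keep id neighborhood pscore
rename (id pscore) (id_t ps_t)
tempfile treated
save `treated'
restore

keep if treatment == 0                   // note: this drops data in memory
keep id neighborhood pscore
rename (id pscore) (id_c ps_c)
joinby neighborhood using `treated'      // all pairwise combinations
gen double dist = abs(ps_t - ps_c)
gen double tie  = runiform()             // breaks ties in dist at random

* Greedy 1:1 matching without replacement. In each round a pair is accepted
* if it is the closest pair for its treated child and, among such pairs,
* the closest for its control; competing pairs are then dropped.
local n_win = 0
while `n_win' < _N {
    bysort id_t (dist tie): gen byte best_t = _n == 1
    gen byte notbest = !best_t
    bysort id_c (notbest dist tie): gen byte win = best_t & _n == 1
    bysort id_t: egen byte t_done = max(win)
    bysort id_c: egen byte c_done = max(win)
    drop if (t_done | c_done) & !win
    count if win
    local n_win = r(N)
    drop best_t notbest win t_done c_done
}

* One row per matched pair remains: id_t, id_c, neighborhood, dist
sort neighborhood dist
list id_t id_c neighborhood dist in 1/10
```

Because the closest pairs are accepted first, the matches made last can be poor; a maximum acceptable dist (a caliper) could be imposed by dropping pairs above some threshold before the loop, at the cost of more unmatched treated children.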