Matching pairs based on propensity score

Vijay Kumar

Join Date: Jul 2016
Posts: 24

14 Aug 2016, 03:47

Dear Statalisters,

I have a school admissions dataset with about 15,000 observations. I want to draw a sample of 2000 children (1000 treatments and 1000 controls) and conduct a primary survey of them. Children are admitted to school through a very complex lottery held at the neighborhood level. The only way a parent can improve her chances of admission is by applying to a large number of schools within the neighborhood and to the right schools (those where the probability of admission is high); the rest is random. Given the complex lottery, I am not able to identify groups whose ex-ante probability of admission is equal. I am therefore trying to use propensity score matching to create comparable treatment and control groups. I calculated the propensity score of admission (being treated) for each observation using the application profile (schools chosen and the preference order). My sample has 60 percent treatments and 40 percent controls. I want to do the following:
1. Randomly pick 1000 treatments from all the treatments.
2. Match each selected treatment (using the propensity score) with a control from the same neighborhood.
Here is what my data look like:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double id float treatment str107 neighborhood int obs_in_neighborhood float pscore
20160075261 1 "Chandni ChowkSitaram BazarSitaram Bazar"           3      1.2193435
20160054725 1 "ChawlaChawlaChawla"                                7       .9585301
20160036776 1 "ChawlaChawlaChawla"                                7              1
20160084301 1 "ChawlaChawlaChawla"                                7              1
20160009159 0 "ChawlaChawlaChawla"                                7       .6421981
20160076470 0 "ChawlaChawlaChawla"                                7 -2.9712344e-14
20160009039 0 "ChawlaChawlaChawla"                                7       .4144993
20160054571 1 "ChawlaChawlaChawla"                                7              1
20160025793 0 "ChhatarpurChattarpur ExtensionBlockC"              3       .6723786
20160062964 1 "ChhatarpurChattarpur ExtensionBlockC"              3      1.1135944
20160002471 1 "ChhatarpurChattarpur ExtensionBlockC"              3       .9245739
20160049683 1 "ChhatarpurChattarpur VillageBlockF"                2       1.058454
20160049329 0 "ChhatarpurChattarpur VillageBlockF"                2       1.058454
20160031993 0 "ChhatarpurChhatarpurChhatarpur"                   11      .28676683
20160060143 0 "ChhatarpurChhatarpurChhatarpur"                   11       .2207769
20160016135 0 "ChhatarpurChhatarpurChhatarpur"                   11      .21608597
20160079698 1 "ChhatarpurChhatarpurChhatarpur"                   11       .7343596
20160060092 1 "ChhatarpurChhatarpurChhatarpur"                   11       .2854096
20160077995 1 "ChhatarpurChhatarpurChhatarpur"                   11       .2854096
20160038198 0 "ChhatarpurChhatarpurChhatarpur"                   11      .18566418
20160049244 0 "ChhatarpurChhatarpurChhatarpur"                   11       .2300926
20160003892 1 "ChhatarpurChhatarpurChhatarpur"                   11      1.0045972
20160038344 0 "ChhatarpurChhatarpurChhatarpur"                   11      .18566418
20160016545 0 "ChhatarpurChhatarpurChhatarpur"                   11      .21608597
20160086619 0 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       .2645294
20160085317 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9      1.1441907
20160043172 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9        .603401
20160085325 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       1.098295
20160088209 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       .4014449
20160051165 0 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       .2110851
20160085345 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9      1.1617163
end

Would be grateful for any advice!

Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#2

14 Aug 2016, 05:53

What does it mean to "match"? Say I have two treatments in one neighborhood, one with a propensity score of 0.18 and the other with 0.67. Which control should be matched to which?
Also, you may want to have a look at the user-written package synth, as it sounds like you're trying to do something similar. Type ssc install synth, and then help synth to learn more.
Rich Goldstein

Join Date: Mar 2014

Posts: 4464
#3

14 Aug 2016, 06:24

Use the sample command to randomly draw your 1000.

Then, it sounds to me like you want to download (from SSC) and install psmatch2.
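As a minimal sketch of that first step, on made-up data (the variable name treatment follows the thread; the sizes are assumptions for illustration): sample with the count option draws a fixed number of observations, and with an if qualifier it samples only among the observations that satisfy the condition and keeps all the others.

```stata
* Toy data: 300 treated and 200 control observations (made-up sizes)
clear
set seed 12345
set obs 500
generate byte treatment = _n <= 300

* Keep a random 100 of the treated; observations that fail the
* if condition (the controls) are all kept
sample 100 if treatment == 1, count

tabulate treatment
```

With the thread's data the call would presumably be sample 1000 if treatment == 1, count, run after set seed so that the draw is reproducible.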
Vijay Kumar

Join Date: Jul 2016

Posts: 24
#4

14 Aug 2016, 22:01

Hi Ariel and Rich, thanks for your responses.

Sorry for not being clear on the definition of "match": I meant 1-1 nearest neighbor matching. I installed synth, but it is probably not useful in my case as I don't have a balanced panel. psmatch2 with some clever coding seems to be the answer.
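
To make the 1-1 nearest neighbor idea concrete, here is a minimal sketch in plain Stata on made-up data, using the 0.18 and 0.67 scores from the question in #2. It pairs every control with every treatment in the same neighborhood and keeps, for each treatment, the control with the smallest absolute difference in score. This is matching with replacement written by hand for illustration, not what psmatch2 does internally; the toy values and the names c_id, c_pscore and dist are assumptions.

```stata
* Toy data: one neighborhood (g), two treatments, three controls
clear
input long id byte treatment byte g double pscore
1 1 1 .18
2 1 1 .67
3 0 1 .20
4 0 1 .60
5 0 1 .90
end

* Save the controls under new names so they can be paired with treatments
preserve
keep if treatment == 0
rename (id pscore) (c_id c_pscore)
keep g c_id c_pscore
tempfile controls
save `controls'
restore

* Form every treatment-control pair within a neighborhood
keep if treatment == 1
joinby g using `controls'

* For each treatment keep the closest control (ties broken by c_id)
generate double dist = abs(pscore - c_pscore)
bysort id (dist c_id): keep if _n == 1
list id pscore c_id c_pscore dist
```

Here treatment 1 (0.18) is paired with control 3 (0.20) and treatment 2 (0.67) with control 4 (0.60). Matching without replacement additionally removes a control from the pool once it has been used.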

I know that the following code would generate matched pairs without any neighborhood restrictions (using all my observations)

psmatch2 treatment, pscore(pscore) common noreplace neighbor(1)
gen pair = _id if _treated==0
replace pair = _n1 if _treated==1
bysort pair: egen paircount = count(pair)

However, with my limited Stata skills I am unable to figure out how to match within neighborhoods. The help file provides this code for within-strata matching:
g att = .
egen g = group(groupvars)
levels g, local(gr)
qui foreach j of local gr {
    psmatch2 treatvar varlist if g==`j', out(outvar)
    replace att = r(att) if g==`j'
}
sum att

Using this, I wrote the following code to match within neighborhoods. I then plan to randomly draw 1000 matched pairs.
codebook neighborhood
egen g = group(neighborhood)
levels g
set seed 5089667
generate sort_id=uniform()
sort g sort_id
display "`r(levels)'"
qui foreach j in `r(levels)'{
psmatch2 treatment, pscore(pscore) if g==`j'
gen pair = _id if _treated==0 if g==`j'
replace pair = _n1 if _treated==1 if g==`j'
bysort pair: egen paircount = count(pair)
}
My code doesn't work: the error is 'option if not allowed'.
I will be grateful for your thoughts and advice.
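
For reference, Stata reports "option if not allowed" when an if qualifier appears after the comma, because everything after the comma is parsed as options. The qualifier belongs before the comma, and a command accepts only one if, so several conditions are joined with &. A minimal sketch on made-up data with built-in commands (the variables g, treatment and pscore mimic the thread; the values and the name flag are assumptions):

```stata
* Toy data: two neighborhoods of four observations each
clear
set obs 8
generate byte g         = cond(_n <= 4, 1, 2)   // made-up neighborhood id
generate byte treatment = mod(_n, 2)            // made-up 0/1 treatment flag
generate double pscore  = _n / 10               // made-up scores

* The if qualifier precedes the comma; options follow it
summarize pscore if g == 1, detail

* Only one if qualifier per command: join conditions with &
generate byte flag = 1 if treatment == 0 & g == 1
count if flag == 1
```

The same pattern applies to any command that accepts if and options, including the psmatch2 call inside the loop.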
Vijay Kumar

Join Date: Jul 2016

Posts: 24
#5

15 Aug 2016, 21:45

Hi Rich and Ariel, I finally arrived at the right code for my requirement. Thanks again for your advice.
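* Number the neighborhoods 1, 2, ... so they can be looped over
codebook neighborhood
egen g = group(neighborhood)
levelsof g, local(levels)

* Randomize the order within neighborhood, since psmatch2 results
* can depend on the sort order when scores are tied
set seed 5089667
generate sort_id=uniform()
sort g sort_id

* Holders for the psmatch2 output, which is overwritten on every call
gen a_treated = .
gen a_support = .
gen a_id = .
gen a_n1 = .
gen a_nn = .
gen pair = .

set more off
display "`levels'"
* 1-1 nearest neighbor matching without replacement, one neighborhood at a time
qui foreach j of local levels {
    psmatch2 treatment if g==`j', pscore(pscore4) noreplace neighbor(1)
    replace a_treated = _treated if g==`j'
    replace a_support = _support if g==`j'
    replace a_id = _id if g==`j'
    replace a_n1 = _n1 if g==`j'
    replace a_nn = _nn if g==`j'
    * A control's pair is its own _id; a treatment's pair is the _id of its match
    replace pair = a_id if (_treated==0) & (g==`j')
    replace pair = a_n1 if (_treated==1) & (g==`j')
}

* Keep only complete pairs (one treatment plus one control)
bysort g pair: egen paircount = count(pair)
tab paircount
drop if paircount !=2
save paired, replace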
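
The remaining step mentioned in #4, randomly drawing a fixed number of matched pairs, could be sketched as follows. This is a sketch on made-up data, not part of the original code: it assumes a dataset like paired, in which every combination of g and pair holds exactly two observations, and the names pair_id, u and pair_rank are hypothetical. Each pair gets one random number, and the pairs with the smallest draws are kept whole.

```stata
* Toy stand-in for the paired data: 50 pairs of 2 observations each
clear
set seed 2016
set obs 100
generate long pair = ceil(_n / 2)
generate byte g = 1 + mod(pair, 5)        // made-up neighborhood id

* One identifier per matched pair
egen long pair_id = group(g pair)

* One random draw per pair, copied to both members
bysort pair_id: generate double u = runiform() if _n == 1
bysort pair_id (u): replace u = u[1]

* Order pairs by their draw and number them 1, 2, ...
sort u pair_id
generate long pair_rank = sum(pair_id != pair_id[_n-1])

* Keep the first 10 pairs here; the thread would use 1000
keep if pair_rank <= 10
```

Drawing whole pairs this way keeps each treatment together with its matched control, which sampling individual observations would not guarantee.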