  • Matching pairs based on propensity score

    Dear Statalisters,

    I have a school admissions dataset with about 15,000 observations. I want to draw a sample of 2,000 children (1,000 treatments and 1,000 controls) and conduct a primary survey on them. Children are admitted to school through a very complex lottery held at the neighborhood level. The only way a parent can improve her chances of admission is by applying to a large number of schools within the neighborhood, and to the right schools (those where the probability of admission is high); the rest is random. Given the complex lottery, I am not able to identify groups whose ex-ante probability of admission is equal. I am therefore trying to use propensity score matching to create treatment and control groups that are comparable. I calculated the propensity score of admission (being treated) for each observation using the application profile (schools chosen and the preference order). My sample has 60 percent treatments and 40 percent controls. I want to do the following:
    1. Randomly pick 1,000 treatments from all the treatments
    2. Match each of these treatments (using the propensity score) with a control from the same neighborhood
    Here is what my data look like:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double id float treatment str107 neighborhood int obs_in_neighborhood float pscore
    20160075261 1 "Chandni ChowkSitaram BazarSitaram Bazar"           3      1.2193435
    20160054725 1 "ChawlaChawlaChawla"                                7       .9585301
    20160036776 1 "ChawlaChawlaChawla"                                7              1
    20160084301 1 "ChawlaChawlaChawla"                                7              1
    20160009159 0 "ChawlaChawlaChawla"                                7       .6421981
    20160076470 0 "ChawlaChawlaChawla"                                7 -2.9712344e-14
    20160009039 0 "ChawlaChawlaChawla"                                7       .4144993
    20160054571 1 "ChawlaChawlaChawla"                                7              1
    20160025793 0 "ChhatarpurChattarpur ExtensionBlockC"              3       .6723786
    20160062964 1 "ChhatarpurChattarpur ExtensionBlockC"              3      1.1135944
    20160002471 1 "ChhatarpurChattarpur ExtensionBlockC"              3       .9245739
    20160049683 1 "ChhatarpurChattarpur VillageBlockF"                2       1.058454
    20160049329 0 "ChhatarpurChattarpur VillageBlockF"                2       1.058454
    20160031993 0 "ChhatarpurChhatarpurChhatarpur"                   11      .28676683
    20160060143 0 "ChhatarpurChhatarpurChhatarpur"                   11       .2207769
    20160016135 0 "ChhatarpurChhatarpurChhatarpur"                   11      .21608597
    20160079698 1 "ChhatarpurChhatarpurChhatarpur"                   11       .7343596
    20160060092 1 "ChhatarpurChhatarpurChhatarpur"                   11       .2854096
    20160077995 1 "ChhatarpurChhatarpurChhatarpur"                   11       .2854096
    20160038198 0 "ChhatarpurChhatarpurChhatarpur"                   11      .18566418
    20160049244 0 "ChhatarpurChhatarpurChhatarpur"                   11       .2300926
    20160003892 1 "ChhatarpurChhatarpurChhatarpur"                   11      1.0045972
    20160038344 0 "ChhatarpurChhatarpurChhatarpur"                   11      .18566418
    20160016545 0 "ChhatarpurChhatarpurChhatarpur"                   11      .21608597
    20160086619 0 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       .2645294
    20160085317 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9      1.1441907
    20160043172 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9        .603401
    20160085325 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       1.098295
    20160088209 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       .4014449
    20160051165 0 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9       .2110851
    20160085345 1 "ChhatarpurDr. Ambedkar ColonyDr. Ambedkar Colony"  9      1.1617163
    end
    Would be grateful for any advice!

  • #2
    What does it mean to "match"? Say I have 2 treatments in one neighborhood, one with a propensity score of 0.18 and the other with 0.67. Which control should be matched to which?
    Also, you may want to have a look at the user-written package synth, as it sounds like you're trying to do something similar. Type ssc install synth, and then help synth to learn more.

    • #3
      Use the sample command to randomly draw your 1,000.

      Then, it sounds to me like you want to download (from SSC) and install psmatch2.
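
      A minimal sketch of the random draw, on simulated data so that it runs on its own. The toy variable mirrors treatment from the original post; the seed and the number of observations are made up for illustration. Note that sample with an if qualifier only thins the observations that satisfy the condition and keeps all the others, so the controls stay available for matching.

      ```stata
      * Toy data standing in for the admissions file (assumption, not the real data)
      clear
      set seed 12345                               // hypothetical seed, for reproducibility
      set obs 5000
      generate byte treatment = runiform() < 0.6   // roughly 60 percent treated

      * Keep exactly 1000 randomly chosen treated observations;
      * observations with treatment == 0 are not sampled and are all kept
      sample 1000 if treatment == 1, count

      count if treatment == 1
      ```

      psmatch2 can then be installed with ssc install psmatch2.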

      • #4
        Hi Ariel and Rich, thanks for your responses.

        Sorry for not being clear on the definition of "match": I meant 1:1 nearest-neighbor matching. I installed synth, but it is probably not useful in my case as I don't have a balanced panel. psmatch2 with some clever coding seems to be the answer.

        I know that the following code would generate matched pairs without any neighborhood restrictions (using all my observations):

        psmatch2 treatment, pscore(pscore) common noreplace neighbor(1)
        gen pair = _id if _treated==0
        replace pair = _n1 if _treated==1
        bysort pair: egen paircount = count(pair)

        However, with my limited Stata skills I am unable to figure out how to match within neighborhoods. The help file provides this code for within-strata matching:
        g att = .
        egen g = group(groupvars)
        levels g, local(gr)
        qui foreach j of local gr {
        psmatch2 treatvar varlist if g==`j', out(outvar)
        replace att = r(att) if g==`j'
        }
        sum att

        Using this, I wrote the following code to match within neighborhoods. I then plan to randomly draw 1,000 matched pairs:
        codebook neighborhood
        egen g = group(neighborhood)
        levels g
        set seed 5089667
        generate sort_id=uniform()
        sort g sort_id
        display "`r(levels)'"
        qui foreach j in `r(levels)'{
        psmatch2 treatment, pscore(pscore) if g==`j'
        gen pair = _id if _treated==0 if g==`j'
        replace pair = _n1 if _treated==1 if g==`j'
        bysort pair: egen paircount = count(pair)
        }
        My code doesn't work; the error is 'option if not allowed'.
        I will be grateful for your thoughts and advice.
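
        On the error message: in Stata syntax the if qualifier comes before the comma and options come after it, so an if placed after the options is read as an option and rejected. A command also accepts only one if qualifier, so two conditions have to be joined with &. A small sketch on the auto dataset shipped with Stata (not the admissions data) illustrates both rules:

        ```stata
        sysuse auto, clear

        * Qualifier before the comma, options after it
        summarize price if foreign == 1, detail

        * One if per command: combine the conditions with &
        generate byte flag = 1 if foreign == 1 & price > 6000
        ```

        Applied to the loop above, that would mean writing psmatch2 treatment if g==`j', pscore(pscore) and using a single combined if on each gen and replace line.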

        • #5
          Hi Rich and Ariel, I finally arrived at the right code for my requirement. Thanks again for your advice.
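          codebook neighborhood
          * Number the neighborhoods so the loop can step through them
          egen g = group(neighborhood)
          levels g
          * Put observations in a random order within each neighborhood before matching
          set seed 5089667
          generate sort_id=uniform()
          sort g sort_id
          * Holders for the psmatch2 output variables, which are overwritten on every call
          gen a_treated = .
          gen a_support = .
          gen a_id = .
          gen a_n1 = .
          gen a_nn = .
          gen pair = .

          set more off
          display "`r(levels)'"
          qui foreach j in `r(levels)' {
              * 1:1 nearest-neighbor matching without replacement, one neighborhood at a time
              psmatch2 treatment if g==`j', pscore(pscore4) noreplace neighbor(1)
              replace a_treated = _treated if g==`j'
              replace a_support = _support if g==`j'
              replace a_id = _id if g==`j'
              replace a_n1 = _n1 if g==`j'
              replace a_nn = _nn if g==`j'
              * A control carries its own _id, a treated observation the _id of its match
              replace pair = a_id if (_treated==0) & (g==`j')
              replace pair = a_n1 if (_treated==1) & (g==`j')
          }

          * Keep only pair ids shared by exactly two observations in a neighborhood
          bysort g pair: egen paircount = count(pair)
          tab paircount
          drop if paircount !=2
          save paired, replace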
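
          The remaining step mentioned earlier in the thread, randomly drawing 1,000 matched pairs, could be sketched as follows. This is only an illustration under assumptions: the toy block at the top fabricates 1,500 pairs so that the example runs on its own, and with the real data it would be replaced by use paired, clear. The names pair_id, u and draw_rank and the seed are hypothetical.

          ```stata
          * Toy stand-in for the saved file: 1500 pairs spread over 30 neighborhoods
          clear
          set seed 2468                        // hypothetical seed
          set obs 3000
          generate g = ceil(_n/100)
          generate pair = ceil(_n/2)

          * One identifier per matched pair across all neighborhoods
          egen long pair_id = group(g pair)

          * Draw one random number per pair and copy it to both members
          * (missing values sort last, so u[1] is the drawn value)
          bysort pair_id: generate double u = runiform() if _n == 1
          bysort pair_id (u): replace u = u[1]

          * Rank the pairs by their draw and keep the first 1000 pairs
          egen long draw_rank = group(u)
          keep if draw_rank <= 1000
          count
          ```

          Sampling whole pairs rather than individual observations keeps every selected treated child together with its matched control.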
