
  • #16
    I do not think you should evaluate your matched samples solely on the basis of one pair of graphs, but this pair of graphs does suggest that the balance is worse after matching. That is not so surprising. There is no guarantee that matching improves balance. As Sekhon (2011) writes:

    A significant shortcoming of common matching methods such as Mahalanobis distance and propensity score matching is that they may (and in practice, frequently do) make balance worse across measured potential confounders. These methods may make balance worse, in practice, even if covariates are distributed ellipsoidally because in a given finite sample there may be departures from an ellipsoidal distribution.
    For an example where matching worsens balance, see p. 11.

    Sekhon, J. 2011. Matching: Multivariate Matching with Automated Balance Optimization in R. Journal of Statistical Software 42(7):1-52.
    David Radwin
    Senior Researcher, California Competes
    californiacompetes.org
    Pronouns: He/Him



    • #17

      Dear Sir,
      I sincerely appreciate your time in assisting me with this. After going back over your explanation, I tried nearest neighbor matching, and once _n and _id had been generated automatically, I ran the code from post #7 above and got the graph below, which looks better now.
      My question now is: if I run kernel matching instead, what should I use in place of _n and _id in the code from post #7 to get the graph? Or is this only possible with nearest neighbor matching?
      Thank you so much David.
      Alhassane.
      [Image: Screen Shot 2016-03-19 at 9.47.02 PM.png]



      Last edited by Alhassane Bah; 19 Mar 2016, 09:04.



      • #18
        There is no nearest neighbor, and therefore no variables to indicate nearest neighbor, if you don't use nearest neighbor matching.

        If you want to reproduce Richard Hofler's graph after kernel matching, you can weight the results instead. This example does so and makes two additional changes. It saves the graphs to memory rather than to disk and it uses Vince Wiggins's grc1leg program I mentioned earlier.

        Code:
        sysuse auto, clear
        psmatch2 foreign mpg, out(price)
        
        * before
        twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0, ///
        lpattern(dash)), legend( label( 1 "treated") label( 2 "control" ) ) ///
        xtitle("propensity scores BEFORE matching") name(before, replace)
        
        * after
        twoway (kdensity _pscore if _treated==1 [aweight=_weight]) ///
        (kdensity _pscore if _treated==0 [aweight=_weight] ///
        , lpattern(dash)), legend( label( 1 "treated") label( 2 "control" )) ///
        xtitle("propensity scores AFTER matching") name(after, replace)
        
        * combined
        grc1leg before after, ycommon
        [Image: kernel.png]
        David Radwin
        Senior Researcher, California Competes
        californiacompetes.org
        Pronouns: He/Him



        • #19
          Thank you so much Sir.



          • #20
            I am having trouble drawing a graph of the propensity score distribution that shows the region of common support for the treated and non-treated groups. What command can I use to do this? The graph should look like the one attached.


            • #21
              Edit: oh I didn't see the second page. So just ignore my post.

              Alhassane,

              What code are you using to produce these graphs? Most likely, you should not just swap the titles of the graphs to get the result or improvement you want.

              The simplest way is probably to use the -pstest- command.

              Code:
              sysuse auto, clear
              psmatch2 foreign mpg, out(price) kernel
              pstest _pscore, density both
              In this case, -pstest- knows what to do depending on the matching procedure used (nearest-neighbor, kernel, radius, etc.). Some matching methods, like kernel matching, re-weight the initial propensity scores to obtain a matched sample. In contrast, nearest-neighbor matching uses the unweighted propensity score but drops the observations for which no matched counterpart exists.

              What -pstest- does in my example above is essentially to create one (or, since the both option is specified, two) -twoway kdensity- graphs using the weights obtained from the matching. The source code of the command may give you a better idea of what is done (note the aweights, [aw=`mweight']):

              Code:
              twoway
              (kdensity `varlist' if `touse' & `treated'==1 [aw=`mweight'], clwid(thick))
              (kdensity `varlist' if `touse' & `treated'==0 [aw=`mweight'], clwid(thin) clcolor(black)), xlab(#6) xti("") yti("")  title("`Ytitle'")
              subtitle("Matched samples") legend(order(1 "Treated" 2 "Untreated")) graphregion(color(gs16))  `options'
              Of course, it is possible that the matching procedure does more harm than good. If that happens, you should reconsider the model used to estimate the propensity score and/or the chosen matching method (they all have advantages and disadvantages).
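
              To check balance on a covariate itself rather than only on the propensity score, a minimal sketch along the same lines might look like the following. It assumes psmatch2 and pstest are installed from SSC, and mpg is used purely as an illustrative covariate.

              Code:
              sysuse auto, clear
              psmatch2 foreign mpg, out(price) kernel
              * compare mpg between treated and untreated, before and after matching
              pstest mpg, both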
              Last edited by Sebastian Geiger; 01 Jun 2016, 20:57.



              • #22
                Ronald,

                maybe try something like this

                Code:
                sysuse auto, clear
                psmatch2 foreign mpg, out(price)
                
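                * lower bound: smallest propensity score among the treated
                sum _pscore if _treated==1
                local minsupport = r(min)
                * upper bound: largest propensity score among the controls
                sum _pscore if _treated==0
                local maxsupport = r(max)
                dis "`minsupport'"
                dis "`maxsupport'"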
                
                gen match=_n1
                replace match=_id if match==.
                duplicates tag match, gen(dup)
                
                twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0 ///
                , lpattern(dash)), legend( label( 1 "treated") label( 2 "control" )) ///
                xtitle("propensity scores AFTER matching") saving(after, replace) ///
                xline(`minsupport') xline(`maxsupport')
                Probably, it won't look as nice with "real" data as it does in your drawn picture.
                Last edited by Sebastian Geiger; 01 Jun 2016, 21:17.



                • #23
                  Thank you all for posting in this thread. Building on Richard Hofler's syntax from April 2014, re-pasted below, does anyone know how to alter the syntax for a 1-to-3 nearest neighbor match as opposed to a 1-to-1 match? I therefore have _n1, _n2, and _n3 to work with. I would rather not plot three separate lines, one for each of these, against my treated group, but rather a single line for all matched controls, if possible.

                  // compare _pscores before matching & save graph to disk
                  twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0, ///
                  lpattern(dash)), legend( label( 1 "treated") label( 2 "control" ) ) ///
                  xtitle("propensity scores BEFORE matching") saving(before, replace)

                  // compare _pscores *after* matching & save graph to disk
                  gen match=_n1
                  replace match=_id if match==.
                  duplicates tag match, gen(dup)
                  twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0 ///
                  & dup>0, lpattern(dash)), legend( label( 1 "treated") label( 2 "control" )) ///
                  xtitle("propensity scores AFTER matching") saving(after, replace)

                  // combine these two graphs that were saved to disk
                  // put both graphs on y axes with common scales
                  graph combine before.gph after.gph, ycommon

                  Thanks in advance,
                  Julie Lima
                  Brown University



                  • #24
                    Julie Lima, I realize this is somewhat late, but here goes in case others need it.
                    You need to generate match variables for _n1, _n2, and _n3 (the same way as described above, but with different names).
                    Then you need to run the duplicates command on all three match variables and finally generate a combined dup.

                    Like this:

                    gen matchn1=_n1
                    replace matchn1=_id if matchn1==.

                    gen matchn2=_n2
                    replace matchn2=_id if matchn2==.

                    gen matchn3=_n3
                    replace matchn3=_id if matchn3==.

                    duplicates tag matchn1, gen(dupn1)

                    duplicates tag matchn2, gen(dupn2)

                    duplicates tag matchn3, gen(dupn3)

                    gen dup=dupn1+dupn2+dupn3


                    Then use the code from above:
                    twoway (kdensity _pscore if _treated==1) (kdensity _pscore if _treated==0 ///
                    & dup>0, lpattern(dash)), legend( label( 1 "treated") label( 2 "control" )) ///
                    xtitle("propensity scores AFTER matching") saving(after, replace)
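
                    The same steps can also be written as a loop, so that the number of neighbors only has to be changed in one place. This is just a sketch: it assumes psmatch2 was run with neighbor(3) so that _n1, _n2, and _n3 exist, and the auto data and the local macro k are for illustration only.

                    Code:
                    sysuse auto, clear
                    psmatch2 foreign mpg, out(price) neighbor(3)

                    * k = number of neighbors requested above (illustrative)
                    local k 3
                    gen dup = 0
                    forvalues j = 1/`k' {
                        * matched control's id for treated, own id otherwise
                        gen matchn`j' = _n`j'
                        replace matchn`j' = _id if matchn`j' == .
                        duplicates tag matchn`j', gen(dupn`j')
                        * accumulate into the combined dup
                        replace dup = dup + dupn`j'
                    }

                    After the loop, dup>0 can be used in the twoway command exactly as above.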



                    • #25
                      Hi all. A quick question: When I used the syntax David Radwin kindly suggested, I got the figure below. What does this figure indicate? What does it mean that the propensity scores are never larger than 0.4?

                      [Attachment: CS.gph]

                      Thanks in advance.



                      • #26
                        Hi all. A quick question: When I used the syntax David Radwin kindly suggested, I got the figure below. What does this figure indicate? What does it mean that the propensity scores are never larger than 0.4? I re-attach the figure.

                        [Attachment: CS.gph]
                        [Image: CS.jpg]

                        Thanks in advance.
                        Last edited by amira elshal; 26 Feb 2019, 12:15.



                        • #27
                          The point of graphs like these is to visually inspect and show the closeness of the two groups and the overlap between them, before and after matching.

                          It does look as though the propensity scores are never larger than 0.4, before or after matching, but a better way to check this is something like
                          Code:
                          summarize _pscore
                          possibly limiting the analysis to the matched samples, then checking that the maximum value is less than 0.4.
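
                          For instance, a sketch of that restriction, assuming psmatch2 was used for the matching (it leaves _weight missing for observations that were not matched) and using the auto data purely for illustration:

                          Code:
                          sysuse auto, clear
                          psmatch2 foreign mpg, out(price)
                          * all observations
                          summarize _pscore
                          * matched observations only
                          summarize _pscore if _weight < .
                          * r(max) holds the maximum from the last summarize
                          display r(max)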
                          David Radwin
                          Senior Researcher, California Competes
                          californiacompetes.org
                          Pronouns: He/Him



                          • #28
                            Thanks for the prompt reply. I double-checked, and yes, the propensity scores are never higher than 0.35. Is that okay? I understand that the propensity score is the probability of treatment. I am studying a health sector reform.

                            Comment


                            • #29
                              I don't think there is any reason why a lower maximum propensity score is better than a higher one. (Of course, by construction, they are always between 0 and 1.)

                              The point of matching is to get the propensity scores (and other statistics) of the treated and control groups to be as similar as possible (in other words, to be balanced) and to overlap. Regarding overlap, you do not want the treatment group to have a much higher maximum propensity score than the control group, or vice versa, after matching.
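
                              One way to compare the ranges of the two groups directly is sketched below. It assumes psmatch2 created _pscore, _treated, and _weight, and the auto data are only for illustration.

                              Code:
                              sysuse auto, clear
                              psmatch2 foreign mpg, out(price)
                              * min, max, mean, and count of the propensity score by group,
                              * restricted to matched observations
                              tabstat _pscore if _weight < ., by(_treated) statistics(min max mean n)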
                              David Radwin
                              Senior Researcher, California Competes
                              californiacompetes.org
                              Pronouns: He/Him



                              • #30
                                So, given the figure I posted, is this common support or overlap area acceptable? Also, the figure shows that the control group has a much higher max p score than the treatment group, right? Thanks.
