  • psmatch2

    Hi everyone!

    I am trying to understand psmatch2 and would like help with a few things.

    The problem:

    I am trying to match control firms to treated firms within a specific industry in a certain year (2019). From various posts on Statalist, I have gathered that this can be done with the following steps:

    Step 1: Obtain a propensity score based on industry and a specific year

    Step 2: Use that propensity score in the psmatch2 command

    Step 3: Run the regression

    I am having trouble with step 1: how can I obtain the propensity score based on industry for a specific year? Could you please tell me which regression I should use?

    I am following this thread:

    https://www.statalist.org/forums/forum/general-stata-discussion/general/1669473-how-to-use-results-of-psmatch2-in-regression

    The commands I am trying right now are:

    Code:
    logit treated INDUSTRY if year == 2019
    predict double ps
    psmatch2 treated if year == 2019, outcome(WACC) pscore(ps) neighbor(1) caliper(0.01)
    The reference year in which I want to match control firms is 2019.



    Any help or comments would be greatly appreciated!


  • #2
    Øyvind Snilsberg, any suggestions, please?



    • #3
      Can you post a data example?



      • #4
        You don't need to calculate a propensity score in advance when using psmatch2 (Leuven and Sianesi, available from SSC), so you can skip the first step. You also don't need to run a separate regression (step 3). Here is a silly example that shows propensity score matching in one command:

        Code:
        . sysuse nlsw88
        (NLSW, 1988 extract)
        
        . psmatch2 married age grade south, outcome(union)
        
        Probit regression                                       Number of obs =  1,876
                                                                LR chi2(3)    =   1.38
                                                                Prob > chi2   = 0.7114
        Log likelihood = -1212.9288                             Pseudo R2     = 0.0006
        
        ------------------------------------------------------------------------------
             married | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
                 age |  -.0096615   .0098233    -0.98   0.325    -.0289148    .0095918
               grade |   .0011241   .0116451     0.10   0.923    -.0216998    .0239481
               south |  -.0371692   .0604129    -0.62   0.538    -.1555762    .0812378
               _cons |   .7678532    .423047     1.82   0.070    -.0613037     1.59701
        ------------------------------------------------------------------------------
        ----------------------------------------------------------------------------------------
                Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
        ----------------------------+-----------------------------------------------------------
                   union  Unmatched | .228501229   .276335878  -.047834649   .020817873    -2.30
                                ATT | .228501229   .279279279  -.050778051   .050789401    -1.00
        ----------------------------+-----------------------------------------------------------
        Note: S.E. does not take into account that the propensity score is estimated.
        
                   | psmatch2:
         psmatch2: |   Common
         Treatment |  support
        assignment | On suppor |     Total
        -----------+-----------+----------
         Untreated |       655 |       655 
           Treated |     1,221 |     1,221 
        -----------+-----------+----------
             Total |     1,876 |     1,876
        It seems like you might need to learn more about propensity score matching in general, including setting a reasonable caliper. Please see https://www.statalist.org/forums/for...08#post1242208 and https://www.statalist.org/forums/for...64#post1661764.
        David Radwin
        Senior Researcher, California Competes
        californiacompetes.org
        Pronouns: He/Him



        • #5
          Hi David,


          Thanks for your comments! Please correct me if I am wrong, but I think I do need to run a preliminary logit regression that estimates the propensity score based on industry and year, and then use that score in the psmatch2 command to obtain a set of control firms for a specific industry and year?



          • #6
            Originally posted by Øyvind Snilsberg
            can you post a data example?
            I am unable to use dataex to post an example, but please find below the code I used and the results I get.

            Code:
            * exact matching on industry and year
            egen industry_Year = group(year INDUSTRY)
            logit treated i.year i.INDUSTRY ROA Size
            predict double pscore if e(sample)
            * shift the scores by 1000 per industry-year cell so that, with caliper(2), matches stay within a cell
            gen double pscore2 = industry_Year*1000 + pscore
            rsort
            psmatch2 treated, out(WACC) n(1) caliper(2) pscore(pscore2) noreplacement
            pstest ROA Size


            • #7
              Originally posted by David Radwin
              You don't need to calculate a propensity score in advance when using psmatch2 (Leuven and Sianesi, available from SSC), so you can skip the first step. You also don't need to run a separate regression (step 3). [...] It seems like you might need to learn more about propensity score matching in general, including setting a reasonable caliper. Please see https://www.statalist.org/forums/for...08#post1242208 and https://www.statalist.org/forums/for...64#post1661764.
              Can you guide me on this? https://www.statalist.org/forums/for...82-ps-matching



              • #8
                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float(id year wacc treated size)
                 1 2019    .851468 1 5
                 1 2020   .9820066 1 8
                 2 2019 .032479186 1 5
                 2 2020   .9874847 1 4
                 3 2019    .894106 1 6
                 3 2020   .9684734 1 8
                 4 2019  .23922028 1 8
                 4 2020   .6927336 1 6
                 5 2019   .4884359 1 6
                 5 2020   .4376452 1 5
                 6 2019   .5858005 1 7
                 6 2020   .3787092 1 8
                 7 2019   .6880603 1 7
                 7 2020   .9794578 1 5
                 8 2019   .6701937 1 5
                 8 2020   .5948808 1 7
                 9 2019   .7970893 1 7
                 9 2020   .7835853 1 7
                10 2019   .6546342 1 9
                10 2020  .09688907 1 8
                11 2019   .6885059 0 5
                11 2020    .872496 0 7
                12 2019  .52963525 0 7
                12 2020   .8302209 0 6
                13 2019   .9339853 0 6
                13 2020   .1749891 0 4
                14 2019   .5536171 0 7
                14 2020   .5346152 0 6
                15 2019   .7767794 0 9
                15 2020   .1288747 0 4
                16 2019  .27751842 0 8
                16 2020   .4242016 0 7
                17 2019  .13590056 0 6
                17 2020   .3325624 0 7
                18 2019   .4675523 0 6
                18 2020  .51608807 0 8
                19 2019  .06694305 0 8
                19 2020  .07229638 0 8
                20 2019   .6817465 0 8
                20 2020  .08804953 0 5
                end
                
                * estimate propensity scores based on size in 2019
                psmatch2 treated size if year==2019
                * copy each firm's 2019 score to all of its years
                bys id (year): gen ps = _pscore[1]
                
                * estimate the effect of treatment in 2020 using the propensity scores estimated from size in 2019
                psmatch2 treated if year==2020, outcome(wacc) pscore(ps)



                • #9
                  Originally posted by Faiza Zafar
                  Please correct me if I am wrong, but I think I do need to run a preliminary logit regression that estimates the propensity score based on industry and year, and then use that score in the psmatch2 command to obtain a set of control firms for a specific industry and year?
                  No, you don't need to do this, though you could. For a different approach, see the heading "Matching within strata" in the psmatch2 help file. You also might try both approaches and compare the results. Currently there is no consensus on the "best" or "correct" methods for matching overall, nor for most specific situations like yours that seek to match exactly on some covariates (industry and year) and not on others.
                  David Radwin
                  Senior Researcher, California Competes
                  californiacompetes.org
                  Pronouns: He/Him
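
                  A minimal sketch of the within-strata approach, patterned on the "Matching within strata" example in the psmatch2 help file. The variable names (treated, INDUSTRY, year, ROA, Size, WACC) come from earlier posts in this thread; the cell and att variables, and the use of capture to skip cells where estimation fails, are assumptions added for illustration:

                  Code:
                  * one stratum per industry-year combination
                  egen cell = group(INDUSTRY year)
                  gen double att = .
                  levelsof cell, local(cells)
                  foreach c of local cells {
                      * estimate the score and match using only the firms in this cell
                      capture psmatch2 treated ROA Size if cell == `c', outcome(WACC) neighbor(1)
                      if _rc == 0 {
                          replace att = r(att) if cell == `c'
                      }
                  }
                  * att holds each cell's ATT on every observation in that cell
                  summarize att
                  Because each firm is compared only with firms in its own cell, the match is exact on industry and year. Cells with very few treated or control firms may fail or give unstable scores, which is why failures are skipped here rather than stopping the loop.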



                  • #10
                    Originally posted by Moomal Khan
                    Please see this extra advice about bumping from the FAQ: https://www.statalist.org/forums/help#adviceextras
                    David Radwin
                    Senior Researcher, California Competes
                    californiacompetes.org
                    Pronouns: He/Him



                    • #11
                      Originally posted by David Radwin

                      No, you don't need to do this, though you could. For a different approach, see the heading "Matching within strata" in the psmatch2 help file. You also might try both approaches and compare the results. Currently there is no consensus on the "best" or "correct" methods for matching overall, nor for most specific situations like yours that seek to match exactly on some covariates (industry and year) and not on others.
                      Hi David Radwin, I have looked through all 20 pages of results for the search 'Propensity scores' to find an answer to my own post, which I didn't, but anyway... (can't bump my post, haha)

                      I wanted to ask you about this statement, made in reply to the Statalist user asking about running a logit regression before psmatch2.

                      All published articles indicate that one should first run a logit regression with treatment as the outcome (dependent) variable and the rest of the covariates as explanatory variables (step one).
                      Why are you saying to skip this step and move on to psmatch2, which uses probit regression and generates its own propensity scores?
                      I know there isn't much evidence of a difference between probit and logit, which isn't my question here.

                      But what is the evidence or reason for recommending skipping the logit step?
                      It would be interesting to get Melissa Garrido's point of view.



                      • #12
                        The only reason you don't need to calculate propensity scores prior to using psmatch2 is that the program already calculates propensity scores (using your choice of probit or logit) by default and then matches based on the propensity scores. So it's not really skipping a step, but rather combining both steps in one command.
                        David Radwin
                        Senior Researcher, California Competes
                        californiacompetes.org
                        Pronouns: He/Him
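
                        A small sketch comparing the one-command and two-command routes on the nlsw88 data used in #4. The variable names ps_onestep, ps_twostep, and ps_diff are made up for illustration. The restriction to non-missing union in the manual logit is an assumption: the probit in #4 reports 1,876 observations, which suggests psmatch2 drops observations with a missing outcome before estimating the score.

                        Code:
                        sysuse nlsw88, clear

                        * one command: psmatch2 estimates the score (logit instead of the default probit) and matches
                        psmatch2 married age grade south, outcome(union) logit
                        gen double ps_onestep = _pscore

                        * two commands: estimate the score by hand, then pass it with pscore()
                        logit married age grade south if !missing(union)
                        predict double ps_twostep if e(sample)
                        psmatch2 married, outcome(union) pscore(ps_twostep)

                        * compare the two sets of scores
                        gen double ps_diff = abs(ps_onestep - ps_twostep)
                        summarize ps_diff
                        If the estimation samples line up as assumed, the differences should be negligible, which is the sense in which the single command combines both steps.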



                        • #13
                          Originally posted by David Radwin
                          The only reason you don't need to calculate propensity scores prior to using psmatch2 is that the program already calculates propensity scores (using your choice of probit or logit) by default and then matches based on the propensity scores. So it's not really skipping a step, but rather combining both steps in one command.
                          OK, so perhaps an uncomfortable question…
                          Why didn't M. Garrido, in the article published here, just recommend using psmatch2 rather than going through the hassle of running logit first? Was it because the article came out before psmatch2?

                          https://pubmed.ncbi.nlm.nih.gov/24779867/

                          Also, is this the psmatch2 code for logit?

                          psmatch2 treatment covariate1 covariate2, pscore(name_your_ps) outcome(outcome) caliper(.1) common logit noreplacement neighbor(1)



                          • #14
                            You may need to read the article more closely. Among other things, it cites psmatch2 with a date of 2003.

                            As to your second question, you have to choose to either use existing propensity scores using the pscore() option or include the covariates to be used to create a new variable with propensity scores. Your example code does both and will yield an error message.
                            David Radwin
                            Senior Researcher, California Competes
                            californiacompetes.org
                            Pronouns: He/Him
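
                            As a sketch of the two alternatives, using the placeholder names from #13 (treatment, covariate1, covariate2, outcome, and name_your_ps are not real variables), the command could be written in either of these ways, but not both at once:

                            Code:
                            * alternative 1: list the covariates and let psmatch2 estimate the score by logit;
                            * the estimated score is stored in the variable _pscore
                            psmatch2 treatment covariate1 covariate2, outcome(outcome) caliper(.1) common logit noreplacement neighbor(1)

                            * alternative 2: supply a score that already exists in name_your_ps and list no covariates;
                            * the logit option is dropped because nothing is estimated
                            psmatch2 treatment, pscore(name_your_ps) outcome(outcome) caliper(.1) common noreplacement neighbor(1)
                            In other words, pscore() reads an existing variable rather than naming a new one.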



                            • #15
                              For anyone who stumbles across this... I'm still looking for a clean solution for exact matching in psmatch2, but I developed a workaround. The idea is to use preserve... restore and iterate through and subset on the exact matching criteria. In my case, this worked with Mahalanobis distance matching and NOT anything using logit/probit, unless you have sufficiently large samples within exact match criteria. Mahalanobis can find matches in small samples. For more: https://www.statalist.org/forums/for...-with-mahapick
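
                              A rough sketch of what such a loop might look like. This is not the code from the linked thread. It assumes the variable names used earlier in this discussion (treated, INDUSTRY, year, ROA, Size, WACC) and that the mahalanobis() option of psmatch2 is used without a propensity score; the cell variable and the matched tempfile are made up for illustration:

                              Code:
                              egen cell = group(INDUSTRY year)
                              tempfile matched
                              local first = 1
                              levelsof cell, local(cells)
                              foreach c of local cells {
                                  preserve
                                  * subset on the exact-matching criteria
                                  keep if cell == `c'
                                  * Mahalanobis-distance matching within the subset; skip cells where it fails
                                  capture psmatch2 treated, mahalanobis(ROA Size) outcome(WACC) neighbor(1)
                                  if _rc == 0 {
                                      * stack this cell's results on top of the cells already saved
                                      if `first' == 0 append using `matched'
                                      save `matched', replace
                                      local first = 0
                                  }
                                  restore
                              }
                              * load the stacked results (assumes at least one cell was matched)
                              use `matched', clear
                              One caveat with this design: the identifiers psmatch2 creates (_id, _n1) are assigned separately in each subset, so they are only meaningful together with cell once the subsets are stacked.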
