  • Combine nnmatch and psmatch2 function

    Hello,

    I'm looking to perform matching where certain variables are matched exactly (year and industry code), while others are matched based on the closest distance or smallest variance (sales and ebit margin). Next, I want to identify, by unique ID, the control group observations matched to each treatment group observation. Although `psmatch2` makes these IDs immediately available after matching, it lacks a feature for exact matching. Conversely, `nnmatch` supports exact matching but doesn't readily display the IDs. Is there any solution to this issue? Thank you very much.

  • #2
    There is some fairly general code at #4 in https://www.statalist.org/forums/for...opensity-score for matching exactly on some variables and nearest neighbor matching on another variable. (In the example there, the nearest neighbor match is on a propensity score, but the same code, except for computing the propensity score itself, would work for any variable.)

    What you are proposing to do, however, is ill-defined. You cannot do nearest neighbor matching on two or more variables as such: what would you do if the nearest neighbor on one of the variables and the nearest neighbor on the other are different observations? There are ways of handling this. One is to prioritize nearest neighbor matching on one of the variables, and then, if there is more than one nearest neighbor for that variable, select among them the nearest neighbor on the other. Another is to combine the two variables in some way, perhaps a weighted average of some kind, and do a nearest neighbor match on that. In any case, without some way of either imposing a priority on the variables or combining them, you can't go that route.
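
    To make the ambiguity concrete, here is a minimal sketch with made-up numbers (the two controls, the treated firm's values of 100 and 10, and the equal weights are all assumptions for illustration, not taken from any real data):

    ```stata
    * Toy numbers, invented for illustration (not taken from the thread's data)
    clear
    input byte ctrl_id double(sales ebit)
    1 110 30
    2 400 11
    end

    * Distances from one hypothetical treated firm with sales = 100 and ebit = 10
    gen delta_sales = abs(sales - 100)
    gen delta_ebit  = abs(ebit - 10)

    * The two variables disagree about which control is nearest
    sort delta_sales
    display "nearest on sales: control " ctrl_id[1]
    sort delta_ebit
    display "nearest on ebit:  control " ctrl_id[1]

    * Resolution 1: sales takes priority, ebit only breaks ties on sales
    sort delta_sales delta_ebit
    display "sales first, then ebit: control " ctrl_id[1]

    * Resolution 2: one combined distance (equal weights are an arbitrary choice)
    gen combined = 0.5*delta_sales + 0.5*delta_ebit
    sort combined
    display "combined distance: control " ctrl_id[1]
    ```

    Control 1 is nearest on sales and control 2 is nearest on ebit, so some rule is needed to choose between them. Note that in a combined distance the weights also have to absorb any difference in the units of the two variables.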

    If you resolve that problem, and if you do not see how to adapt the code in the linked post for your purpose, do post back. But when you do that, be sure to post example data from your dataset using the -dataex- command. Be sure to choose example data that includes some matches.

    If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
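
    As a minimal sketch of typical usage (the auto dataset shipped with Stata is only a stand-in here, and the variable list in the last comment assumes your variables are named as described in your post):

    ```stata
    * Load a dataset that ships with Stata, purely as a stand-in for your own data
    sysuse auto, clear

    * Print the first 10 observations of three variables as pasteable -input- code
    dataex make price mpg in 1/10

    * With your own data loaded, the analogous call might look like:
    * dataex id treatment year sic sales ebit in 1/50
    ```

    Copy everything -dataex- prints in the Results window, including the Code delimiters, and paste it into your post.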


    • #3
      Hi, thank you for your reply. I tried the general code as suggested and it works quite well. However, I do see your point about the matching being ill-defined. How should I modify the general code at #4 in https://www.statalist.org/forums/for...opensity-score if I want to prioritize sales over ebit, or if I want to weight sales at 80% and ebit at 20%? Thank you very much.

      Example data:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int id byte treatment str10 identifier str30 name int(year sic) double(sales ebit)
        1 0 "IBM.N"      "York International Corp"        1985 3585            45937            24.4487
        2 0 "ABE^F95"    "Abex Inc"                       1991 3728            824.4            6.44105
        3 0 "ATR"        "AptarGroup Inc"                 1992 3089           314.31            8.90681
        4 0 "AZN.ST"     "Zeneca"                         1992 2834 6928.31890284917           75.28633
        5 0 "GHC^I93"    "Galen Health Care Inc"          1992 8062             4000                 15
        6 0 "PZM.N^A00"  "Pittston Minerals Group"        1992 1241            582.1            1.24309
        7 0 "EMN"        "Eastman Chemical Co"            1993 2821             3614           14.77587
        8 0 "AVL^I06"    "Aviall Inc"                     1992 3769         1209.766            7.73133
        9 0 "MICM.O^F96" "MICOM Communications Corp"      1993 3669           61.245            4.99469
       10 0 "RAH^A13"    "Ralcorp Holdings Inc"           1993 2048            870.6            7.29382
       11 0 "ALB"        "Albemarle Corp"                 1993 2869          790.988           10.10521
       12 0 "DCR.N^D00"  "Duff & Phelps Credit Rating Co" 1993 7323           24.697           11.54796
       13 0 "NVAX.O"     "Novavax Inc"                    1994 2836              .58         -754.48276
       14 0 "STRT.O"     "Strattec Security Corp"         1994 3714           82.005            2.21206
       15 0 "DRI"        "Darden Restaurants Inc"         1994 5812         2737.044            7.57368
       16 0 "HDTC.O^B98" "Healthdyne Technologies Inc"    1994 3841           68.598           13.41439
       17 0 "PPG"        "US Industries Inc"              1994 3431           5753.9           12.45416
       18 0 "AAI^E11"    "Airways Corp(AirTran Corp)"     1994 4512            5.811          -14.91998
       19 0 "CUL^F98"    "Culligan Water Technologies"    1994 3589          264.073            2.31072
       20 0 "UPR.N^G00"  "Union Pacific Resources Group"  1995 1311           1332.9           26.35607
       21 0 "ENDO.O^H09" "Endocare Inc"                   1995 3845            3.177          -16.17879
       22 0 "DGX"        "Quest Diagnostics Inc"          1995 8071         1633.699           12.81313
       23 0 "ADIC.O^H06" "Advanced Digital Information"   1995 3572           20.083            2.74361
       24 0 "DEL^B18"    "Deltic Timber Corp"             1995 2421           92.457           30.68994
       25 0 "MRC.N^K97"  "Monterey Resources Inc"         1996 1311            218.7           34.79652
       26 0 "GLH.L^D07"  "Gallaher Ltd"                   1996 2111           1416.7           24.22531
       27 0 "MWY.N^B09"  "Midway Games Inc"               1997 7372          245.423           16.49968
       28 0 "ECM^K98"    "Emerging Communications Inc"    1996 4813           195.67           23.75275
       29 0 "VL^A01"     "Vlasic Foods International Inc" 1997 2038         1498.967            8.41726
       30 0 "PME.N^F03"  "Penton Publishing"              1997 2721          188.557            9.81083
       31 0 "DRTE.O^E07" "IMS International Inc"          1997 8732           76.883            3.12163
       32 0 "CVG^J18"    "Convergys Corp"                 1997 7373            842.4           14.74359
       33 0 "LTC"        "LTC Healthcare Inc"             1997 8099            54.93           80.17295
       34 0 "CNXT.O^D11" "Conexant Systems Inc"           1997 3674         1412.325           14.06822
       35 0 "ARJ^J11"    "Arch Chemicals Inc"             1998 2819            929.9            8.49554
       36 0 "PHCC.O^J05" "Priority Healthcare Corp"       1998 5912          158.247            4.87908
       37 0 "HIFN.O^D09" "Hi/fn Inc"                      1997 7372           14.226           21.45368
       38 0 "ABI^K08"    "Celera Genomics Corp"           1998 8731          767.465           13.07187
       39 0 "EVRC.PK"    "Evercel Inc"                    1998 3691             .436         -198.85321
       40 0 "HDD^D01"    "Quantum HDD(Quantum Corp)"      1998 3572         4615.435           -1.61192
       41 0 "A"          "Agilent Technologies Inc"       1999 3829             7952            5.55835
       42 0 "LR^A01"     "Lanier Worldwide Inc"           1998 7629         1199.885            9.22622
       43 0 "SNC^I00"    "Circle.com(Snyder Comm Inc)"    1998 7389          403.072            8.88228
       44 0 "GRP^D08"    "Grant PrideCo Inc"              1999 3494          646.454           18.56698
       45 0 "LONN.S"     "Lonza AG"                       1998 2836             9020           11.96231
       46 0 "GTIV.O^B15" "Gentiva Health Services Inc"    1999 8082         1433.854            4.20768
       47 0 "USNU.PK"    "US Neurosurgical Inc"           1998 8093             1.83           21.96721
       48 0 "RETK.O^D05" "Retek Inc"                      1999 7372           55.033           18.65608
       49 0 "DHA^H01"    "Duck Head Apparel Co"           1999 2325           83.953           -1.49012
       50 0 "EFD^I07"    "eFunds Corp"                    1999 7374           267.52           -9.94729
       51 0 "CUK.TH"     "P&O Princess Cruises PLC"       1999 4481           1852.4           18.52732
       52 0 "AV^J07"     "Avaya Inc"                      1999 3661             7754            7.93139
       53 0 "KME^G02"    "Key3Media Group Inc"            1999 7389          269.135           28.14573
       54 0 "SYD^E06"    "Sybron Dental Specialties Inc"  1999 3843          392.249           24.39726
       55 0 "CTAL.F^L00" "Catalytica Pharmaceuticals Inc" 1999 2819          406.459            8.91701
       56 0 "PRWK.O^J03" "PracticeWorks Inc"              2000 7372           54.591           16.12354
       57 1 "VAS^F07"    "Viasys Healthcare Inc"          2000 3845          358.553           14.17085
       58 1 "FLO"        "Flowers Foods Inc"              2000 2053         1511.386            2.70785
       59 1 "COL^K18"    "Rockwell Collins Inc"           2000 3812             2438           17.47334
       60 1 "TEU^L05"    "CP Ships Ltd"                   2000 4412             1878            4.89883
       61 1 "SVIK.ST"    "Studsvik AB"                    2000 3829          773.281            7.29011
       62 1 "ZBH"        "Zimmer Holdings Inc"            2000 3841              939           55.05857
       63 1 "KSL^G05"    "Kaneb Services LLC"             2000 1799          370.326           17.52807
       64 1 "OOM^C05"    "mmO2 PLC"                       2000 4812             2836           10.04937
       65 1 "FHR.TO^E06" "Fairmont Hotels & Resorts Inc"  2000 7011            490.7           31.07805
       66 1 "FTI^A17"    "FMC Technologies Inc"           2000 5084           1953.1              6.098
       67 1 "GPRO.O^H12" "Gen-Probe Inc"                  2001 2835          119.541            -1.6396
       68 1 "BHS^D11"    "Brookfield Homes Corp"          2002 1522          669.735           11.87947
       69 1 "MPAC.O^J13" "MOD PAC CORP"                   2002 2652           26.464            14.0266
       70 1 "PSRC.O^K05" "PalmSource Inc"                 2002 3571            44.95         -105.97998
       71 1 "CVCO.O"     "Cavco Industries Inc"           2002 2451           95.728            3.93197
       72 1 "MHS^D12"    "Medco Health Solutions Inc"     2002 5912          29070.6             2.1348
       73 1 "BIOV.O^G07" "BioVeris Corp"                  2003 2836           13.243         -288.18244
       74 1 "HSP^I15"    "Hospira Inc"                    2003 3841          2602.55           14.53179
       75 1 "CBMX.O^K17" "CombiMatrix Corp"               2006 2836            8.033         -166.71231
       76 1 "PM"         "Philip Morris Intl Inc"         2007 2111            20769           38.75488
       77 1 "SNI.O^C18"  "Scripps Networks"               2007 7375         1323.469           35.36071
       78 1 "CXS.F^G11"  "Autogen Research Ltd"           2006 8731            2.297         -451.41489
       79 1 "JBT"        "John Bean Technologies Corp"    2007 3823            844.3            5.96944
       80 1 "HSN.AX"     "HSN"                            2007 5961           49.482           -2.52415
       81 0 "007025"     "MARGAUX INC"                    1985 3585           14.722 -52.01738894171988
       82 0 "009218"     "ROHR INC"                       1991 3728         1385.086  7.261498563988085
       83 0 "012284"     "INTL CONTAINER SYSTEMS INC"     1992 3089             8.63  8.157589803012744
       84 0 "243869"     "TONGHUA DONGBAO PHARMA CO"      1992 2834           97.362 61.374047369610324
       85 0 "001472"     "AMER HEALTHCARE MGMT"           1992 8062          313.197  8.032324702982468
       86 0 "004061"     "DOW CORNING CORP"               1993 2821           2043.7 11.542790037676763
       87 0 "003892"     "DETECTION SYSTEMS INC"          1993 3669           31.355  4.917875936852177
       88 0 "004423"     "EQUIFAX INC"                    1993 7323         1217.217 13.758187734808173
       89 0 "028937"     "EPOCH BIOSCIENCES INC"          1994 2836            1.429 -781.2456263121062
       90 0 "210322"     "BBS KRAFTFAHRZEUGTECHNIK AG"    1994 3714          168.213 2.4914840113427617
       91 0 "012533"     "EATERIES INC"                   1994 5812           39.136 1.9496116107931314
       92 0 "101157"     "RADIOMETER AS"                  1994 3841         1662.043  16.20120538397623
       93 0 "029057"     "ZARGON OIL & GAS LTD"           1995 1311             6.95 23.496402877697843
       94 0 "024853"     "SPECTRANETICS CORP"             1995 3845           17.282 -16.19604212475408
       95 0 "024691"     "HEALTHCARE INTEGRATED SVCS"     1995 8071            9.249  9.363174397232132
       96 0 "016729"     "MIDGARDXXI INC"                 1995 3572          374.147 3.2618195522080895
       97 0 "062359"     "SINO-FOREST CORP"               1995 2421           37.448 31.878872035889767
       98 0 "007137"     "MAYNARD OIL CO"                 1996 1311           30.583    34.355687800412
       99 0 "208201"     "SOUZA CRUZ SA"                  1996 2111           1781.9  24.30551658342219
      100 0 "216091"     "SER SYSTEMS AG"                 1997 7372           54.766 16.119490194646314
      end
      Last edited by Huong Duong; 21 Feb 2024, 10:42.


      • #4
        So something like this:
        Code:
        isid id
        
        //    MATCHING
        local match_vars year sic
        ds treatment `match_vars', not
        local non_match_vars `r(varlist)'
        
        preserve
        keep if !treatment
        rename (`non_match_vars') =_ctrl
        drop treatment
        tempfile controls
        save `controls'
        
        restore
        keep if treatment
        rename (`non_match_vars') =_case
        drop treatment
        joinby `match_vars' using `controls'
        
        //    NOW SELECT CLOSEST MATCH ON SALES AND EBIT, BREAKING TIES AT RANDOM BUT REPRODUCIBLY
        gen delta_ebit = abs(ebit_case - ebit_ctrl)
        gen delta_sales = abs(sales_case - sales_ctrl)
        
        set seed 1234 // OR WHATEVER RANDOM NUMBER SEED YOU LIKE
        gen double shuffle = runiform()
        
        //    TO PRIORITIZE SALES OVER EBIT
        by id_case (delta_sales delta_ebit shuffle), sort: keep if _n == 1
        
        //    TO USE AN 80-20 WEIGHTED AVERAGE OF SALES AND EBIT
        gen mixture = 0.8*delta_sales + 0.2*delta_ebit
        by id_case (mixture shuffle), sort: keep if _n == 1
        NOTES:

        I assume that the industry variable is sic.

        You cannot both prioritize sales over ebit and use an 80-20 average of sales. You must pick one or the other and delete (or comment out) the code for the other.

        This code is not completely tested. The example data you gave contains only cases (treatment = 1), with no potential matchable controls. Nevertheless, having used this kind of code extensively before, I'm pretty sure I have not introduced any errors in marking it up to meet your specific problem.
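
        For anyone who wants to check the logic before running it on real data, here is a minimal sketch of the sales-priority variant applied to a tiny made-up dataset (the five firms and their values are invented for illustration). It ends by listing the matched IDs, which is what the original question asked for:

        ```stata
        * Toy data (made-up values): two cases and three controls sharing year and sic
        clear
        input int id byte treatment int(year sic) double(sales ebit)
        1 1 2000 3841  900 50
        2 1 2000 2836   10 -5
        3 0 2000 3841 1000 20
        4 0 2000 3841  950 90
        5 0 2000 2836   12 -4
        end
        isid id

        * Exact-match variables; everything else except treatment gets a suffix
        local match_vars year sic
        ds treatment `match_vars', not
        local non_match_vars `r(varlist)'

        * Set the controls aside in a temporary file
        preserve
        keep if !treatment
        rename (`non_match_vars') =_ctrl
        drop treatment
        tempfile controls
        save `controls'
        restore

        * Pair every case with every control in the same year and sic
        keep if treatment
        rename (`non_match_vars') =_case
        drop treatment
        joinby `match_vars' using `controls'

        * Keep the closest control per case: sales first, then ebit, then random
        gen delta_sales = abs(sales_case - sales_ctrl)
        gen delta_ebit  = abs(ebit_case - ebit_ctrl)
        set seed 1234
        gen double shuffle = runiform()
        by id_case (delta_sales delta_ebit shuffle), sort: keep if _n == 1

        * One row per case, showing which control it was matched to
        list id_case id_ctrl year sic sales_case sales_ctrl, noobs
        ```

        In this toy data, case 1 is paired with control 4 (sales 950 is closer to 900 than 1000 is) and case 2 with control 5. Two things to keep in mind: by default -joinby- drops cases that have no control in the same year and sic (its unmatched(master) option would keep them), and nothing here prevents the same control from being matched to more than one case.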


        • #5
          Hi,

          Thank you very much for the code. It works very well. Your help is greatly appreciated!
