Psmatch - low match even for one variable

Lexi Mastalerz

Join Date: Jun 2023

Posts: 22
#1

03 Jul 2024, 11:56

Hello,

I am having trouble getting a decent match with psmatch2. My full sample has over 1 million observations, yet even when I limit the covariates to sex alone (male/female), I get only 170 matches. The variable tabulates cleanly:

 'msgender' |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |    863,931       80.33       80.33
          1 |    211,500       19.67      100.00
------------+-----------------------------------
      Total |  1,075,431      100.00

The same happens with age: when I keep only age and treatment, I get very few matches. So the issue is not with the variable... I think. My code for matching on sex alone is:

psmatch2 treatment i.sex
psgraph
pstest i.sex

sum i.sex if treatment ==1 [aw =_weight]
sum i.sex if treatment ==0 [aw =_weight]

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte sex float(mar tax_year age treatment)
0  1 2011 60 1
1  1 2010 41 1
1  1 2011 42 1
0  1 2010 52 1
0 .a 2011 53 1
0  1 2010 57 1
0  1 2011 58 1
1  1 2011 34 1
1  1 2010 38 1
1  1 2011 39 1
0  1 2010 34 1
0  1 2011 35 1
0  2 2010 66 1
0  2 2011 67 1
0  2 2010 62 1
0  2 2011 63 1
0  1 2010 36 1
0  2 2010 21 1
0  2 2010 64 1
0  2 2011 65 1
0  1 2010 61 0
0  1 2011 62 0
0  3 2010 48 0
0  3 2011 49 0
0  2 2010 30 0
0  2 2011 31 0
0  1 2010 34 0
0  1 2011 35 0
0  1 2010 45 0
0  1 2011 46 0
end
label values mar marLbl
label def marLbl 1 "Couple", modify
label def marLbl 2 "Single", modify
label def marLbl 3 "Wid_Div_Sep", modify
label def marLbl .a "Missing/Invalid", modify

------------------ copy up to and including the previous line ------------------

Please help!
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

03 Jul 2024, 12:14

In your example data, sex == 1 always co-occurs with treatment == 1, so the attempt to build a propensity model of treatment based on sex fails: sex being a perfect predictor, it is omitted from the propensity model.

If this is also true in your full data set, then that is, if not the sole cause of your problem, certainly a contributory problem because it means that no propensity score can be calculated for sex == 1 observations, so they are not candidates to match anything. If it is not also true in your full data set, then you need to post a better data example, one that reflects the general distribution of these variables in the full data set, and that also reproduces the problem you are having.
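
A minimal sketch of how one might check for this, not the original poster's code. The cell counts are tallied from the -dataex- example in #1, and the variable n is a helper introduced only for this illustration. Since -psmatch2- fits a probit model by default, running -probit- directly should show whether sex is being dropped.

```stata
* Rebuild the sex-by-treatment cells of the posted example from their counts
* (n is a hypothetical helper variable holding each cell's frequency)
clear
input byte(sex treatment) int n
0 0 10
0 1 15
1 1 5
end
expand n

* Look for an empty cell in the cross-tabulation
tabulate sex treatment

* Count untreated observations with sex == 1; zero means perfect prediction
count if sex == 1 & treatment == 0

* Fit the propensity model directly and read the notes above the output
probit treatment i.sex
```

If the count is zero and -probit- reports that 1.sex is omitted, the sex == 1 observations get no propensity score and cannot be matched. Running the same three commands on the full data set would show whether the problem carries over.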
Lexi Mastalerz

Join Date: Jun 2023

Posts: 22
#3

03 Jul 2024, 13:33

Hello,

I wish that was it. Here is the tabulation of sex across treatment:

           |       treatment
'msgender' |         0          1 |     Total
-----------+----------------------+----------
         0 |   362,138    501,793 |   863,931
         1 |    91,484    120,016 |   211,500
-----------+----------------------+----------
     Total |   453,622    621,809 | 1,075,431
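
The treated share within each sex can be read off these counts with the immediate form of -tabulate-. This is only a sketch using the four cell counts above; the row percentages come out at roughly 58% treated for msgender 0 and 57% for msgender 1, so neither category predicts treatment perfectly in the full data.

```stata
* Rows are msgender 0 and 1; columns are treatment 0 and 1
tabi 362138 501793 \ 91484 120016, row
```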
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

03 Jul 2024, 13:36

So, for troubleshooting, you need to post a better example of your data: one that reflects the variables' distributions in the data set as a whole and reproduces the problem you get when you apply -psmatch2-.