Comparing all possible pairs and ranking them

Kumar Biswas

Join Date: Nov 2017
Posts: 1

26 Jun 2018, 08:34

Dear all,

I am working on a data set of social safety net benefits for the elderly. Here, observation type (obs_type) '1' refers to 'elderly getting benefit (beneficiary)' and '0' refers to 'elderly not getting benefit (non-beneficiary)'.

I am trying to find all possible pairs in which a better-off elderly person is getting the benefit even though there is a worse-off non-beneficiary in the same district. For example, the test in district 101 is whether there are comparison pairs in which the non-beneficiary is worse off than the beneficiary. A non-beneficiary is considered 'worse off' than a beneficiary when the non-beneficiary is worse off on at least three of the five criteria below. Also, I have more observations of non-beneficiaries than of beneficiaries.

Criterion 1: age – an older person will get priority
Criterion 2: income – an elderly person with lower income will get priority
Criterion 3: land_size – an elderly person with a smaller land size will get priority
Criterion 4: family_status – an elderly person living alone will get priority (can be converted into a binary variable)
Criterion 5: health_status – an elderly person who cannot work will get priority (can be converted into a binary variable)

I'm thinking of creating two new variables: 1) 'paired_id', holding the id of each non-beneficiary found to be 'worse off' than the beneficiary it is compared with (across all possible pairs), and 2) 'rank', ranking all such pairs according to the number of criteria on which the non-beneficiary is worse off. If no worse-off non-beneficiary is found, the cell will remain empty. How do I create these new variables?

Please find a sample dataset attached. I am using Stata 15.
Thank you,
Kumar

id	district	obs_type	age	income	land_size	family_status	health_status
1	101	1	55	2000	100	lives alone	can work
2	101	1	64	3000	120	lives with family	cannot work
3	101	0	34	3000	50	lives with family	can work
4	101	0	78	2300	40	lives alone	can work
5	101	0	62	1500	110	lives alone	cannot work
6	101	0	68	800	80	lives with family	cannot work
7	201	1	56	500	80	lives alone	can work
8	201	1	54	3000	120	lives with family	cannot work
9	201	0	87	4500	70	lives with family	can work
10	201	0	88	3800	60	lives alone	can work
11	201	0	72	2300	80	lives alone	cannot work
12	201	0	73	2000	90	lives with family	cannot work

Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

27 Jun 2018, 13:27

You'll increase your chances of a useful answer by following the FAQ on asking questions – provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

This would be relatively easy if you could generate a single index of well-being.

I'm not sure exactly how well your criteria and your multiple comparisons really work out. An elderly person with exceedingly high income is probably better off even if they have little land and live alone and cannot work.

It is hard to see how you want the data to be structured in the end. Are you looking to create a data set in which each pair is an observation? Id 1 will have 5 comparisons, but the number of comparisons will vary with the number of people in each district.

I suppose you could do it brute force with a bunch of loops (one for district, one for first in pair, and one for second in pair).
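
As a rough illustration of that three-loop idea, here is a minimal sketch in Python (not Stata) run on the sample rows from the first post. Several things are assumptions rather than anything stated in the thread: the 0/1 recoding of the two status variables, strict inequalities (ties do not count as worse off), one output row per qualifying pair, and breaking rank ties by the non-beneficiary's id.

```python
from collections import defaultdict, namedtuple

# alone / cannot_work are assumed 0/1 recodings of family_status / health_status.
Person = namedtuple(
    "Person", "id district obs_type age income land_size alone cannot_work")

# Sample rows from the original post.
ROWS = [Person(*r) for r in [
    (1, 101, 1, 55, 2000, 100, 1, 0), (2, 101, 1, 64, 3000, 120, 0, 1),
    (3, 101, 0, 34, 3000, 50, 0, 0), (4, 101, 0, 78, 2300, 40, 1, 0),
    (5, 101, 0, 62, 1500, 110, 1, 1), (6, 101, 0, 68, 800, 80, 0, 1),
    (7, 201, 1, 56, 500, 80, 1, 0), (8, 201, 1, 54, 3000, 120, 0, 1),
    (9, 201, 0, 87, 4500, 70, 0, 0), (10, 201, 0, 88, 3800, 60, 1, 0),
    (11, 201, 0, 72, 2300, 80, 1, 1), (12, 201, 0, 73, 2000, 90, 0, 1),
]]

def count_worse(nb, b):
    """Criteria on which non-beneficiary nb is strictly worse off than b."""
    return ((nb.age > b.age) + (nb.income < b.income)
            + (nb.land_size < b.land_size) + (nb.alone > b.alone)
            + (nb.cannot_work > b.cannot_work))

def worse_off_pairs(rows, threshold=3):
    """Return (id, paired_id, n_worse, rank) for every qualifying pair."""
    by_district = defaultdict(list)
    for p in rows:
        by_district[p.district].append(p)

    result = []
    for district in sorted(by_district):                  # loop 1: district
        people = by_district[district]
        for b in (p for p in people if p.obs_type == 1):  # loop 2: beneficiary
            matches = []
            for nb in (p for p in people if p.obs_type == 0):  # loop 3
                n = count_worse(nb, b)
                if n >= threshold:
                    matches.append((nb.id, n))
            # Most criteria first; ties broken by id (an assumption).
            matches.sort(key=lambda m: (-m[1], m[0]))
            for rank, (paired_id, n) in enumerate(matches, start=1):
                result.append((b.id, paired_id, n, rank))
    return result

for row in worse_off_pairs(ROWS):
    print(row)
```

Under these assumptions, beneficiaries 1, 2, and 8 each pick up at least one worse-off non-beneficiary, while beneficiary 7 produces no rows, which corresponds to the "cell will remain empty" case in the question. The same pair-per-row layout is what a within-district cross join would give in Stata.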