  • Comparing all possible pairs and ranking them

    Dear all,

    I am working with a data set on social safety net benefits for the elderly. The observation type (obs_type) is '1' for an elderly person who receives the benefit (a beneficiary) and '0' for one who does not (a non-beneficiary).

    Within each district, I want to find every pair in which a better-off elderly person receives the benefit while a worse-off person does not. For example, the test in district 101 is whether there are comparison pairs in which a non-beneficiary is worse off than a beneficiary. A non-beneficiary is considered 'worse off' than a beneficiary when he or she is worse off on at least three of the criteria below. Note that I have more non-beneficiaries than beneficiaries.

    Criterion 1: age – an older person gets priority
    Criterion 2: income – a person with lower income gets priority
    Criterion 3: land_size – a person with less land gets priority
    Criterion 4: family_status – a person living alone gets priority (can be converted into a binary variable)
    Criterion 5: health_status – a person who cannot work gets priority (can be converted into a binary variable)

    I'm thinking of creating two new variables: 1) 'paired_id', holding the id of each non-beneficiary found to be 'worse off' than the beneficiary it is compared with (over all possible pairs), and 2) 'rank', ranking those pairs by the number of criteria on which the non-beneficiary is worse off. If no worse-off non-beneficiary is found, the cell should remain empty. How do I create these new variables?

    Please find a sample dataset attached. I am using Stata 15.
    Thank you,
    Kumar
    id  district  obs_type  age  income  land_size  family_status      health_status
    1   101       1         55   2000    100        lives alone        can work
    2   101       1         64   3000    120        lives with family  cannot work
    3   101       0         34   3000    50         lives with family  can work
    4   101       0         78   2300    40         lives alone        can work
    5   101       0         62   1500    110        lives alone        cannot work
    6   101       0         68   800     80         lives with family  cannot work
    7   201       1         56   500     80         lives alone        can work
    8   201       1         54   3000    120        lives with family  cannot work
    9   201       0         87   4500    70         lives with family  can work
    10  201       0         88   3800    60         lives alone        can work
    11  201       0         72   2300    80         lives alone        cannot work
    12  201       0         73   2000    90         lives with family  cannot work

  • #2
    You'll increase your chances of a useful answer by following the FAQ on asking questions – provide Stata code in code delimiters, readable Stata output, and sample data using dataex.
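
    For the sample in #1, the dataex call might look roughly like this. It is only a sketch: the variable names are taken from the post, and I am assuming dataex is either already part of your Stata 15 installation or can be installed from SSC.

    ```stata
    * Uncomment the next line only if Stata reports that dataex is not found
    * ssc install dataex

    * Write the listed variables out as a data example ready to paste into a post
    dataex id district obs_type age income land_size family_status health_status
    ```

    Pasting the resulting block between code delimiters lets others recreate the data exactly, including the storage types of family_status and health_status.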

    This would be relatively easy if you could generate a single index of well-being.

    I'm not sure exactly how well your criteria and your multiple comparisons really work out. An elderly person with exceedingly high income is probably better off even if they have little land and live alone and cannot work.

    It is hard to see how you want the data to be structured in the end. Are you looking to create a data set in which each pair is an observation, or something else? Id 1 will have 5 comparisons, but the number of comparisons will vary with the number of people in each district.

    I suppose you could do it by brute force with a bunch of loops (one for district, one for the first in each pair, and one for the second in each pair).
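
    As an alternative to explicit loops, joinby can form every beneficiary/non-beneficiary pair within a district in one step. The following is a minimal sketch rather than a tested solution. It assumes the variables are named as in #1, that family_status and health_status are strings spelled exactly as in the sample, and that 'worse off' means strictly worse on a criterion. The names alone, nowork, n_worse and neg_worse are made up for illustration.

    ```stata
    * Convert the two string criteria to 0/1 indicators (1 = gets priority)
    gen byte alone  = family_status == "lives alone"
    gen byte nowork = health_status == "cannot work"
    drop family_status health_status

    * Save the non-beneficiaries separately, adding an _nb suffix to every
    * variable except district so the names do not clash after pairing
    preserve
    keep if obs_type == 0
    drop obs_type
    foreach v of varlist id age income land_size alone nowork {
        rename `v' `v'_nb
    }
    tempfile nonben
    save `nonben'
    restore

    * Keep the beneficiaries and pair each one with every non-beneficiary
    * in the same district (one observation per pair)
    keep if obs_type == 1
    drop obs_type
    joinby district using `nonben'

    * Count the criteria on which the non-beneficiary is strictly worse off
    gen byte n_worse = (age_nb > age) + (income_nb < income) ///
        + (land_size_nb < land_size) + (alone_nb > alone)    ///
        + (nowork_nb > nowork)

    * Keep pairs meeting the "at least three criteria" rule, then rank them
    * within each beneficiary; ties are broken by the non-beneficiary's id
    keep if n_worse >= 3
    rename id_nb paired_id
    gen neg_worse = -n_worse
    bysort id (neg_worse paired_id): gen rank = _n
    drop neg_worse

    list district id paired_id n_worse rank, sepby(id)
    ```

    The result has one observation per qualifying pair, so a beneficiary with no worse-off non-beneficiary simply does not appear. If such beneficiaries need to be kept with an empty paired_id, the pairs could be merged back onto the original data by id. Note that the number of pairs grows with the product of beneficiaries and non-beneficiaries in each district, which may matter in large data sets.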
