Amending tied ranks using external data

Robert de Vries

Join Date: Sep 2022
Posts: 8

01 Mar 2024, 05:42

I have a dataset which ranks occupations by a given characteristic. In this list of occupations, some are tied on this characteristic and therefore have the same rank position. However, I have external data that allows me to break these ties for each pair of occupations within a tied rank.

The data currently look like this:

occupation	rank	pair	win
Company CEO	1	Medical doctorCompany CEO	Medical doctor
Medical doctor	1
Company senior executive	2
PR executive	3	PR executivePriest	PR executive
Veterinarian	3	PhysicistVeterinarian	Physicist
Priest	3	PriestPhysicist	Priest
Physicist	3	PR executiveVeterinarian	PR executive

The occupation and rank variables should be self-explanatory. 'pair' is a concatenated string variable indicating a 'pair' of occupations that share the same rank. 'win' indicates the winner of the pairwise contrast indicated by 'pair'.

You can see that:
- CEO and medical doctor are tied for rank 1, but the 'pair' and 'win' columns show that when they are matched against each other, 'medical doctor' wins.
- PR executive, veterinarian, priest, and physicist are tied for rank 3, but the pairwise contrasts indicate the order PR executive -> priest -> physicist -> veterinarian (assuming transitivity).

I would like to end up with a dataset that looks like the following - i.e. with the rank ties broken using the data from the pair and win variables:

occupation	rank
Company CEO	2
Medical doctor	1
Company senior executive	3
PR executive	4
Veterinarian	7
Priest	5
Physicist	6

Can anyone help?

Nick Cox

Join Date: Mar 2014

Posts: 35211
#2

01 Mar 2024, 06:52

If occupation is a string variable then define value labels using the mapping you have and follow with encode.

Code:

help label
help encode

If occupation is somehow numeric, then you need to recode.
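
A minimal sketch of the label-then-encode route, assuming the final ordering is already known and typed in by hand. The label name newrank and the variable rank_new are made up for illustration; the three occupations are taken from the example in the first post.

```stata
* Sketch only: the mapping below is typed by hand, so this does not
* derive the ordering, it only applies one that is already known.
clear
input str27 occupation
"Company CEO"
"Medical doctor"
"Company senior executive"
end

* Hypothetical value label holding the desired rank for each occupation.
label define newrank 1 "Medical doctor" 2 "Company CEO" 3 "Company senior executive"

* encode reuses an existing value label, so each string receives the code
* already attached to it; noextend refuses strings missing from the label.
encode occupation, generate(rank_new) label(newrank) noextend

list occupation rank_new, nolabel
```

This only helps when the mapping can be written down in advance. It does not work the mapping out from the pairwise results.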
Robert de Vries

Join Date: Sep 2022

Posts: 8
#3

01 Mar 2024, 07:39

Hi Nick,

Thanks for the quick reply, but I'm afraid I don't follow. If by 'mapping you have' you mean the second table, then this is where I want to end up. I created this manually from a subset of my data to illustrate what I want my dataset to look like after whatever process I need to follow to get there. My real dataset has 100 occupations and several tied ranks, each shared by a different number of occupations.

EDIT: I think that perhaps the misunderstanding comes from the way I've posted my data (as a table). Just to be clear, the first and second tables are supposed to be data. I.e. each row is an observation and each column is a variable.

Last edited by Robert de Vries; 01 Mar 2024, 07:42.
Nick Cox

Join Date: Mar 2014

Posts: 35211
#4

01 Mar 2024, 08:34

In turn it is now evident that I don't yet understand what you're driving at. Sorry.
Robert de Vries

Join Date: Sep 2022

Posts: 8
#5

01 Mar 2024, 10:51

Hi Nick,

Sorry - I think I've not explained myself very well. Perhaps it would help if I went back a few steps - because the most sensible response to what I posted above may well be "well I wouldn't start from here".

My original dataset is the results of a set of pairwise comparisons between a list of 99 occupations. It looks like this (using dataex this time):

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str27(occupation1 occupation2 p1)
"Actor" "Administrative assistant" "Actor"
"Aircraft pilot" "Actor" "Aircraft pilot"
"Architect" "Actor" "Architect"
"Artist" "Actor" "Actor"
"Auto mechanic" "Actor" "Actor"
end

occupation1 and occupation2 are the occupation pair being contrasted. p1 is the winning occupation. Each occupation pair is compared only once.

The final output I want to produce is a ranked list of occupations according to their number of 'wins'. This is pretty straightforward to do, but produces a number of tied occupations. A dataset containing the ranked list of occupations looks like this:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str27 occupation long wins float rank_no_gaps
"Company CEO" 97 1
"Medical doctor" 97 1
"Company senior executive" 96 2
"Aircraft pilot" 92 3
"Lawyer" 92 3
"Elected official" 91 4
"Police captain" 91 4
end

You can see that Company CEO and Medical doctor are tied for rank 1, and aircraft pilot and lawyer are tied for rank 3.
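
For reference, the win count and gap-free rank could be computed from the pairwise data roughly as follows. This is a sketch run on the five-row excerpt above, so the counts are partial and purely illustrative; pair_id and won are helper names invented here.

```stata
* Sketch only: five-row excerpt of the pairwise data, not the full set.
clear
input str27(occupation1 occupation2 p1)
"Actor" "Administrative assistant" "Actor"
"Aircraft pilot" "Actor" "Aircraft pilot"
"Architect" "Actor" "Architect"
"Artist" "Actor" "Actor"
"Auto mechanic" "Actor" "Actor"
end

* Stack the data so each comparison appears once per participant;
* this keeps occupations that never win in the result.
gen long pair_id = _n
expand 2
bysort pair_id: gen str27 occupation = cond(_n == 1, occupation1, occupation2)

* Flag the rows where this participant won, then total by occupation.
gen byte won = (p1 == occupation)
collapse (sum) wins = won, by(occupation)

* Gap-free (dense) rank: most wins first, and the rank rises by one
* each time the win count changes.
gsort -wins occupation
gen rank_no_gaps = sum(wins != wins[_n-1])

list, sepby(rank_no_gaps)
```

On this excerpt Actor has 3 wins and rank 1, while Aircraft pilot and Architect tie on 1 win each at rank 2.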

I would like to resolve these ties (where possible) using data from the pairwise contrasts. So, for example, I can see from the original data that when company CEO and medical doctor are compared, medical doctor wins. I would therefore like to generate a new ranking in which medical doctor is rank 1, company CEO is rank 2, and then the rest of the ranking proceeds from there.
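
One way this could be approached directly from the two datasets above is to count each occupation's wins against the others sharing its rank, then re-rank on that count within the tied rank. The sketch below assumes the variable names shown in the dataex listings; rank1, rank2, tie_wins and rank_new are invented here, and the third comparison row is included only to show a cross-rank pair being discarded.

```stata
* Sketch only: a small subset of the ranked list from the post.
clear
input str27 occupation float rank_no_gaps
"Company CEO" 1
"Medical doctor" 1
"Company senior executive" 2
"Aircraft pilot" 3
"Lawyer" 3
end
tempfile ranks
save `ranks'

* Pairwise results; the last row is an illustrative cross-rank pair.
clear
input str27(occupation1 occupation2 p1)
"Medical doctor" "Company CEO" "Medical doctor"
"Aircraft pilot" "Lawyer" "Aircraft pilot"
"Company CEO" "Lawyer" "Company CEO"
end

* Attach the current rank of each side of the comparison.
rename occupation1 occupation
merge m:1 occupation using `ranks', keep(match) nogenerate keepusing(rank_no_gaps)
rename (occupation rank_no_gaps) (occupation1 rank1)
rename occupation2 occupation
merge m:1 occupation using `ranks', keep(match) nogenerate keepusing(rank_no_gaps)
rename (occupation rank_no_gaps) (occupation2 rank2)

* Keep only comparisons between occupations tied on rank, and count
* how many of those each winner took.
keep if rank1 == rank2
rename p1 occupation
collapse (count) tie_wins = rank1, by(occupation)

* Bring back every occupation; those with no within-tie wins get 0.
merge 1:1 occupation using `ranks', nogenerate
replace tie_wins = 0 if missing(tie_wins)

* New rank: old rank first, then within-tie wins (more is better).
gsort rank_no_gaps -tie_wins occupation
gen rank_new = sum((rank_no_gaps != rank_no_gaps[_n-1]) | (tie_wins != tie_wins[_n-1]))

list occupation rank_no_gaps tie_wins rank_new
```

Occupations with the same number of within-tie wins (for example a cycle in which A beats B, B beats C and C beats A) stay tied on rank_new, which matches the "where possible" caveat.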

After going through a whole bunch of (probably unnecessarily convoluted) data manipulation, I have arrived at the following dataset (which is a precursor to the dataset I presented in the original post):

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float rank_no_gaps str27(occupation1 occupation2 occupation3 occupation4) str54 pair1 str27 wp1 str54 pair2 str27 wp2 str54 pair3 str27 wp3 str54 pair4 str27 wp4 str54 pair5 str27 wp5 str54 pair6 str27 wp6
1 "Medical doctor" "Company CEO" "" "" "Medical doctorCompany CEO" "Medical doctor" "" "" "" "" "" "" "" "" "" ""
2 "Company senior executive" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
3 "Aircraft pilot" "Lawyer" "" "" "Aircraft pilotLawyer" "Aircraft pilot" "" "" "" "" "" "" "" "" "" ""
end

This is a dataset that is wide on rank position. So within each rank position there is:
- A variable for each occupation that shares that rank
- A variable identifying each pairwise comparison within that rank (which comprises the concatenated strings of the two contrasted occupations) ('pairx'). There are 6 possible pair variables because up to 4 occupations can share a rank.
- a variable indicating the winner of the corresponding pairwise comparison ('wpx')

So for example, the first row of data shows that:
- rank 1 is shared by medical doctor and company ceo
- pair1 concatenates these strings (pair2+ are empty because there are only two occupations to compare at this rank)
- wp1 indicates that medical doctor wins against CEO in this comparison.

Starting from here, I would like to end up with a dataset that gives a newly ranked list of occupations, incorporating the information on how occupations are ranked within ranks.
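
A possible route from this wide layout, sketched under the assumption that the variables are named exactly as in the listing above: reshape to one row per occupation, count how often each occupation appears among the wp variables of its own rank, and re-rank on that count. slot, tie_wins and rank_new are names invented for the sketch.

```stata
* Sketch only: the three rows from the dataex listing above.
clear
input float rank_no_gaps str27(occupation1 occupation2 occupation3 occupation4) str54 pair1 str27 wp1 str54 pair2 str27 wp2 str54 pair3 str27 wp3 str54 pair4 str27 wp4 str54 pair5 str27 wp5 str54 pair6 str27 wp6
1 "Medical doctor" "Company CEO" "" "" "Medical doctorCompany CEO" "Medical doctor" "" "" "" "" "" "" "" "" "" ""
2 "Company senior executive" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
3 "Aircraft pilot" "Lawyer" "" "" "Aircraft pilotLawyer" "Aircraft pilot" "" "" "" "" "" "" "" "" "" ""
end

* One row per occupation; pair* and wp* are carried along unchanged.
reshape long occupation, i(rank_no_gaps) j(slot)
drop if occupation == ""

* Count this occupation's wins among the comparisons within its rank.
gen byte tie_wins = 0
forvalues k = 1/6 {
    replace tie_wins = tie_wins + (wp`k' == occupation)
}

* Order by old rank, then by within-tie wins, and number the result;
* occupations equal on both keep the same new rank.
gsort rank_no_gaps -tie_wins occupation
gen rank_new = sum((rank_no_gaps != rank_no_gaps[_n-1]) | (tie_wins != tie_wins[_n-1]))

keep occupation rank_no_gaps tie_wins rank_new
list
```

If the contrasts within a four-way tie are fully transitive, the within-tie win counts come out as 3, 2, 1 and 0, so the tie is broken completely. Any remaining equal counts indicate a cycle that the pairwise data cannot resolve.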

I hope this makes more sense!
Nick Cox

Join Date: Mar 2014

Posts: 35211
#6

02 Mar 2024, 05:38

Thanks for this extra explanation, which was a lot of work. Sorry to disappoint again, but that's a messier problem than I am willing to try to engage with.

Completely open to all people here as always...
Robert de Vries

Join Date: Sep 2022

Posts: 8
#7

04 Mar 2024, 03:15

No worries at all Nick. Hopefully someone else here has some insights, but I recognise that it is quite an open question so is a lot to ask!