  • Amending tied ranks using external data

    I have a dataset which ranks occupations by a given characteristic. In this list of occupations, some are tied on this characteristic and therefore have the same rank position. However, I have external data that allows me to break these ties for each pair of occupations within a tied rank.

    The data currently look like this:

    occupation                rank  pair                        win
    Company CEO               1     Medical doctorCompany CEO   Medical doctor
    Medical doctor            1
    Company senior executive  2
    PR executive              3     PR executivePriest          PR executive
    Veterinarian              3     PhysicistVeterinarian       Physicist
    Priest                    3     PriestPhysicist             Priest
    Physicist                 3     PR executiveVeterinarian    PR executive

    The occupation and rank variables should be self-explanatory. 'pair' is a concatenated string variable indicating a 'pair' of occupations that share the same rank. 'win' indicates the winner of the pairwise contrast indicated by 'pair'.

    You can see that:
    - CEO and medical doctor are tied for rank 1, but the 'pair' and 'win' columns show that when they are matched against each other, 'medical doctor' wins.
    - PR executive, veterinarian, priest, and physicist are tied for rank 3, but the pairwise contrasts indicate the order PR executive > priest > physicist > veterinarian (assuming transitivity).

    I would like to end up with a dataset that looks like the following - i.e. with the rank ties broken using the data from the pair and win variables:

    occupation                rank
    Company CEO               2
    Medical doctor            1
    Company senior executive  3
    PR executive              4
    Veterinarian              7
    Priest                    5
    Physicist                 6

    Can anyone help?



  • #2
    If occupation is a string variable then define value labels using the mapping you have and follow with encode.

    Code:
    help label 
    help encode
    If occupation is somehow numeric, then you need to recode.





    • #3
      Hi Nick,

      Thanks for the quick reply, but I'm afraid I don't follow. If by 'mapping you have' you mean the second table, then that is where I want to end up. I created it manually from a subset of my data to illustrate what I want my dataset to look like after whatever process I need to follow to get there. My real dataset has 100 occupations, with several tied ranks and a different number of occupations tied at each.

      EDIT: I think that perhaps the misunderstanding comes from the way I've posted my data (as a table). Just to be clear, the first and second tables are supposed to be data. I.e. each row is an observation and each column is a variable.
      Last edited by Robert de Vries; 01 Mar 2024, 07:42.



      • #4
        In turn it is now evident that I don't yet understand what you're driving at. Sorry.



        • #5
          Hi Nick,

          Sorry - I think I've not explained myself very well. Perhaps it would help if I went back a few steps - because the most sensible response to what I posted above may well be "well I wouldn't start from here".

          My original dataset contains the results of a set of pairwise comparisons between a list of 99 occupations. It looks like this (using dataex this time):

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str27(occupation1 occupation2 p1)
          "Actor"          "Administrative assistant" "Actor"         
          "Aircraft pilot" "Actor"                    "Aircraft pilot"
          "Architect"      "Actor"                    "Architect"     
          "Artist"         "Actor"                    "Actor"         
          "Auto mechanic"  "Actor"                    "Actor"         
          end
          Occupations 1 and 2 are the pair being contrasted; p1 is the winning occupation. Each occupation pair is compared only once.

          The final output I want to produce is a ranked list of occupations according to their number of 'wins'. This is pretty straightforward to do, but produces a number of tied occupations. A dataset containing the ranked list of occupations looks like this:


          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str27 occupation long wins float rank_no_gaps
          "Company CEO"              97 1
          "Medical doctor"           97 1
          "Company senior executive" 96 2
          "Aircraft pilot"           92 3
          "Lawyer"                   92 3
          "Elected official"         91 4
          "Police captain"           91 4
          end

          You can see that Company CEO and Medical doctor are tied for rank 1, and aircraft pilot and lawyer are tied for rank 3.

          I would like to resolve these ties (where possible) using data from the pairwise contrasts. So, for example, I can see from the original data that when company CEO and medical doctor are compared, medical doctor wins. I would therefore like to generate a new ranking in which medical doctor is rank 1, company CEO is rank 2, and then the rest of the ranking proceeds from there.
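
          One possible way to do this directly from the pairwise data, offered as a sketch rather than a tested solution: count each occupation's total wins, then count its wins against opponents that have the same total, and sort on both. The toy data and the new variable names (id, side, occupation, opponent, won, opp_wins, tiewin, tiewins, rank) are assumptions made for illustration; only occupation1, occupation2, and p1 come from the data above.

          ```stata
          * Toy data (made up): 4 occupations, every pair compared once
          clear
          input str27(occupation1 occupation2 p1)
          "Occ A" "Occ B" "Occ A"
          "Occ A" "Occ C" "Occ A"
          "Occ A" "Occ D" "Occ D"
          "Occ B" "Occ C" "Occ B"
          "Occ B" "Occ D" "Occ B"
          "Occ C" "Occ D" "Occ C"
          end

          * Two observations per comparison, one for each participant
          gen long id = _n
          expand 2
          bysort id: gen byte side = _n
          gen str27 occupation = cond(side == 1, occupation1, occupation2)
          gen str27 opponent   = cond(side == 1, occupation2, occupation1)
          gen byte won = (p1 == occupation)

          * Total wins per occupation
          bysort occupation: egen long wins = total(won)

          * Opponent's total wins: the other observation of the same comparison
          bysort id (side): gen long opp_wins = wins[3 - _n]

          * Head-to-head wins against opponents tied on total wins
          gen byte tiewin = won & (wins == opp_wins)
          bysort occupation: egen long tiewins = total(tiewin)

          * One observation per occupation
          bysort occupation: keep if _n == 1
          keep occupation wins tiewins

          * Dense rank: more wins first, then more head-to-head wins.
          * Occupations equal on both stay tied.
          gsort -wins -tiewins occupation
          gen long rank = sum(wins != wins[_n-1] | tiewins != tiewins[_n-1])

          list occupation wins tiewins rank, noobs
          ```

          On this toy data the intent is that Occ A and Occ B (two wins each) are separated because A beat B, and likewise Occ C and Occ D. With three or more occupations tied, any that also have equal head-to-head counts (for example in a cycle) remain tied, which is consistent with resolving ties only where possible.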

          After going through a whole bunch of (probably unnecessarily convoluted) data manipulation, I have arrived at the following dataset (which is a precursor to the dataset I presented in the original post):


          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input float rank_no_gaps str27(occupation1 occupation2 occupation3 occupation4) str54 pair1 str27 wp1 str54 pair2 str27 wp2 str54 pair3 str27 wp3 str54 pair4 str27 wp4 str54 pair5 str27 wp5 str54 pair6 str27 wp6
          1 "Medical doctor"           "Company CEO" "" "" "Medical doctorCompany CEO" "Medical doctor" "" "" "" "" "" "" "" "" "" ""
          2 "Company senior executive" ""            "" "" ""                          ""               "" "" "" "" "" "" "" "" "" ""
          3 "Aircraft pilot"           "Lawyer"      "" "" "Aircraft pilotLawyer"      "Aircraft pilot" "" "" "" "" "" "" "" "" "" ""
          end
          This is a dataset that is wide on rank position. So within each rank position there is:
          - A variable for each occupation that shares that rank
          - A variable identifying each pairwise comparison within that rank (which comprises the concatenated strings of the two contrasted occupations) ('pairx'). There are 6 possible pair variables because up to 4 occupations can share a rank.
          - a variable indicating the winner of the corresponding pairwise comparison ('wpx')

          So for example, the first row of data shows that:
          - rank 1 is shared by medical doctor and company ceo
          - pair1 concatenates these strings (pair2+ are empty because there are only two occupations to compare at this rank)
          - wp1 indicates that medical doctor wins against CEO in this comparison.

          Starting from here, I would like to end up with a dataset that gives a newly ranked list of occupations, incorporating the information on how occupations are ranked within ranks.
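
          A rough sketch of one route from this wide layout, under the assumption that counting an occupation's appearances in wp1-wp6 is an acceptable tie-breaker: reshape the occupation variables to long, count within-rank wins, and re-rank. The pair variables are omitted from the example input because the sketch does not use them; slot, tiewins, and rank are names introduced here for illustration.

          ```stata
          * Same three rows as above, without the pair variables
          clear
          input float rank_no_gaps str27(occupation1 occupation2 occupation3 occupation4 wp1 wp2 wp3 wp4 wp5 wp6)
          1 "Medical doctor"           "Company CEO" "" "" "Medical doctor" "" "" "" "" ""
          2 "Company senior executive" ""            "" "" ""               "" "" "" "" ""
          3 "Aircraft pilot"           "Lawyer"      "" "" "Aircraft pilot" "" "" "" "" ""
          end

          * One observation per occupation; wp1-wp6 are carried along unchanged
          reshape long occupation, i(rank_no_gaps) j(slot)
          drop if occupation == ""

          * Number of within-rank comparisons each occupation won
          gen byte tiewins = 0
          forvalues k = 1/6 {
              replace tiewins = tiewins + (wp`k' == occupation)
          }

          * Keep the old rank order, then put within-rank winners first.
          * Occupations with equal within-rank wins stay tied.
          gsort rank_no_gaps -tiewins
          gen long rank = sum(rank_no_gaps != rank_no_gaps[_n-1] | tiewins != tiewins[_n-1])

          keep occupation rank
          list, noobs
          ```

          For these three rows the intended order is Medical doctor, Company CEO, Company senior executive, Aircraft pilot, Lawyer. Where three or four occupations share a rank and the comparisons are not transitive, equal win counts leave them tied.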

          I hope this makes more sense!



          • #6
            Thanks for this extra explanation, which was a lot of work. Sorry to disappoint again, but that's a messier problem than I am willing to try to engage with.

            Completely open to all people here as always...



            • #7
              No worries at all Nick. Hopefully someone else here has some insights, but I recognise that it is quite an open question so is a lot to ask!
