  • Rank, track option without "skipping" next number

    Dear Statalist,

    I use Stata 11.2 on a Windows device and have a dataset similar to this one:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 v1 int v2
    "A" 150
    "A" 200
    "A" 100
    "A" 150
    "A" 100
    "B"  80
    "B"  70
    "B"  50
    "B"  80
    "B"  80
    "B"  90
    "B"  40
    "B"  60
    "C" 400
    "C" 100
    "C"  50
    "C"  50
    "C" 400
    "C" 100
    "C" 150
    end
    I want to create a variable that ranks the values of another variable (v2) in descending order, with a separate ranking within each group of observations that share the same value of a third variable (v1):
    Code:
    egen v3 = rank(-v2), by(v1) track
    The crucial point is that I want ties to share the same rank number. However, the next distinct value should receive the immediately following number, without skipping numbers as one would usually do in this case (and as Stata does). After
    Code:
    gsort v1 -v2
    I get this result:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 v1 int v2 float v3
    "A" 200 1
    "A" 150 2
    "A" 150 2
    "A" 100 4
    "A" 100 4
    "B"  90 1
    "B"  80 2
    "B"  80 2
    "B"  80 2
    "B"  70 5
    "B"  60 6
    "B"  50 7
    "B"  40 8
    "C" 400 1
    "C" 400 1
    "C" 150 3
    "C" 100 4
    "C" 100 4
    "C"  50 6
    "C"  50 6
    end

    Instead the observations should look like that:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 v1 int v2 byte v3
    "A" 200 1
    "A" 150 2
    "A" 150 2
    "A" 100 3
    "A" 100 3
    "B"  90 1
    "B"  80 2
    "B"  80 2
    "B"  80 2
    "B"  70 3
    "B"  60 4
    "B"  50 5
    "B"  40 6
    "C" 400 1
    "C" 400 1
    "C" 150 2
    "C" 100 3
    "C" 100 3
    "C"  50 4
    "C"  50 4
    end
    My first approach was to solve the problem with the help of a fourth variable, v4, set to 1 for all observations. Wherever v3 increases relative to the previous observation, v4 should become the previous value of v4 plus 1. Every other observation should take the previous value of v4 if that value is larger and the observation belongs to the same v1 group. Therefore, I wrote this code:
    Code:
    gen v4=1
    replace v4=v4[_n-1]+1 if v3>v3[_n-1] & _n>1
    replace v4=v4[_n-1] if v4<v4[_n-1] & v3[_n]==v3[_n-1] & _n>1
    I hoped that condition 1 would hold for observation 2 (v4[_n-1]+1=2). Once observation 2 had changed, condition 2 would hold for the next observation, so it would become 2 as well. Condition 1 should then be fulfilled for observation 4, so it would become v4[_n-1]+1=3, and so on.

    That didn’t work, since Stata evidently doesn’t perform the two commands one observation at a time; it runs command 1 on all observations and then command 2 on all observations. Because condition 1 depends on previous observations, the results for later observations differ when command 2 has not yet been applied to the preceding ones, which leaves me with these values:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 v1 int v2 float(v3 v4)
    "A" 200 1 1
    "A" 150 2 2
    "A" 150 2 2
    "A" 100 4 2
    "A" 100 4 2
    "B"  90 1 1
    "B"  80 2 2
    "B"  80 2 2
    "B"  80 2 2
    "B"  70 5 2
    "B"  60 6 3
    "B"  50 7 4
    "B"  40 8 5
    "C" 400 1 1
    "C" 400 1 1
    "C" 150 3 2
    "C" 100 4 3
    "C" 100 4 3
    "C"  50 6 2
    "C"  50 6 2
    end
    I tried to solve this problem with the following foreach command:
    Code:
    foreach x in v2{
    replace v4=v4[_n-1]+1 if v3>v3[_n-1] & _n>1
    replace v4=v4[_n-1] if v4<v4[_n-1] & v3[_n]==v3[_n-1] & _n>1
    }
    Unfortunately, the results didn’t change, so I assume that Stata still performs command 1 on all observations before starting command 2.
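
For reference, a minimal sketch of the behaviour described above. Two points matter here. First, foreach iterates over the elements of its list, not over observations, so `foreach x in v2` runs its body exactly once. Second, a single replace already works through the observations in order, so v4[_n-1] refers to the previous observation's updated value. Folding both conditions into one statement is therefore one possible way to get the cascade. The sketch uses only the rows for groups A and C from the example above and is an illustration, not the solution given later in the thread.

```stata
* Sketch only: rows for groups A and C copied from the example above,
* with v3 as produced by egen v3 = rank(-v2), by(v1) track
clear
input str1 v1 int v2 float v3
"A" 200 1
"A" 150 2
"A" 150 2
"A" 100 4
"A" 100 4
"C" 400 1
"C" 400 1
"C" 150 3
"C" 100 4
"C" 100 4
"C"  50 6
"C"  50 6
end

* Start every observation at 1, as in the original attempt
gen v4 = 1

* One statement: replace walks through the observations in order, so
* v4[_n-1] already holds the updated value of the previous observation.
* (v3 > v3[_n-1]) evaluates to 1 when the rank increases and 0 for a tie.
bysort v1 (v3): replace v4 = v4[_n-1] + (v3 > v3[_n-1]) if _n > 1

list, sepby(v1)
```

Under by:, _n counts within each v1 group, so the first observation of every group keeps the value 1 and the count restarts per group.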

    I would like to know how I can tell Stata to perform both commands in immediate succession for one observation before continuing with the next one. Alternatively, I’d also be grateful for hints on any different way to solve the ranking problem.

    I hope I have expressed my issue clearly enough, and I apologize for any mistakes in my English.

    Kind regards

    Ingo

  • #2
    Thanks for your clear question and data examples.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 v1 int v2 byte v3
    "A" 200 1
    "A" 150 2
    "A" 150 2
    "A" 100 3
    "A" 100 3
    "B"  90 1
    "B"  80 2
    "B"  80 2
    "B"  80 2
    "B"  70 3
    "B"  60 4
    "B"  50 5
    "B"  40 6
    "C" 400 1
    "C" 400 1
    "C" 150 2
    "C" 100 3
    "C" 100 3
    "C"  50 4
    "C"  50 4
    end
    
    bysort v1 : gen wanted = sum(v2 != v2[_n-1])
    
    assert wanted == v3
    
    list, sepby(v1)  
    
         +------------------------+
         | v1    v2   v3   wanted |
         |------------------------|
      1. |  A   200    1        1 |
      2. |  A   150    2        2 |
      3. |  A   150    2        2 |
      4. |  A   100    3        3 |
      5. |  A   100    3        3 |
         |------------------------|
      6. |  B    90    1        1 |
      7. |  B    80    2        2 |
      8. |  B    80    2        2 |
      9. |  B    80    2        2 |
     10. |  B    70    3        3 |
     11. |  B    60    4        4 |
     12. |  B    50    5        5 |
     13. |  B    40    6        6 |
         |------------------------|
     14. |  C   400    1        1 |
     15. |  C   400    1        1 |
     16. |  C   150    2        2 |
     17. |  C   100    3        3 |
     18. |  C   100    3        3 |
     19. |  C    50    4        4 |
     20. |  C    50    4        4 |
         +------------------------+

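The one-liner in #2 relies on the data already being in descending order of v2 within each v1 group. Under by:, v2[_n-1] is missing for the first observation of a group, so the comparison is true there and the running sum starts at 1; it then increases by 1 at every change of value. A minimal sketch for data that are not yet sorted, using a subset of the rows from #1, might look as follows. The helper variable negv2 is an assumption introduced only for this illustration, so that sorting it in ascending order puts v2 in descending order.

```stata
* Sketch only: unsorted rows for groups A and C copied from #1
clear
input str1 v1 int v2
"A" 150
"A" 200
"A" 100
"A" 150
"A" 100
"C" 400
"C" 100
"C"  50
"C"  50
"C" 400
"C" 100
"C" 150
end

* Hypothetical helper: negated v2, so ascending sort = descending v2
gen negv2 = -v2

* Sort within v1 by negv2, then count changes of value with a running sum
bysort v1 (negv2): gen wanted = sum(negv2 != negv2[_n-1])

list v1 v2 wanted, sepby(v1)
```

Missing values of v2 are not handled in this sketch; they would sort last within each group and need separate treatment.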

    Previous discussions include

    https://www.stata.com/statalist/arch.../msg00596.html

    https://stackoverflow.com/questions/...roups-in-stata
    Last edited by Nick Cox; 08 Jun 2018, 01:55.

    • #3
      Great, that works perfectly well. Thanks a lot!
