
  • rank observations by value frequency

    Could anyone please suggest a way to rank groups by the frequency of values in another variable?
    In the mock example below, how could I obtain the variable "rankfreq"?

    Code:
    clear
    input str1 to_rankfreq    float rankfreq
    "a" 1
    "a" 1
    "a" 1
    "b" 2
    "b" 2
    "c" 3
    end

  • #2
    There are various small questions here, notably what you want to happen if categories tie on frequency.

    NB I see this as a grouping problem, not a ranking problem in the sense of Stata's rank functions in egen -- they could be used for this problem, but I think there are more direct solutions.

    This works for your data example:


    Code:
    clear
    input str1 to_rankfreq    float rankfreq
    "a" 1
    "a" 1
    "a" 1
    "b" 2
    "b" 2
    "c" 3
    end
    
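    * negative group size, so that more frequent categories sort first
    bysort to_rankfreq: gen negfreq = -_N
    * number the groups in that order; categories tied on frequency are split by category value
    egen wanted = group(negfreq to_rankfreq)
    
    list, sepby(to_rankfreq)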
    
         +----------------------------------------+
         | to_ran~q   rankfreq   negfreq   wanted |
         |----------------------------------------|
      1. |        a          1        -3        1 |
      2. |        a          1        -3        1 |
      3. |        a          1        -3        1 |
         |----------------------------------------|
      4. |        b          2        -2        2 |
      5. |        b          2        -2        2 |
         |----------------------------------------|
      6. |        c          3        -1        3 |
         +----------------------------------------+
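
    On the question of ties: the mock data contain none, so here is a minimal sketch with an extended example in which "b" and "c" both occur twice. The extra observations and the variable names rank_split and rank_tied are my own assumptions for illustration, not part of the original question. Including the category in group() splits ties by category value; omitting it gives tied categories the same identifier.

```stata
clear
input str1 to_rankfreq
"a"
"a"
"a"
"b"
"b"
"c"
"c"
"d"
end

* negative group size, so that more frequent categories sort first
bysort to_rankfreq: gen negfreq = -_N

* ties split alphabetically: a = 1, b = 2, c = 3, d = 4
egen rank_split = group(negfreq to_rankfreq)

* tied categories share an identifier: a = 1, b = 2, c = 2, d = 3
egen rank_tied = group(negfreq)

list, sepby(to_rankfreq)
```

    Which of the two you want depends on what the identifier is for. Note that rank_tied numbers distinct frequencies consecutively, so it is not a ranking with gaps after ties.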
