  • Simple percentiles using wide dataset

    Hello,

    I'm not sure why I am having such trouble creating a variable corresponding to percentiles. I have a dataset that looks like the dataex example below, with a person ID (pid) and a number (adheresum). I am trying to assign a percentile to each ID based on the value of adheresum. I tried pctile pct=adheresum, but this left pct missing for every observation except ID=1. What am I missing here?

    Thank you very much!

    Sarah




    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(pid adheresum)
      1 120
      2  28
      3  44
      4  13
      5 112
      6 145
      7   4
      8  84
      9  68
     10 143
     11  31
     12   4
     13 164
     14  46
     15 136
     16  44
     17  35
     18  15
     19  87
     20 140
     21 157
     22 158
     23   5
     24 162
     25  88
     26  18
     27  93
     28  45
     29  11
     30  90
     31  12
     32 177
     33  81
     34 107
     35 105
     36 148
     37  82
     38 124
     39  49
     40  30
     41  85
     42   8
     43 122
     44  78
     45 101
     46   9
     47  69
     48 154
     49  18
     50  56
     51 128
     52 164
     53  34
     54 131
     55 102
     56 127
     57 108
     58 160
     59 133
     60 127
     61  97
     62 119
     63 112
     64  90
     65  48
     66 166
     67 156
     68 123
     69 112
     70 150
     71  34
     72 153
     73  56
     74 164
     75  81
     76 129
     77  95
     78 122
     79 107
     80 148
     81  69
     82  69
     83  48
     84  50
     85  11
     86  67
     87  70
     88 152
     89  51
     90  77
     91 161
     92 171
     93  12
     94  96
     95 111
     96 151
     97  90
     98  96
     99 160
    100 135
    end

  • #2
    That is not what -pctile- does. -pctile- calculates the values of adheresum corresponding to the percentiles specified in the -nquantiles()- option and saves them in the first several observations of the data set. As you didn't specify anything for -nquantiles-, the default value is 2. Consequently, -pctile- just calculates the median value of adheresum and sticks it in the first observation.

    What you want is done by -xtile-:
    Code:
    xtile percentile_rank = adheresum, nq(100)
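
    To see the difference between the two commands on the example data, here is a minimal sketch. It assumes the dataex data above are in memory and uses quartiles rather than percentiles to keep the output short; the variable names q and quartile are made up for illustration.

    ```stata
    * pctile stores the cutpoints themselves in the first few observations:
    * with nq(4) there are 3 cutpoints, so q is missing from observation 4 on
    pctile q = adheresum, nq(4)
    list q in 1/4

    * xtile instead assigns every observation to a quantile category (1 to 4)
    xtile quartile = adheresum, nq(4)
    tabulate quartile
    ```

    The same contrast holds with nq(100): -pctile- would fill the first 99 observations with cutpoints, while -xtile- labels each observation with its category.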



    • #3
      Here's another take, using plotting positions on a 0 to 1 scale (the story is at https://www.stata.com/support/faqs/s...ing-positions/). Naturally, multiplication by 100 is a choice too.

      Code:
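      * rank of each value; tied values share the average of their ranks
      egen rank = rank(adheresum)
      * number of nonmissing values of adheresum
      egen count = count(adheresum)
      * plotting position, strictly between 0 and 1
      gen pp = (rank - 0.5) / count

      sort adheresum

      list adheresum pp in 1/21, sepby(adheresum)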
      
           +-----------------+
           | adhere~m     pp |
           |-----------------|
        1. |        4    .01 |
        2. |        4    .01 |
           |-----------------|
        3. |        5   .025 |
           |-----------------|
        4. |        8   .035 |
           |-----------------|
        5. |        9   .045 |
           |-----------------|
        6. |       11    .06 |
        7. |       11    .06 |
           |-----------------|
        8. |       12    .08 |
        9. |       12    .08 |
           |-----------------|
       10. |       13   .095 |
           |-----------------|
       11. |       15   .105 |
           |-----------------|
       12. |       18    .12 |
       13. |       18    .12 |
           |-----------------|
       14. |       28   .135 |
           |-----------------|
       15. |       30   .145 |
           |-----------------|
       16. |       31   .155 |
           |-----------------|
       17. |       34    .17 |
       18. |       34    .17 |
           |-----------------|
       19. |       35   .185 |
           |-----------------|
       20. |       44     .2 |
       21. |       44     .2 |
           +-----------------+
      
      quantile adheresum, scheme(s1color) rlopts(lc(none))
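
      If a 0 to 100 scale is preferred, the plotting positions can be rescaled directly. This is a small sketch that assumes pp from the code above already exists; pct_pos is just an illustrative name.

      ```stata
      * rescale plotting positions from the 0 to 1 scale to 0 to 100
      gen pct_pos = 100 * pp
      summarize pct_pos
      ```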
