
  • Is there a way to sort rows of summarize or mean output by the value of the mean?

    I would like to find a workaround so that the output of summarize or mean is sorted by the magnitude of the mean.

    E.g., suppose I have binary (0/1) indicators marking individuals as white, black, asian, Hispanic, native_american, mixed_race. These are not mutually exclusive, because someone can score "1" on two or more categories (if they were mutually exclusive, tabulate with the sort option would do the trick).

    The ordering of the groups by size differs across samples, so we can't order the varlist a priori from largest to smallest.

    Is there a standard procedure, or a simple workaround using saved results that would do the trick?




  • #2
    Here's a quick hack. It assumes that the number of variables concerned is less than or equal to the number of observations in memory. That could be relaxed.

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . sortmean price-foreign
    foreign gear_ratio headroom rep78 trunk mpg turn length displacement weight price
    
    . summarize `sorted'
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
         foreign |        74    .2972973    .4601885          0          1
      gear_ratio |        74    3.014865    .4562871       2.19       3.89
        headroom |        74    2.993243    .8459948        1.5          5
           rep78 |        69    3.405797    .9899323          1          5
           trunk |        74    13.75676    4.277404          5         23
    -------------+--------------------------------------------------------
             mpg |        74     21.2973    5.785503         12         41
            turn |        74    39.64865    4.399354         31         51
          length |        74    187.9324    22.26634        142        233
    displacement |        74    197.2973    91.83722         79        425
          weight |        74    3019.459    777.1936       1760       4840
    -------------+--------------------------------------------------------
           price |        74    6165.257    2949.496       3291      15906

    Notice that you don't need to retype the names: just use the local macro sorted, which sortmean leaves behind in the caller's namespace (via c_local in the program below).

    This is the code:

    Code:
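    program sortmean, sortpreserve
        version 8.2
        syntax varlist(numeric) [if] [in]
    
        * marksample also excludes observations with a missing value in any
        * variable of varlist, so the means below are complete-case means
        marksample touse
        qui count if `touse'
        if r(N) == 0 error 2000
    
        local J = 0
        tempvar which mean
    
        * store each variable's name and mean in observation J of two
        * temporary variables (this is why J must not exceed _N)
        quietly {
            gen `which' = ""
            gen `mean' = .
    
            foreach v of local varlist {
                local ++J
                su `v' if `touse', meanonly
                replace `which' = "`v'" in `J'
                replace `mean' = r(mean) in `J'
            }
        }
    
        * ascending sort on the means; sortpreserve restores the original
        * order of the data when the program exits
        sort `mean'
    
        * read the names back in sorted order
        forval j=1/`J' {
            local sorted `sorted' `=`which'[`j']'
        }
    
        * display the list and pass it back to the caller as local sorted
        di "`sorted'"
        c_local sorted "`sorted'"
    end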
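
    Applied to the indicators in the original question, usage might look like the sketch below. It assumes sortmean has already been defined in the session (or saved as sortmean.ado) and that the six indicators exist under the names used in the question, which are taken here as hypothetical variable names. Since sortmean sorts in ascending order, the loop reverses the list so that the largest group comes first; the macro name desc is just an illustrative choice.

    Code:
    * hypothetical variable names, copied from the question
    sortmean white black asian hispanic native_american mixed_race
    
    * sorted now holds the names in ascending order of mean;
    * build desc by prepending each name, which reverses the order
    local desc
    foreach v of local sorted {
        local desc `v' `desc'
    }
    
    summarize `desc'
    mean `desc'

    Because sorted and desc are local macros, these lines need to be run together (in one do-file run, or all interactively); a local defined in one do-file run is gone by the next.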
