  • #16
    xtile() isn't ranking; it's binning. But observations with the same value will always be assigned to the same bin.

    Put the other way round, this is a common question, even when the number of non-missing values is a multiple of the number of bins:

    I asked for quantile-based bins. Why is the number of observations different in each?

    The answer is always ties (plus the small print for non-multiples, as when 20 observations divided into 3 bins can at best give some permutation of 7 7 6).
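
    A minimal sketch of both points, using made-up data (the variable names x, y, q4 and q3 are hypothetical and chosen only for illustration). In the first case 12 of 20 observations are tied, so they must all land in one bin and the four bins cannot each hold 5. In the second case the values are all distinct, but 20 is not a multiple of 3.

    Code:
    clear
    set obs 20

    * hypothetical data: 12 observations tied at 1, then 8 distinct values
    generate x = cond(_n <= 12, 1, _n)
    xtile q4 = x, nq(4)
    * tied values share a bin, so the frequencies are unequal
    tabulate q4

    * no ties, but 20 observations cannot split evenly into 3 bins
    generate y = _n
    xtile q3 = y, nq(3)
    tabulate q3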

    More at

    https://www.stata-journal.com/articl...article=dm0095 Section 6

    https://www.stata-journal.com/articl...article=pr0054 Section 4


    • #17
      Dear all,

      I have a question following the thread.

      I would like to calculate the means of multiple variables by quintile of a particular variable. For example, I tried the following code, but it did not work.

      xtile RDS_Q5 = RDS if RDS !=0 , nq(5)
      mean EARN5 RDS ADEX if RDS != 0 by(RDS_Q5)

      The idea is to calculate the means of the variables EARN5, RDS and ADEX for observations where RDS is different from 0, with the means broken down by quintile of RDS.

      Thanks,
      Bao


      • #18
        You need a comma and the over() option.

        Code:
        . sysuse auto, clear
        (1978 automobile data)
        
        . xtile qmpg=mpg if foreign, nq(5)
        
        . mean price weight if foreign, over(qmpg)
        
        Mean estimation                              Number of obs = 22
        
        ---------------------------------------------------------------
                      |       Mean   Std. err.     [95% conf. interval]
        --------------+------------------------------------------------
         c.price@qmpg |
                   1  |     9258.6   1506.868      6124.897     12392.3
                   2  |     6417.8   632.5559      5102.328    7733.272
                   3  |       6432   903.4948       4553.08     8310.92
                   4  |   4129.667   186.9094      3740.967    4518.366
                   5  |       4383   389.0246      3573.979    5192.021
                      |
        c.weight@qmpg |
                   1  |       2900   179.0531      2527.639    3272.361
                   2  |       2296   124.2417      2037.625    2554.375
                   3  |       2218    127.648      1952.542    2483.458
                   4  |   1856.667   64.89307      1721.714    1991.619
                   5  |     2077.5   41.30678      1991.598    2163.402
        ---------------------------------------------------------------
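
        Applied to the variable names in #17, the same pattern would look something like the sketch below. It assumes that EARN5, RDS and ADEX exist as numeric variables in the dataset in memory; I have not seen the data, so this is untested.

        Code:
        * quintile bins of RDS among observations with nonzero RDS
        xtile RDS_Q5 = RDS if RDS != 0, nq(5)

        * note the comma before the options, and over() rather than by()
        mean EARN5 RDS ADEX if RDS != 0, over(RDS_Q5)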


        • #19
          Originally posted by Nick Cox
          Fernando meant

          Code:
          egen quant=xtile(x), n(4) by(year)
          egenmore is the package name only; the package includes a ragbag of more functions for egen.

          The xtile() function from egenmore cannot incorporate weights when creating quantiles by group. Is there any command that can do this (other than using a loop)?


          • #20
            You can IIRC use cumul to produce a cumulative distribution function using weights. Choose options carefully. Then for example quintile bins will be defined by ceil(5 * cdf) where cdf holds the cumulative probability. In general, watch out; quantile bins are most unlikely to hold even roughly equal frequencies if they are based on weights.
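
            A rough sketch of that idea with the auto data follows. Using rep78 as a frequency weight is purely an assumption for illustration, and the names cdf and q5 are hypothetical. It also assumes that cumul accepts the by: prefix and fweights, as I understand its help file to say, so check help cumul before relying on it. The equal option is there so that tied values get the same cumulative probability and hence the same bin.

            Code:
            sysuse auto, clear

            * weighted cumulative distribution of mpg within each group of foreign
            * (rep78 as fweight is a made-up choice, for illustration only)
            bysort foreign: cumul mpg [fweight = rep78], generate(cdf) equal

            * quintile bins from the cumulative probability
            generate q5 = ceil(5 * cdf)

            * compare unweighted and weighted bin frequencies by group
            tabulate q5 foreign
            tabulate q5 foreign [fweight = rep78]

            Observations with missing rep78 get missing cdf and so missing q5.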
