  • problem with result variable generated using xtile

    Hello,

    I am trying to generate an income quantile variable using xtile. For some reason, the new variable has fewer levels with actual values than I specify, and one level is all missing.

    For example, if I generate the new variable specifying n(5), I get 4 levels with values and one with all missing values. The original variable is a continuous variable with lots of zeros.

    . sum Y_total

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
         Y_total |     61665    4868.114    17333.08          0     199998


    Here is an example of what is happening with my new variables:

    . xtile Y_quart = Y_total, nq(4)
    . tab Y_quart

    4 quantiles |
     of Y_total |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |     33,457       54.26       54.26
              3 |     13,489       21.87       76.13
              4 |     14,719       23.87      100.00
    ------------+-----------------------------------
          Total |     61,665      100.00


    The only setting that seems to avoid this is n(3), which generates:

    . xtile Y_quart2 = Y_total, nq(3)
    . tab Y_quart2

    3 quantiles |
     of Y_total |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |     33,457       54.26       54.26
              2 |      9,253       15.01       69.26
              3 |     18,955       30.74      100.00
    ------------+-----------------------------------
          Total |     61,665      100.00

    Do you have any idea what is causing this? I am using Stata SE 11.2.

    THANKS!!!

  • #2
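    This is most likely correct behaviour, caused by all of the zeroes. xtile has to put tied values into the same category, so when more than half of the observations share one value (presumably the zeroes here), the lower cutpoints coincide and the category between them is empty. See the technical note in the PDF documentation for xtile, e.g. page 8 of http://www.stata.com/manuals13/dpctile.pdf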

    Simple example:
    Code:
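    clear
    set obs 100
    * half of the observations are tied at zero
    gen x=0
    replace x=_n in 51/100
    * ask for quartiles and tabulate the resulting categories
    xtile q=x, n(4)
    tab q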
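
To see where the empty category comes from, it may help to look at the cutpoints themselves. The following is only a sketch on the same toy data, assuming the default percentile definition; `_pctile` and its stored results `r(r1)` to `r(r3)` are official Stata, while `x` and `q` are just the made-up names from the example above.

```stata
clear
set obs 100
gen x = 0
replace x = _n in 51/100

* the three cutpoints that correspond to nq(4)
_pctile x, nq(4)
display "cutpoints: " r(r1) "  " r(r2) "  " r(r3)

* xtile assigns category i when cutpoint(i-1) < x <= cutpoint(i)
xtile q = x, nq(4)
tab q

* nothing lies strictly above the first cutpoint and at or below the second
count if q == 2
```

Here the first cutpoint is 0 and the second lies strictly between 0 and 51, the smallest non-zero value, so no observation can land in category 2. On the original data the analogous check would be `_pctile Y_total, nq(4)` followed by `return list` and `count if Y_total == 0`.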

    • #3
      This is a common complaint with xtile: the constraint that identical values must be put into the same quantile band can bite very hard.

      In general, using a different convention for the inequalities may give a better answer, but that sounds unlikely in this case. There is some detailed discussion within http://www.stata-journal.com/article...article=pr0054
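
As a rough illustration of why switching the inequalities is unlikely to help when ties dominate, here is a sketch on made-up data with half the values at zero. The variable `q_alt` and the rule cutpoint(i-1) <= x < cutpoint(i) are assumptions chosen for illustration, not necessarily the convention discussed in the article.

```stata
clear
set obs 100
gen x = 0
replace x = _n in 51/100

* save the quartile cutpoints before another command overwrites r()
_pctile x, nq(4)
local c1 = r(r1)
local c2 = r(r2)
local c3 = r(r3)

* default xtile rule: cutpoint(i-1) < x <= cutpoint(i)
xtile q = x, nq(4)

* alternative rule (assumption): cutpoint(i-1) <= x < cutpoint(i)
gen byte q_alt = 1 + (x >= `c1') + (x >= `c2') + (x >= `c3') if !missing(x)

* cross-tabulate the two classifications
tab q q_alt
```

Under the alternative rule the zeros move from category 1 to category 2, but they still all sit in one category, so one of the four categories remains empty.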

      • #4
        Thanks for the replies.
