  • Quantile regression with Ordinal dependent variables

    Hi, I was trying quantile regression in Stata 12 with the qreg2 (or xi: qreg2) command. My dependent variable is ordinal, with values ranging from 1 to 7; some independent variables are continuous (say income, age) and others are categorical (say sex, marital status and job status). When I split the data into quantiles (say n=4), Stata returned only three quantiles (n=1, 3 and 4). The 1st quantile took the observations with values 1-5, the 3rd quantile took the observations with value 6, and the 4th took the observations with value 7. For some unknown reason, there was no 2nd quantile (n=2). Moreover, the estimation was faulty, as the 3rd and 4th quantiles had no variation in the value of the dependent variable.

    Somebody advised me to use R, but I am not conversant with R yet. Could anybody advise whether I can do quantile regression with an ordinal dependent variable in Stata 12? If yes, how? Thanks in advance for any prompt advice.

  • #2
    Hi there,

    First of all, thank you for using -qreg2- :-)

    Standard quantile regression does not work when the dependent variable is discrete. So, I am not surprised that you get strange results; obviously just doing quantile regression with R will not help. There are methods to compute quantile regression with ordered data (M.J. Lee has a seminal paper on that), but these are unlikely to work well except in very large samples. You may try to use -qcount- (available from SSC), but that was designed to handle count data, not ordered data. For more information, see here http://www.tandfonline.com/doi/abs/1...0#.VSOOGvxXTCs

    All the best,

    Joao


    • #3
      Joao gives good advice. However, it seems possible also that quantile regression is being used here when something more like ordinal logit fits the need.

      The assertion that the estimation is "faulty" almost certainly stems from a misunderstanding of quantile regression. It's entirely possible that the same level could be returned as different quantiles whenever the response is ordinal.

      Ujjwal: Please study the FAQ Advice, especially Section 6 (using full real names) and Section 12 (giving full details of what you did).


      • #4
        Hi.
        "Standard quantile regression does not work when the dependent variable is discrete." Is this impossible in all cases? Say, for instance, for a logit: I am using a discrete choice model. Has any work been done using quantile regression for DCE?
        Thanks


        • #5
          Pedro: It is indeed possible to do quantile regression with binary data; actually, that was the first form of quantile regression that was described in the literature. However, quantile regression in that case cannot be performed using standard methods and the notorious maximum score estimator has to be used. Chuck Manski and Joel Horowitz are the main names in the field and you should be able to easily find their papers if you are interested.

          Ujjwal: Following up on Nick's comments, I also agree that you may be doing something wrong or misinterpreting something (or at least I was not able to understand what you are doing). However, if you try to use standard quantile regression with ordered data you are likely to find some strange (and meaningless) results, at least that is my experience.


          • #6
            Thanks for all the comments. Joao and Nick: here are some snapshots of the y variable:

            tab y

                      y |      Freq.     Percent        Cum.
            ------------+-----------------------------------
                      1 |        242        0.89        0.89
                      2 |        509        1.88        2.77
                      3 |      1,622        5.99        8.76
                      4 |      3,896       14.38       23.14
                      5 |      9,285       34.27       57.40
                      6 |      9,564       35.30       92.70
                      7 |      1,978        7.30      100.00
            ------------+-----------------------------------
                  Total |     27,096      100.00

            sort y

            xtile quantile = y, nquantiles(4)

            tab y quantile

                       |        4 quantiles of y
                     y |         1          3          4 |     Total
            -----------+---------------------------------+----------
                     1 |       242          0          0 |       242
                     2 |       509          0          0 |       509
                     3 |     1,622          0          0 |     1,622
                     4 |     3,896          0          0 |     3,896
                     5 |     9,285          0          0 |     9,285
                     6 |         0      9,564          0 |     9,564
                     7 |         0          0      1,978 |     1,978
            -----------+---------------------------------+----------
                 Total |    15,554      9,564      1,978 |    27,096


            I do not understand where my second quantile (n=2) has gone. Why has the first quantile (n=1) taken all 15,554 observations with values of the dependent variable from 1 to 5? I was wondering whether I can treat a discrete variable as continuous, as we usually do in simple OLS or fixed-effects models.


            • #7
              Your reply has nothing to do with any flavour of quantile regression, but here we go.

              The answer can be expressed in one very short word. It is ties.

              You asked xtile for 4 bins. With 27,096 observations, that implies approximately 6,774 in each. I say "approximately" even though the desideratum of equal frequencies would be satisfied exactly by 6,774 in each of 4 bins. However, that is only one detail.

              The other detail bites very hard here: observations with the same value of the original variable must be assigned to the same bin.

              There is no 4 bin solution that matches equal frequencies at all closely.

              What xtile gave you back was 1 to 5 mapped to 1, which leaves 6 mapped to 3 and 7 mapped to 4. Basically xtile starts at the lowest value and aggregates until it has passed total frequency/number of bins.
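The mapping just described can be reproduced from the frequency table alone. What follows is a minimal Python sketch (not Stata code), assuming that xtile takes the 25th, 50th and 75th percentiles as cutpoints and puts each value in the first bin whose upper cutpoint is at least that value. The percentile rule used here (average the two neighbouring order statistics when N*p/100 is an integer, otherwise take the next order statistic up) is also an assumption about Stata's default, and the function names are made up for illustration.

```python
from bisect import bisect_left
from itertools import accumulate

# Frequencies of y, copied from the tabulation in #6.
freq = {1: 242, 2: 509, 3: 1622, 4: 3896, 5: 9285, 6: 9564, 7: 1978}
values = sorted(freq)
cum = list(accumulate(freq[v] for v in values))  # cumulative counts
n_obs = cum[-1]


def order_stat(k):
    """Value of the k-th smallest observation (k counted from 1)."""
    return values[bisect_left(cum, k)]


def percentile(p):
    """p-th percentile under the assumed default rule (see lead-in)."""
    pos = n_obs * p / 100
    if pos == int(pos):
        k = int(pos)
        return (order_stat(k) + order_stat(k + 1)) / 2
    return order_stat(int(pos) + 1)


def assumed_xtile(nq):
    """Cutpoints and value-to-bin mapping under the assumed xtile rule."""
    cuts = [percentile(100 * i / nq) for i in range(1, nq)]
    # Bin number = 1 + number of cutpoints strictly below the value.
    return cuts, {v: 1 + sum(c < v for c in cuts) for v in values}


cuts, bins = assumed_xtile(4)
print(cuts)  # the three quartile cutpoints
print(bins)  # which bin each level of y lands in
```

With these frequencies the 25th and 50th percentiles both equal 5, so no value can fall strictly between the first and second cutpoints, and bin 2 stays empty.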

              There are 4-bin solutions you might like better, notably (1,2,3,4), 5, 6, 7.
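How far any 4-bin grouping is from equal frequencies can be checked by brute force. The sketch below, in Python rather than Stata, lists every way of cutting the 7 ordered levels into 4 contiguous bins and compares the bin sizes with the target of 6,774. The variable names, and the choice of the largest absolute deviation as the yardstick, are assumptions made for illustration only.

```python
from itertools import combinations

counts = [242, 509, 1622, 3896, 9285, 9564, 1978]  # levels 1..7, from #6
n_bins = 4
target = sum(counts) / n_bins  # equal-frequency target per bin

results = []
# Choose 3 cut positions among the 6 gaps between adjacent levels.
for cut_points in combinations(range(1, len(counts)), n_bins - 1):
    edges = (0, *cut_points, len(counts))
    sizes = [sum(counts[a:b]) for a, b in zip(edges, edges[1:])]
    worst = max(abs(s - target) for s in sizes)  # largest miss from target
    results.append((worst, sizes))

results.sort()
for worst, sizes in results[:3]:
    print(sizes, round(worst))

# Bin sizes for the grouping (1,2,3,4), 5, 6, 7 mentioned above.
suggested = [sum(counts[:4]), counts[4], counts[5], counts[6]]
print(suggested)
```

Levels 5 and 6 each hold more than 9,000 observations on their own, so any bin containing one of them overshoots the target by at least 2,500, whatever the grouping.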

              Regardless of that, why bin here at all? You don't have much detail in a 7-level response. Throwing much of it away looks perverse. Binning more coarsely takes you even further away from a continuous response, so it does seem very puzzling.

              Incidentally (trivially, if you like), n is distracting notation for labelling distinct quantiles.
