  • Percentile (decile) calculation groupwise based on a particular variable's value

    I have a variable called "group" that takes several string values. It is not unique, so many values are repeated. I want to calculate the decile of the variable "ownership" and assign it to a new column, but separately for each value of "group".
    For example, for the subsample where "group" is AB32T, I want the decile calculations done for ownership; then likewise for the next value of "group", and so on for all values of "group".
    So I wrote this code:

    sort group
    by group: pctile ownership_decile = ownership, nq(10)
    This gives me the error that pctile may not be combined with by.

    I don't know how to proceed without a long and arduous set of loops and appending. Can someone please help? Thank you.

  • #2
    The code for this is somewhat complicated, and you do not show example data to work with. So I will illustrate the approach using auto.dta. As that dataset is rather small, I show the approach for calculating quintiles instead of deciles.
    Code:
    clear*
    sysuse auto
    
    levelsof rep78, local(groups)
    
    frame create quantiles int (group quantile) float value
    foreach g of local groups {
        pctile  result = price if rep78 == `g', nq(5)
        forvalues i = 1/4 {
            frame post quantiles (`g') (`i') (result[`i'])
        }
        drop result
    }
    
    frame quantiles: list, noobs sepby(group)
    At the end of this code, frame quantiles contains a dataset consisting of the group, the quantile number (quintile here, decile in your case), and the value of that quantile.
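
    A sketch of how the same approach might be adapted to the original question, assuming group is a string variable and ownership is numeric, as described in #1. The frame name deciles and the str32 storage type are assumptions for illustration (widen the type if your group values are longer), and frames require Stata 16 or later. Because group is a string, each level is compared and posted inside compound double quotes.

    ```stata
    * Sketch only: assumes string variable group and numeric variable ownership
    levelsof group, local(groups)

    * str32 is an assumed width for the group values
    frame create deciles str32 group int decile double value
    foreach g of local groups {
        * the 9 decile cutpoints of ownership within this group
        pctile result = ownership if group == `"`g'"', nq(10)
        forvalues i = 1/9 {
            frame post deciles (`"`g'"') (`i') (result[`i'])
        }
        drop result
    }

    frame deciles: list, noobs sepby(group)
    ```

    This yields the decile values (cutpoints) per group, not the bin each observation falls into.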



    • #3
      Do you want decile values, i.e. the 9 percentiles 10(10)90 of ownership? Or decile bins, i.e. the 10 bins delimited by those deciles?

      The difference is shown as follows:

      Code:
      sysuse auto, clear
      set scheme stcolor
      
      xtile mpg_bin=mpg, nq(10)
      
      pctile pctile_mpg = mpg, nq(10)
      levelsof pctile_mpg, local(levels)
      
      quantile mpg, mla(mpg_bin) mlabpos(0) yli(`levels', lw(thin) lc(stc2) lp(solid)) ms(none) rlopts(lc(none))
      Use some different graphics scheme and line colour if you are using Stata <18.

      We need to be clear on this before we can advise on code.



      • #4
        Originally posted by Nick Cox
        Do you want decile values, i.e. 9 percentiles 10(10)90 of ownership? Or decile bins. the 10 bins delimited by those deciles?

        I need the bin value. Essentially, I want to create bins with equal numbers of data points, such that bin 1 has the lowest values of ownership and bin 10 has the highest, and all this separately for each group. It is similar to portfolio analysis.
        Last edited by Dev Irani; 16 Aug 2023, 14:51.



        • #5
          There are various extensions of xtile to groupwise calculations. That most familiar to me is the xtile() function for egen in egenmore from SSC.



          • #6
            Code:
            gen bin = .
            levelsof group, local(groups)
            * group is a string variable, so compare each level in compound double quotes
            foreach g of local groups {
                xtile temp = ownership if group == `"`g'"', nq(10)
                replace bin = temp if group == `"`g'"'
                drop temp
            }
            will do that. If this is a calculation you will do frequently in your work, then I recommend you install Nick Cox's -egenmore- package, from SSC. Then you can do it as a one-liner:
            Code:
            bysort ownership: egen bin = xtile(ownership), nq(10)



            • #7
              Quick question on the one-liner, Clyde Schechter.
              Code:
               
               bysort ownership: egen bin = xtile(ownership), nq(10)
              It doesn't have "group" anywhere in the code, so how does it work groupwise? By any chance, did you mean:

              Code:
               
               bysort group: egen bin = xtile(ownership), nq(10)



              • #8
                Yes, sorry, the second code is exactly what I meant.
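
                For anyone wanting to try the groupwise one-liner on a shipped dataset first, here is a minimal sketch using auto.dta. It assumes egenmore can be installed from SSC; the choice of foreign as the grouping variable, price as the binned variable, quintiles rather than deciles, and the name price_bin are illustrative assumptions, not part of the original question.

                ```stata
                * One-time install of the community-contributed egenmore package
                ssc install egenmore

                sysuse auto, clear

                * Quintile bins of price, computed separately within each level of foreign
                bysort foreign: egen price_bin = xtile(price), nq(5)

                * Check: bins within each group should hold roughly equal counts
                tabulate foreign price_bin
                ```

                With the variables from #1, the same pattern is the second line of code in #7.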



                • #9
                  Thank you so much, Clyde Schechter and Nick Cox. This command and package resolve my queries.
