
  • Deciles after Carhart

    Hello!

    I am trying to form 10 deciles as Carhart does in his study 'On Persistence in Mutual Fund Performance', but as a beginner I have no idea where to start.

    fundnr  quarter  year  mktrf    smb    hml   rf  mf_u_mnret
         1      210  2012   2.55    .41   1.28  .01       .7284
         1      206  2011  -7.59  -3.48  -1.46    0   -11.31222
         1      214  2013   3.77   2.94  -1.18    0    4.459965
         1      215  2013   3.12   1.24    .24    0    1.318621
         1      206  2011  -2.36  -1.31  -1.23    0          -6

    The database contains more than 3000 funds, and one fund can occur multiple times (for different months, years and so on).
    What would be the best approach to form 10 deciles, where decile 1 contains the top 30% of funds and decile 10 the worst 30%? He then divides the top decile 1 and the bottom decile 10 into 3 subgroups each: 1A, 1B, 1C and 10A, 10B, 10C.
    Thank you in advance!

  • #2
    Decile bins are 10 in number and usually are intended to contain 10% of the observations each. If your top bin 1 contains 30% and your bottom bin 10 also contains 30%, what are your rules for the other 8 bins?

    Disclaimer: Almost all of the most active people answering questions are not economists, so assuming that people know what "Carhart" means is optimistic.
    Last edited by Nick Cox; 25 Mar 2022, 12:06.


    • #3
      I'm sorry, I misread. The top decile 1A contains the top thirtieth of funds and the bottom decile 10C the worst thirtieth. Furthermore, Carhart describes the deciles as: "Funds with the highest past one-year return comprise decile 1 and funds with the lowest comprise decile 10. Deciles 1 and 10 are further subdivided into thirds on the same measure."


      • #4
        Consider this. groups is from the Stata Journal.


        Code:
        .  webuse nlswork, clear
        (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
        
        . xtile decile = ln_wage, nq(10)
        
        . tab decile
        
                 10 |
          quantiles |
         of ln_wage |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  1 |      2,866       10.04       10.04
                  2 |      2,854       10.00       20.05
                  3 |      2,841        9.96       30.00
                  4 |      2,872       10.07       40.07
                  5 |      2,834        9.93       50.00
                  6 |      2,863       10.03       60.03
                  7 |      2,844        9.97       70.00
                  8 |      2,861       10.03       80.03
                  9 |      2,847        9.98       90.00
                 10 |      2,852       10.00      100.00
        ------------+-----------------------------------
              Total |     28,534      100.00
        
        . xtile decile1 = ln_wage if decile == 1, nq(3)
        
        . xtile decile10 = ln_wage if decile == 10, nq(3)
        
        . groups decile decile1 decile10, missing
        
          +-----------------------------------------------+
          | decile   decile1   decile10   Freq.   Percent |
          |-----------------------------------------------|
          |      1         1          .     958      3.36 |
          |      1         2          .     956      3.35 |
          |      1         3          .     952      3.34 |
          |      2         .          .    2854     10.00 |
          |      3         .          .    2841      9.96 |
          |-----------------------------------------------|
          |      4         .          .    2872     10.07 |
          |      5         .          .    2834      9.93 |
          |      6         .          .    2863     10.03 |
          |      7         .          .    2844      9.97 |
          |      8         .          .    2861     10.03 |
          |-----------------------------------------------|
          |      9         .          .    2847      9.98 |
          |     10         .          1     957      3.35 |
          |     10         .          2     945      3.31 |
          |     10         .          3     950      3.33 |
          +-----------------------------------------------+
        
        .  egen min = min(ln_wage), by(decile decile1 decile10)
        
        .  egen max = min(ln_wage), by(decile decile1 decile10)
        
        . groups decile decile1 decile10 min max, missing
        
          +---------------------------------------------------------------------+
          | decile   decile1   decile10        min        max   Freq.   Percent |
          |---------------------------------------------------------------------|
          |      1         1          .          0          0     958      3.36 |
          |      1         2          .   .8749049   .8749049     956      3.35 |
          |      1         3          .   1.072795   1.072795     952      3.34 |
          |      2         .          .   1.166271   1.166271    2854     10.00 |
          |      3         .          .   1.302621   1.302621    2841      9.96 |
          |---------------------------------------------------------------------|
          |      4         .          .   1.420644   1.420644    2872     10.07 |
          |      5         .          .   1.530275   1.530275    2834      9.93 |
          |      6         .          .   1.640646   1.640646    2863     10.03 |
          |      7         .          .   1.758959   1.758959    2844      9.97 |
          |      8         .          .   1.889294   1.889294    2861     10.03 |
          |---------------------------------------------------------------------|
          |      9         .          .   2.049198   2.049198    2847      9.98 |
          |     10         .          1   2.276241   2.276241     957      3.35 |
          |     10         .          2   2.384579   2.384579     945      3.31 |
          |     10         .          3   2.563135   2.563135     950      3.33 |
          +---------------------------------------------------------------------+
        
        .


        • #5
          Code:
           
           egen max = min(ln_wage), by(decile decile1 decile10)

          should have been

          Code:
           
           egen max = max(ln_wage), by(decile decile1 decile10)
          but I think the principles all stand otherwise.


          • #6
            Thank you so much. What would be the best approach to get the monthly excess return of each decile? For example: the monthly excess return for decile 7 is 0.64%, for decile 8 it is 0.48%, and so on.
            Is there a more compact way, or do I really have to generate a variable for each decile with 'gen decile1 = mf_u_mnret if decile == 1' and then run 'mean decile1'?


            • #7
              Since the deciles are based on the past one-year return, I tried this:

              rangestat (sum) cum_excess_return_prior_year = mf_u_mnret , by(fundnr) interval(year -1 -1)
              by year, sort: egen decile = xtile( cum_excess_return_prior_year), nq(10)
              gen decile1= mf_u_mnret if decile==1
              gen decile2= mf_u_mnret if decile==2
              (and so on up to decile10)
              and then: mean for each of decile1 through decile10

              The problem: the means of the deciles differ only slightly and are always positive, which seems wrong.

              Also, a fund with the same fundnr has observations in different years, and several months within the same year, so the prior-year excess return is repeated across the months of a year for that fund. Is there a function that deletes those repeats?
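
              One possible way to handle both points, as a rough sketch only: tag one observation per fund and year so that each fund counts once in the ranking, bin within each year, and then copy the bin back to all months of that fund-year. Nothing is deleted. The sketch assumes that rangestat (from SSC) is installed, that no variable named decile exists yet, and that the sum of monthly excess returns over the previous calendar year is an acceptable stand-in for the past one-year return. The names past_ret, tag and tmp are made up for illustration. Note that xtile puts the lowest values in bin 1, so the minus sign is what places the highest past return in decile 1. The xtile() function for egen used above is, as far as I know, community-contributed (egenmore on SSC); the loop below uses only the official xtile command.

              Code:
              * past one-year return: sum over the same fund's previous calendar year
              rangestat (sum) past_ret = mf_u_mnret, by(fundnr) interval(year -1 -1)

              * tag one observation per fund and year, so each fund counts once
              egen byte tag = tag(fundnr year)

              * deciles within each year; -past_ret puts the highest return in decile 1
              generate byte decile = .
              levelsof year if tag & !missing(past_ret), local(years)
              foreach y of local years {
                  xtile tmp = -past_ret if tag & year == `y', nq(10)
                  replace decile = tmp if tag & year == `y'
                  drop tmp
              }

              * copy the decile to every month of the same fund and year
              bysort fundnr year (decile): replace decile = decile[1]

              * mean monthly excess return by decile, without one variable per decile
              tabstat mf_u_mnret, by(decile) statistics(mean n)

              Fund-years with no observations in the previous year get a missing past_ret and so stay outside the deciles.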


              • #8
                Sorry, but I don't follow what new questions you're asking in #6 and #7. As already hinted I am not an economist and I don't work with returns data. I have no idea what "monthly excess return" means.

                To back up, I understand that you want a division into decile bins, except that the top and bottom bin are each subdivided into three, so 3 + 8 + 3 = 14 bins.

                There are no doubt other ways to do it, but this is as easy as I can make it -- with a different dataset.


                Code:
                webuse nlswork, clear
                xtile decile = ln_wage, nq(10)
                xtile decile1 = ln_wage if decile == 1, nq(3)
                xtile decile10 = ln_wage if decile == 10, nq(3)
                egen wanted = group(decile decile1 decile10), label missing 
                
                egen mean = mean(ln_wage), by(wanted)
                
                tabdisp wanted, c(mean)
                
                ----------------------
                group(dec |
                ile       |
                decile1   |
                decile10) |       mean
                ----------+-----------
                    1 1 . |   .5510121
                    1 2 . |    .987896
                    1 3 . |   1.120902
                    2 . . |   1.239727
                    3 . . |   1.361019
                    4 . . |   1.473691
                    5 . . |   1.586759
                    6 . . |   1.700096
                    7 . . |   1.820969
                    8 . . |   1.965425
                    9 . . |    2.15501
                   10 . 1 |   2.329255
                   10 . 2 |   2.466163
                   10 . 3 |   2.898973
                ----------------------
                But

                1. You must decide in advance of the binning what data is to be included in the calculation.

                2. I can't follow why you think a different variable is needed for each bin.
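
                As a sketch of point 2, continuing from the code above: once wanted identifies the 14 bins, a single command can report a summary for each bin, so no separate variable per bin is needed. The names mean_wage and n are only illustrative; with the data in #1, mf_u_mnret would presumably take the place of ln_wage.

                Code:
                * mean and count of ln_wage in each of the 14 bins
                tabstat ln_wage, by(wanted) statistics(mean n)

                * or reduce the data to one row per bin, then restore the full data
                preserve
                collapse (mean) mean_wage = ln_wage (count) n = ln_wage, by(wanted)
                list, noobs
                restore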


                • #9
                  Thank you! I am actually trying to get 16 bins: 1A, 1B, 1C, then 1 through 10, and then 10A, 10B, 10C. In my dataset the variable mf_u_mnret is the monthly excess return, so I then want the monthly excess return for each decile.


                  • #10
                    That is not what #3 said. But if your rules are different, then they imply the code you need.
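
                    For example, if the rule is 10 whole deciles plus the thirds of deciles 1 and 10, the 16 bins overlap, so one grouping variable cannot hold them all. A sketch under that assumption, on the same dataset as before, keeps the decile and the third in separate variables; the names third, t1 and t10 are only illustrative.

                    Code:
                    webuse nlswork, clear
                    xtile decile = ln_wage, nq(10)

                    * thirds within deciles 1 and 10 only; missing elsewhere
                    xtile t1 = ln_wage if decile == 1, nq(3)
                    xtile t10 = ln_wage if decile == 10, nq(3)
                    generate byte third = cond(decile == 1, t1, cond(decile == 10, t10, .))
                    drop t1 t10

                    * 10 rows: the whole deciles
                    tabstat ln_wage, by(decile) statistics(mean n)

                    * 3 + 3 rows: the thirds of decile 1 and of decile 10
                    tabstat ln_wage if decile == 1, by(third) statistics(mean n)
                    tabstat ln_wage if decile == 10, by(third) statistics(mean n)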
