  • Calculating the Palma Ratio for Thesis

    Hey, I am writing my thesis on income inequality, and I want to calculate the Palma ratio for my dataset. I have seen some code here, but I could not do much with it. I have data on the net income of households and the regions they live in, so my variables are "income" and "region". How can I construct the Palma ratio in Stata?

  • #2
    See if this works.

    Code:
    * Decile group within each region (assumes region is a string variable)
    g income10 = .
    levelsof region, local(levels)
    foreach reg of local levels {
        capture drop incomex
        xtile incomex = income if region=="`reg'", n(10)
        replace income10 = incomex if region=="`reg'"
    }
    * Regional income totals of the top 10% and the bottom 40%
    egen top10 = total(cond(income10==10,income,.)) , by(region)
    egen bottom40 = total(cond(income10<=4,income,.)) , by(region)
    g palma = top10/bottom40
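    The comparison region=="`reg'" only works when region is a string variable. Below is a minimal sketch of the same idea for a numeric grouping variable. It uses the auto data so that it runs as is: price stands in for income and foreign for region (both are stand-ins, not the original poster's variables).

    ```stata
    sysuse auto, clear
    generate income10 = .
    levelsof foreign, local(levels)
    foreach reg of local levels {
        capture drop incomex
        * numeric comparison: no quotes around the level
        xtile incomex = price if foreign == `reg', nquantiles(10)
        replace income10 = incomex if foreign == `reg'
    }
    drop incomex
    * group totals of the top decile group and the bottom four decile groups
    egen top10 = total(cond(income10 == 10, price, .)), by(foreign)
    egen bottom40 = total(cond(income10 <= 4, price, .)), by(foreign)
    generate palma = top10 / bottom40
    tabulate foreign, summarize(palma)
    ```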


    • #3
      Thank you for your help, but it did not work for me. The error is "unexpected end of file".


      • #4
        Are you sure all the variable names (region, income) have been replaced by your own variable names?

        Do

        Code:
        set trace on
        Then run it and see where it fails.
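        A slightly fuller sketch of that workflow, run on the auto data so it is self-contained (foreign is only a stand-in for region). The -set tracedepth 1- line is an addition not mentioned above; it keeps the trace to the top level so the output stays readable. "Unexpected end of file" often means Stata never reached the closing brace of a loop, for example when only part of the foreach block is selected and run, so it is worth running the whole block in one go.

        ```stata
        set trace on
        set tracedepth 1
        sysuse auto, clear
        levelsof foreign, local(levels)
        foreach reg of local levels {
            display "level: `reg'"
        }
        * switch tracing off again once the failing line has been found
        set trace off
        ```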


        • #5
          Based on George Ford's code, ignoring the region, and if there is a weight (here "factor"):
          Code:
          * weighted decile groups, so they are consistent with the weighted sums below
          xtile income10 = income [aw=factor], n(10)
          summ income [w=factor] if income10 ==10
          local NUM = r(sum)
          summ income [w=factor] if inrange(income10,1,4)
          local DEN = r(sum)
          display as text "Palma=" as result `NUM'/`DEN'
          cf. https://www.researchgate.net/post/Ho...ma-using-STATA (see Gonzalo Durán S.)


          • #6
            To calculate the Palma ratio, use -sumdist- (an old program of mine on SSC). Something like:

            Code:
            sumdist income [aw = wgtvar], ngp(10)  // assuming "wgtvar" is the weight variable
            local palma = r(sh10)/ ( r(sh1) + r(sh2) + r(sh3) + r(sh4) )
            di "`palma'"
            Alternatively, a more modern and direct approach is to use -dstat- (Ben Jann, SSC), which calculates the Palma ratio directly as an option. It also gives you standard errors.

            Adapt either approach to get estimates by region
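            One way the by-region adaptation might look, as a sketch only. It runs on the auto data, with price standing in for income and foreign for region. It assumes -sumdist- accepts an if qualifier and that your version of -dstat- supports the over() option; check the help files of both before relying on it. The scalar names palma_0 and palma_1 are just illustrative.

            ```stata
            * ssc install sumdist   // uncomment if not yet installed
            * ssc install dstat     // uncomment if not yet installed (needs moremata)
            sysuse auto, clear
            levelsof foreign, local(levels)
            foreach g of local levels {
                * shares by decile group within this subgroup only
                quietly sumdist price if foreign == `g', ngp(10)
                scalar palma_`g' = r(sh10) / (r(sh1) + r(sh2) + r(sh3) + r(sh4))
                display "group `g': Palma = " palma_`g'
            }
            * dstat alternative: one estimate (with standard error) per subgroup
            dstat (palma) price, over(foreign)
            ```

            As the later posts in this thread show, the two commands need not agree exactly in small samples.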


            • #7
              Dear Stephen Jenkins, thank you for your code. I have a question about the Palma ratio. I used the code in #6 and -dstat- to calculate it, and they give me different results. My code and results are below: the code in #6 gives a Palma ratio of 0.761, but dstat estimates it at 0.810. What is the difference between the two methods?
              Code:
              sysuse auto
              sumdist price, ngp(10)
              local palma = r(sh10)/ ( r(sh1) + r(sh2) + r(sh3) + r(sh4) )
              di "`palma'"
              dstat (palma) price
              Code:
              . di "`palma'"
              .7608973839799137
              
              . dstat (palma) price
              
              Summary statistics                Number of obs   =         74
              
              --------------------------------------------------------------
                     price |      Coef.   Std. Err.     [95% Conf. Interval]
              -------------+------------------------------------------------
                     palma |   .8099747   .0541021      .7021493    .9178002
              --------------------------------------------------------------


              • #8
                I don't know what the issue is. If you show all the output, you get:

                Code:
                . sysuse auto
                (1978 automobile data)
                
                . sumdist price, ngp(10)
                 
                Distributional summary statistics, 10 quantile groups
                
                ---------------------------------------------------------------------------
                Quantile  |
                group     |    Quantile  % of median     Share, %      L(p), %        GL(p)
                ----------+----------------------------------------------------------------
                        1 |    3895.000       77.799        6.428        6.428      396.297
                        2 |    4099.000       81.874        6.178       12.606      777.176
                        3 |    4425.000       88.385        7.511       20.117     1240.270
                        4 |    4647.000       92.819        6.946       27.063     1668.514
                        5 |    4934.000       98.552        7.352       34.415     2121.784
                        6 |    5705.000      113.952        9.260       43.675     2692.689
                        7 |    6165.000      123.140        8.999       52.674     3247.473
                        8 |    7827.000      156.337       11.720       64.394     3970.068
                        9 |   11385.000      227.404       15.014       79.408     4895.689
                       10 |                                20.592      100.000     6165.257
                ---------------------------------------------------------------------------
                Share = quantile group share of total price; 
                L(p)=cumulative group share; GL(p)=L(p)*mean(price)
                
                . local palma = r(sh10)/ ( r(sh1) + r(sh2) + r(sh3) + r(sh4) )
                
                . di "`palma'"
                .7608973839799137
                
                . dstat (palma) price
                
                Summary statistics                Number of obs   =         74
                
                --------------------------------------------------------------
                       price | Coefficient  Std. err.     [95% conf. interval]
                -------------+------------------------------------------------
                       palma |   .8099747   .0541021      .7021493    .9178002
                --------------------------------------------------------------
                
                . 
                end of do-file
                
                . di 20.592/ (6.428 + 6.178 + 7.511 + 6.946)
                .76089125
                The final lines suggest that my code is OK. However, I doubt that there is an error in Ben Jann's code! If I do -viewsource dstat.ado- and search for "palma", I see:

                Code:
                void ds_sum_palma(`Data' D, `Grp' G, `Int' i)
                {
                    `RS' b1, b2
                    `RC' IF1
                    
                    _ds_sum_share(D, G, i, 0\.4) // bottom 40%
                    b1 = D.b[i]
                    if (D.noIF==0) IF1 = D.IF[,i]
                    _ds_sum_share(D, G, i, .9\1) // top 10%
                    b2 = D.b[i]
                    D.b[i] = b2 / b1
                    if (_ds_sum_omit(D, i)) return
                    if (D.noIF) return
                    D.IF[,i] = D.IF[,i] / b1 - b2/b1^2 * IF1
                }
                I don't know enough Mata to follow all the precise details ... though it's clear Ben and I are both aiming to calculate the Palma ratio as ratio of income share of top 10% divided by share of poorest 40%


                • #9
                  Thank you, Professor Stephen Jenkins. The contents of the first code block are exactly what I get, and Professor Ben Jann's command is too complex for me to check. Perhaps he could take notice of this thread.


                  • #10
                    Hi Chen and Stephen

                    it's clear Ben and I are both aiming to calculate the Palma ratio as ratio of income share of top 10% divided by share of poorest 40%
                    Absolutely. However, sumdist and dstat differ in how they compute these shares. dstat uses the same approach as pshare, which is described in Jann (2016). In particular, the difference is that sumdist does not break ties and does not interpolate flat regions in the distribution function (see footnote 2 on page 265 in Jann 2016).

                    BTW: Using pshare would be yet another way to compute the Palma ratio; see page 295 in Jann (2016).

                    Code:
                    . sysuse auto, clear
                    (1978 automobile data)
                    
                    . pshare estimate price [pw=1], percentiles(40 90)
                    
                    Percentile shares (proportion)              Number of obs = 74
                    
                    --------------------------------------------------------------
                           price | Coefficient  Std. err.     [95% conf. interval]
                    -------------+------------------------------------------------
                            0-40 |   .2665574   .0130113      .2406259    .2924889
                           40-90 |   .5175379   .0152959      .4870531    .5480227
                          90-100 |   .2159047   .0102057      .1955647    .2362447
                    --------------------------------------------------------------
                    
                    . nlcom (Palma: _b[90-100] / _b[0-40])
                    
                           Palma: _b[90-100] / _b[0-40]
                    
                    ------------------------------------------------------------------------------
                           price | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                           Palma |   .8099747   .0589846    13.73   0.000      .694367    .9255825
                    ------------------------------------------------------------------------------
                    The result is the same as from dstat (apart from a somewhat different standard error; in larger samples, the difference should be negligible).

                    ben


                    Jann, Ben. 2016. Assessing inequality using percentile shares. The Stata Journal 16(2): 264-300


                    • #11
                      Thank you Ben, much appreciated.


                      • #12
                        Thanks, Ben Jann. You wrote:
                        "Absolutely. However, sumdist and dstat differ in how they compute these shares. dstat uses the same approach as pshare, which is described in Jann (2016). In particular, the difference is that sumdist does not break ties and does not interpolate flat regions in the distribution function (see footnote 2 on page 265 in Jann 2016)."
                        Ben is of course absolutely right. (My claims about his coding were correct!) Put differently, the differences between -sumdist- (and -svylorenz-) and -dstat- become negligible when the number of ties becomes negligible and there are few flat regions. Using the auto dataset is a great illustration of how Ben's more sophisticated (and 10 years younger) code does a better job.

                        Note that there are only 74 observations, and look at how -xtile- (the workhorse underneath -sumdist- and -svylorenz- that defines the quantile groups) splits them:

                        Code:
                        . sumdist price, ngp(10) qgp(decgp)
                         
                        Distributional summary statistics, 10 quantile groups
                        
                        ---------------------------------------------------------------------------
                        Quantile  |
                        group     |    Quantile  % of median     Share, %      L(p), %        GL(p)
                        ----------+----------------------------------------------------------------
                                1 |    3895.000       77.799        6.428        6.428      396.297
                                2 |    4099.000       81.874        6.178       12.606      777.176
                                3 |    4425.000       88.385        7.511       20.117     1240.270
                                4 |    4647.000       92.819        6.946       27.063     1668.514
                                5 |    4934.000       98.552        7.352       34.415     2121.784
                                6 |    5705.000      113.952        9.260       43.675     2692.689
                                7 |    6165.000      123.140        8.999       52.674     3247.473
                                8 |    7827.000      156.337       11.720       64.394     3970.068
                                9 |   11385.000      227.404       15.014       79.408     4895.689
                               10 |                                20.592      100.000     6165.257
                        ---------------------------------------------------------------------------
                        Share = quantile group share of total price; 
                        L(p)=cumulative group share; GL(p)=L(p)*mean(price)
                        
                        . ta decgp
                        
                           Quantile |
                              group |      Freq.     Percent        Cum.
                        ------------+-----------------------------------
                                  1 |          8       10.81       10.81
                                  2 |          7        9.46       20.27
                                  3 |          8       10.81       31.08
                                  4 |          7        9.46       40.54
                                  5 |          7        9.46       50.00
                                  6 |          8       10.81       60.81
                                  7 |          7        9.46       70.27
                                  8 |          8       10.81       81.08
                                  9 |          7        9.46       90.54
                                 10 |          7        9.46      100.00
                        ------------+-----------------------------------
                              Total |         74      100.00
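                        A quick arithmetic check of why the groups cannot be exact tenths here, using only the counts in the table above:

                        ```stata
                        display 0.40 * 74        // 29.6 observations would make an exact bottom 40%
                        display 8 + 7 + 8 + 7    // 30 observations actually fall in groups 1 to 4
                        display 30 / 74          // so the "bottom 40%" is really about 40.5% of the sample
                        display 0.10 * 74        // 7.4 observations would make an exact top 10%
                        display 7 / 74           // group 10 holds about 9.5% of the sample
                        ```

                        With whole observations assigned to groups, the step-function shares refer to roughly 40.5% and 9.5% of the sample rather than exactly 40% and 10%, which is the gap that interpolation closes.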


                        • #13
                          PS (again hats off to Ben Jann) Ben's -pshare- with the step option provides results matching those of -sumdist- (and -svylorenz-).

                          Code:
                          . sumdist price
                           
                          Distributional summary statistics, 10 quantile groups
                          
                          ---------------------------------------------------------------------------
                          Quantile  |
                          group     |    Quantile  % of median     Share, %      L(p), %        GL(p)
                          ----------+----------------------------------------------------------------
                                  1 |    3895.000       77.799        6.428        6.428      396.297
                                  2 |    4099.000       81.874        6.178       12.606      777.176
                                  3 |    4425.000       88.385        7.511       20.117     1240.270
                                  4 |    4647.000       92.819        6.946       27.063     1668.514
                                  5 |    4934.000       98.552        7.352       34.415     2121.784
                                  6 |    5705.000      113.952        9.260       43.675     2692.689
                                  7 |    6165.000      123.140        8.999       52.674     3247.473
                                  8 |    7827.000      156.337       11.720       64.394     3970.068
                                  9 |   11385.000      227.404       15.014       79.408     4895.689
                                 10 |                                20.592      100.000     6165.257
                          ---------------------------------------------------------------------------
                          Share = quantile group share of total price; 
                          L(p)=cumulative group share; GL(p)=L(p)*mean(price)
                          
                          
                          . pshare price, nq(10)  // default -- addresses ties and flat regions
                          
                          Percentile shares (proportion)              Number of obs = 74
                          
                          --------------------------------------------------------------
                                 price | Coefficient  Std. err.     [95% conf. interval]
                          -------------+------------------------------------------------
                                  0-10 |   .0591567   .0034043      .0523719    .0659415
                                 10-20 |   .0651037   .0033929      .0583418    .0718657
                                 20-30 |   .0691512   .0035005      .0621747    .0761278
                                 30-40 |   .0731457   .0034224      .0663249    .0799666
                                 40-50 |   .0775944   .0033376      .0709424    .0842463
                                 50-60 |   .0850976   .0038908      .0773433    .0928519
                                 60-70 |   .0947857   .0033161      .0881768    .1013947
                                 70-80 |   .1061822   .0050732      .0960714    .1162931
                                 80-90 |    .153878     .01664      .1207145    .1870414
                                90-100 |   .2159047   .0102057      .1955647    .2362447
                          --------------------------------------------------------------
                          
                          . pshare price, nq(10) step
                          
                          Percentile shares (proportion)              Number of obs = 74
                          
                          --------------------------------------------------------------
                                 price | Coefficient  Std. err.     [95% conf. interval]
                          -------------+------------------------------------------------
                                  0-10 |   .0642791   .0036629       .056979    .0715792
                                 10-20 |   .0617782   .0032048      .0553911    .0681653
                                 20-30 |   .0751136   .0037532      .0676335    .0825937
                                 30-40 |   .0694607   .0032059      .0630714    .0758501
                                 40-50 |   .0735201   .0031556      .0672309    .0798093
                                 50-60 |   .0926004   .0047564       .083121    .1020798
                                 60-70 |   .0899855   .0029514      .0841035    .0958676
                                 70-80 |   .1172043   .0072896      .1026761    .1317325
                                 80-90 |   .1501351   .0148564      .1205263     .179744
                                90-100 |   .2059229   .0094743      .1870407    .2248051
                          --------------------------------------------------------------
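                          As a check, the two Palma ratios discussed in this thread can be recovered by hand from the shares printed above (numbers copied from the two -pshare- tables):

                          ```stata
                          * step option: top 10% share over the sum of the bottom four shares
                          * (about .7609, the -sumdist- result)
                          display .2059229 / (.0642791 + .0617782 + .0751136 + .0694607)
                          * default (interpolated) shares: about .8100, the -dstat- result
                          display .2159047 / (.0591567 + .0651037 + .0691512 + .0731457)
                          ```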


                          • #14
                            Following up on #13: I found that -sumdist- and -pshare- crossed paths at least 6 years ago: https://www.statalist.org/forums/for...ist-and-pshare
                            BTW, could -sumdist- add another column showing the mean of the variable in each decile group? That would avoid having to run xtile and tabstat to get those means. Admittedly, the qgp(newvarname) option already creates a new variable (say qgpvar) identifying the quantile group of each observation, and running tabstat variable, by(qgpvar) then gives what I want.
                            Last edited by Chen Samulsion; 05 Jan 2025, 20:07.


                            • #15
                              Chen Samulsion: request noted, but I'm unlikely to act on it, and not simply because of lack of time. You've given a good, easy way to get those means yourself, and it is probably what I'd use myself. Also note that the returned results for the generalized Lorenz ordinates take you almost there anyway: for p = F(y), GL(p) = p * (mean of y among the poorest 100p%). See the reference books.
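                              A sketch of both routes on the auto data. The first uses the qgp() option and -tabstat- as suggested in #14. The second backs the group means out of the GL(p) column printed by -sumdist- earlier in the thread, using the actual cumulative population shares (8/74 for group 1, 7/74 more for group 2) rather than exact tenths.

                              ```stata
                              sysuse auto, clear
                              * route 1: group membership from sumdist, then group means
                              sumdist price, ngp(10) qgp(decgp)
                              tabstat price, by(decgp) statistics(mean n)
                              * route 2: group means from the generalized Lorenz ordinates
                              display 396.297 / (8/74)                 // group 1 mean, roughly 3665.7
                              display (777.176 - 396.297) / (7/74)     // group 2 mean, roughly 4026.4
                              ```

                              The small differences from the -tabstat- means come only from the rounding of the printed GL(p) values.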
