  • ineqdeco for multiple categories (land inequality)?

    Dear Statalist Community,

    I am having trouble calculating the Theil index for land inequality using ineqdeco. I want to calculate this index for each of my municipalities. However, for each municipality, I have 10 categories of land plots by size. For each category I have data on the number of plots (num_cat*) and on the total area of all plots in hectares (surfac_cat*).

    I want to calculate land inequality taking into account both the surface and the number of plots in each category. Basically, my goal is to replicate this formula for the Theil index: https://psychology.fandom.com/wiki/I...t_computations.

    However, I don't know how to operationalize that using ineqdeco, given that I have two things to consider (surface and number) and 10 categories...
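
    For reference, one way to write the Theil index for grouped data like these is sketched here. The notation is an assumption and is not taken from the linked page: for a single municipality, n_k is the number of plots in category k (num_cat*), s_k is the total area in category k (surfac_cat*), and N and S are their sums over categories. If every plot in a category is treated as having the category's mean size s_k / n_k, the Theil T index, that is GE(1), reduces to a sum over the 10 categories:

    ```latex
    T = \sum_{k=1}^{K} \frac{s_k}{S}\,
        \ln\!\left(\frac{s_k / n_k}{S / N}\right),
    \qquad N = \sum_{k=1}^{K} n_k,
    \quad S = \sum_{k=1}^{K} s_k,
    \quad K = 10 .
    ```

    Categories with n_k = 0 or s_k = 0 contribute nothing to the sum. Because variation in plot size within a category is ignored, this is only the between-category component, and so a lower bound on inequality across individual plots.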

    Thank you so much!!

    Cat

    Code:
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
       muni_code |      3,330    3598.177    2027.443          4       7159
        num_cat1 |      3,306    105.6086    167.9347          0       3295
        num_cat2 |      3,306    70.99274    97.24655          0       2726
        num_cat3 |      3,306    161.9897    168.7842          0       3361
        num_cat4 |      3,306     66.8902    65.31594          0        903
    -------------+---------------------------------------------------------
        num_cat5 |      3,306    52.99335    59.39432          0        649
        num_cat6 |      3,303    22.65789    34.31665          0        406
        num_cat7 |      3,275    8.865954    18.14336          0        302
        num_cat8 |      3,245    2.153467    6.430017          0        121
        num_cat9 |      3,261    1.586017    4.444032          0        101
    -------------+---------------------------------------------------------
       num_cat10 |      3,229    .4168473    .9689932          0         14
     surfac_cat1 |      3,204    23.85705    35.14102          0        890
     surfac_cat2 |      3,204    55.17603    76.61556          0       2165
     surfac_cat3 |      3,204    304.6957    313.7701          0       6267
     surfac_cat4 |      3,204    255.9566    251.9805          0       3485
    -------------+---------------------------------------------------------
     surfac_cat5 |      3,204    363.7185    417.5487          0       4770
     surfac_cat6 |      3,204    312.0147    485.0082          0       5657
     surfac_cat7 |      3,204    261.8645    566.1812          0      10880
     surfac_cat8 |      3,204    150.0268    459.4285          0       8735
     surfac_cat9 |      3,204    324.7272     938.089          0      23031
    -------------+---------------------------------------------------------
    surfac_cat10 |      3,204    668.2612    1724.493          0      19849







  • #2
    I don't know exactly what a Theil index is here. My recollection is that it is related to entropy in the Shannon sense.

    Thanks for the link, but it didn't give me a quick and easy answer.

    https://journals.sagepub.com/doi/pdf...6867X241276115 is a tutorial essay trying to underline that many of the indexes in this territory yield to a few lines of direct code. That essay may or may not be close to your set-up.
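
    In the spirit of "a few lines of direct code", a minimal sketch for the wide layout in #1 might look like the following. It is not taken from the tutorial. It computes the grouped Theil T (GE(1)) for each municipality, treating every plot in a category as having that category's mean size, so within-category inequality is ignored. The names N, S, ok1-ok10 and theil_wide are hypothetical, and categories with a missing or zero count or area are simply left out.

    ```stata
    * Sketch only: one row per municipality, with num_cat1-num_cat10 (plot
    * counts) and surfac_cat1-surfac_cat10 (total area) as in #1.
    * N, S, ok1-ok10 and theil_wide are hypothetical names.

    generate double N = 0    // plots in usable categories
    generate double S = 0    // area in usable categories

    forvalues k = 1/10 {
        * usable = positive, nonmissing count and area
        generate byte ok`k' = num_cat`k' > 0 & surfac_cat`k' > 0 ///
            & !missing(num_cat`k', surfac_cat`k')
        replace N = N + num_cat`k'    if ok`k'
        replace S = S + surfac_cat`k' if ok`k'
    }

    * Theil T = sum over k of (s_k/S) * ln((s_k/n_k) / (S/N));
    * left missing when a municipality has no usable category
    generate double theil_wide = 0 if N > 0
    forvalues k = 1/10 {
        replace theil_wide = theil_wide + (surfac_cat`k'/S) * ///
            ln((surfac_cat`k'/num_cat`k') / (S/N)) if ok`k'
    }

    drop ok1-ok10
    summarize theil_wide
    ```

    Looping over categories keeps everything in one row per municipality, so no reshape is needed before the result can be merged or mapped.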



    • #3
      Hi Nick,

      Thank you very much! Yes, the Theil index is an entropy measure. I did calculate it by hand using the instructions on that link, but wanted to calculate it using a package just to double-check.

      I will look into the tutorial you sent me.

      Many thanks again.



      • #4
        Hello again,

        I have a follow-up question.

        I managed to restructure my dataset into long format, so that I now have four variables: an identifier for each municipality (municipality_id), the plot-size category (category, with each of my 10 categories), the number of land plots in each category (number), and the area of land occupied by each category (superf). This means my dataset now has 10 categories for each municipality, allowing me to use ineqdeco.

        If I'm not mistaken, the command below calculates it so that the area occupied by each category is weighted by the number of plots (if that makes sense?).

        However, I don't know how to create a variable that gives me the Theil index for each municipality (which I understand is the GE(1) indicator). Is it possible to do that?

        Once again thank you so much.


        UPDATE: I believe I may have to create a variable (gen w = superf/number) to use the command correctly: ineqdeco w, by(category). But I'm not sure about this...


        Code:
        ineqdeco superf [fw=number], by(category)
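
        Note that this command pools all municipalities into one calculation and uses each category's total area, not its mean plot size, as the "income" variable. A hedged sketch of one way to get a municipality-level GE(1) instead is to build w = superf/number, as in the UPDATE, and run ineqdeco once per municipality with the plot counts as frequency weights. The names theil_muni and tag are hypothetical, and the sketch assumes that ineqdeco leaves GE(1) in r(ge1), which is worth confirming with return list after a single run.

        ```stata
        * Sketch only: long layout (municipality_id, category, number, superf),
        * with ineqdeco installed (ssc install ineqdeco).
        * theil_muni and tag are hypothetical names.

        * Mean plot size in each municipality-category cell
        generate double w = superf / number if number > 0 & !missing(number, superf)

        * Holder for the municipality-level Theil index, GE(1)
        generate double theil_muni = .

        levelsof municipality_id, local(munis)
        foreach m of local munis {
            * fweights must be integers; number is a count of plots
            capture quietly ineqdeco w [fw = number] if municipality_id == `m'
            if _rc == 0 {
                * r(ge1) is assumed to hold GE(1); check with -return list-
                quietly replace theil_muni = r(ge1) if municipality_id == `m'
            }
        }

        * One value per municipality, repeated on each of its rows
        egen byte tag = tag(municipality_id)
        summarize theil_muni if tag
        ```

        Because each category enters only through its mean plot size, this measures inequality between size categories within a municipality. Any spread of plot sizes inside a category is not captured.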

        Code:
        Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
        municipality_id |     32,020    3576.872    2033.534          4       7159
            category |     32,020         5.5    2.872326          1         10
              number |     32,020    49.29988    101.2329          0       3361
              superf |     32,020    271.9781    725.1904          0      23031

        (Note: this is the full output from ineqdeco.)

        Code:
        . ineqdeco superf [fw=number], by(category)
         
        Warning: superf has 45 values = 0. Not used in calculations
         
        Percentile ratios
        
        ----------------------------------------------------------
          All obs |    p90/p10     p90/p50     p10/p50     p75/p25
        ----------+-----------------------------------------------
                  |     38.793       4.182       0.108       7.210
        ----------------------------------------------------------
          
        Generalized Entropy indices GE(a), where a = income difference
         sensitivity parameter, and Gini coefficient
        
        ----------------------------------------------------------------------
          All obs |     GE(-1)       GE(0)       GE(1)       GE(2)        Gini
        ----------+-----------------------------------------------------------
                  |    2.95430     0.82697     0.69088     1.24846     0.60462
        ----------------------------------------------------------------------
          
        Atkinson indices, A(e), where e > 0 is the inequality aversion parameter
        
        ----------------------------------------------
          All obs |     A(0.5)        A(1)        A(2)
        ----------+-----------------------------------
                  |    0.31192     0.56263     0.85525
        ----------------------------------------------
          
        Subgroup summary statistics, for each subgroup k = 1,...,K:
          
        
        ---------------------------------------------------------------------------------
         category |   Popn. share          Mean Relative mean  Income share     log(mean)
        ----------+----------------------------------------------------------------------
                1 |       0.21802      74.99103       0.15190       0.03312       4.31737
                2 |       0.14505     159.49305       0.32307       0.04686       5.07200
                3 |       0.32694     633.28712       1.28279       0.41940       6.45092
                4 |       0.13269     502.45646       1.01778       0.13505       6.21951
                5 |       0.10532     833.12444       1.68759       0.17774       6.72518
                6 |       0.04575    1051.79861       2.13054       0.09747       6.95826
                7 |       0.01786    1424.36883       2.88522       0.05152       7.26148
                8 |       0.00432    1539.20757       3.11784       0.01346       7.33902
                9 |       0.00322    2924.18567       5.92327       0.01906       7.98077
               10 |       0.00083    3743.00607       7.58188       0.00633       8.22764
        ---------------------------------------------------------------------------------
          
        Subgroup indices: GE_k(a) and Gini_k
        
        ----------------------------------------------------------------------
         category |     GE(-1)       GE(0)       GE(1)       GE(2)        Gini
        ----------+-----------------------------------------------------------
                1 |    1.09650     0.60443     0.63483     1.17590     0.57118
                2 |    0.93962     0.56589     0.64026     1.37164     0.55858
                3 |    0.53610     0.37747     0.41143     0.68948     0.46557
                4 |    0.48688     0.31721     0.31252     0.42874     0.42156
                5 |    0.65181     0.36077     0.33359     0.42743     0.44161
                6 |    1.07054     0.46722     0.38325     0.45164     0.47924
                7 |    1.66760     0.65580     0.55023     0.74984     0.56184
                8 |    1.95316     0.84290     0.72246     0.98764     0.63344
                9 |    1.83596     0.77082     0.68199     1.02863     0.61329
               10 |    0.66375     0.44341     0.40515     0.49064     0.49453
        ----------------------------------------------------------------------
          
        Within-group inequality, GE_W(a)
        
        ----------------------------------------------------------
          All obs |     GE(-1)       GE(0)       GE(1)       GE(2)
        ----------+-----------------------------------------------
                  |    2.27350     0.45693     0.41607     0.97111
        ----------------------------------------------------------
                      
        Between-group inequality, GE_B(a):
        
        ----------------------------------------------------------
          All obs |     GE(-1)       GE(0)       GE(1)       GE(2)
        ----------+-----------------------------------------------
                  |    0.68079     0.37003     0.27481     0.27735
        ----------------------------------------------------------
                      
        Subgroup Atkinson indices, A_k(e)
        
        ----------------------------------------------
         category |     A(0.5)        A(1)        A(2)
        ----------+-----------------------------------
                1 |    0.26777     0.45361     0.68681
                2 |    0.25994     0.43215     0.65269
                3 |    0.17850     0.31440     0.51742
                4 |    0.14515     0.27182     0.49335
                5 |    0.15875     0.30286     0.56590
                6 |    0.19029     0.37326     0.68164
                7 |    0.26036     0.48098     0.76933
                8 |    0.33082     0.56954     0.79618
                9 |    0.30820     0.53737     0.78596
               10 |    0.19377     0.35816     0.57036
        ----------------------------------------------
          
        Within-group inequality, A_W(e)
        
        ----------------------------------------------
          All obs |     A(0.5)        A(1)        A(2)
        ----------+-----------------------------------
                  |    0.18725     0.33901     0.57293
        ----------------------------------------------
         
        Between-group inequality, A_B(e)
        
        ----------------------------------------------
          All obs |     A(0.5)        A(1)        A(2)
        ----------+-----------------------------------
                  |    0.15340     0.33831     0.66107
        ----------------------------------------------
        
        .
        end of do-file
        Last edited by Cat Santos; 12 Dec 2024, 14:02.
