  • Gini Coefficient for Grouped Data

    Hi everyone,

    I need to calculate the Gini coefficient for a dataset that is organised as follows.

    County ID - Group - Number of People - Average Income
    1 - 1 - 100 - 234
    1 - 2 - 264 - 745
    1 - 3 - 145 - 1,565

    and so on, for multiple counties. Each county has group-level data, and each group represents a particular income bracket (1 is 0-500, 2 is 500-1,000, 3 is 1,000-2,000, and so on). My assumption is that all 100 people in Group 1 have an income of 234, all 264 people in Group 2 have an income of 745, and so on. How can I calculate the Gini coefficient at county level?

    If I use ginidesc Average Income, bygroup(County ID), Stata treats each row as a single observation and calculates the Gini coefficient from those rows alone. In my case each group stands for many people, so I need some other command or option.

    Any help or suggestion is highly appreciated.

    Thanks a ton

    Best Regards
    Pushkar

  • #2
    Dear Pushkar Singh,

    in general the estimation of the Gini index from grouped data leads to an underestimation of the true Gini, essentially because grouping "obscures" within-group inequality.
    For a discussion on this issue and a proposed solution (among others), see for example this paper: http://www.mitpressjournals.org/doi/...2/REST_a_00103.
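
    One way to see where the underestimation comes from, offered as a sketch and not as the method of the paper linked above: when the groups are non-overlapping income brackets, the Gini index splits into a between-group part and a within-group part. The symbols below are notation introduced here for illustration: p_k is the population share of bracket k, s_k its income share, G_k the Gini inside bracket k, and G_B the Gini obtained when everyone in a bracket is assigned the bracket mean.

    ```latex
    G \;=\; G_B \;+\; \sum_{k=1}^{K} p_k \, s_k \, G_k ,
    \qquad p_k = \frac{n_k}{N}, \quad s_k = \frac{n_k \mu_k}{N \mu}
    ```

    Grouped data only identify G_B. Every term in the sum is non-negative, so G_B can never exceed G, and the gap is exactly the within-bracket inequality that grouping hides.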

    However, if you still want to compute the Gini index without any correction, you can use frequency weights.
    Code:
    * avg_income, n_people and county_id are placeholder names; Stata variable names cannot contain spaces
    ginidesc avg_income [fw = n_people], bygroup(county_id)
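
    To make explicit what frequency weighting means here, the following is a minimal Python sketch that treats every person in a group as having exactly the group's average income and computes the Gini as half the relative mean absolute difference. The function name grouped_gini is made up for this illustration, and the result is not guaranteed to match ginidesc digit for digit, since the estimator that command uses has not been checked here.

    ```python
    def grouped_gini(counts, values):
        """Gini coefficient when counts[i] people each have income values[i]."""
        n = sum(counts)
        total = sum(c * v for c, v in zip(counts, values))
        if n <= 0 or total <= 0:
            raise ValueError("need a positive population and positive total income")
        groups = list(zip(counts, values))
        # Sum of |x_i - x_j| over all ordered pairs of people, done group by group.
        abs_diff = sum(ci * cj * abs(vi - vj)
                       for ci, vi in groups
                       for cj, vj in groups)
        # G = (mean absolute difference) / (2 * mean) = abs_diff / (2 * n * total)
        return abs_diff / (2 * n * total)


    # County 1 from the first post: 100, 264 and 145 people
    # with average incomes 234, 745 and 1,565.
    county_1 = grouped_gini([100, 264, 145], [234, 745, 1565])
    print(round(county_1, 4))  # about 0.2821
    ```

    Running the same function once per county reproduces the bygroup() idea. The figure is a lower bound in the sense discussed above, because inequality inside each bracket is ignored.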
    Hope it helps,

    Marco


    • #3
      Thanks a ton Marco


      • #4
        In reply to Marco Savegnago (#2):

        Thanks a ton, Marco. I appreciate your feedback. I have read the discussion; it is insightful.


        • #5
          Hi everyone,

          Like Pushkar, I want to calculate a Gini coefficient for grouped data. I am aware of Marco's warning that estimating the Gini from grouped data underestimates the true Gini.

          Is there a Stata command which computes the Gini from grouped data with a correction, for example as proposed in the paper posted? (or any other method)

          Thank you!

          Best Regards
          Martin


          • #6
            If there were a Stata command to do this, it would surely need some flexibility given options such as

            1. Use the midpoint of the interval bounds.

            2. Use the geometric mean of the interval bounds (with modification for any interval starting with zero).

            3. Impute according to a uniform distribution within each interval.

            4. As in option 3, but with some skewed distribution instead.

            5. That can't be a complete list of not quite crazy options.

            Disclaimer: I've not read the paper cited.
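
            As a rough illustration of how much the choice among such options can matter, here is a minimal Python sketch of options 1 and 3 only. It assumes the bracket bounds from the first post (0-500, 500-1,000, 1,000-2,000), ignores the reported group averages, and for the uniform case uses two standard facts: a uniform distribution on [a, b] has Gini (b - a) / (3(a + b)), and for non-overlapping brackets the overall Gini is the between-bracket Gini plus the sum of population share times income share times within-bracket Gini. All function names are invented for this example, and an open-ended top bracket is not handled.

            ```python
            def grouped_gini(counts, values):
                """Gini when counts[i] people each have income values[i]."""
                n = sum(counts)
                total = sum(c * v for c, v in zip(counts, values))
                groups = list(zip(counts, values))
                abs_diff = sum(ci * cj * abs(vi - vj)
                               for ci, vi in groups
                               for cj, vj in groups)
                return abs_diff / (2 * n * total)


            def gini_midpoint(counts, brackets):
                """Option 1: everyone sits at the midpoint of their bracket."""
                mids = [(lo + hi) / 2 for lo, hi in brackets]
                return grouped_gini(counts, mids)


            def gini_uniform(counts, brackets):
                """Option 3: incomes spread uniformly within each bracket."""
                mids = [(lo + hi) / 2 for lo, hi in brackets]  # uniform mean = midpoint
                n = sum(counts)
                total = sum(c * m for c, m in zip(counts, mids))
                between = grouped_gini(counts, mids)
                within = 0.0
                for c, (lo, hi), m in zip(counts, brackets, mids):
                    pop_share = c / n
                    income_share = c * m / total
                    gini_k = (hi - lo) / (3 * (lo + hi))  # Gini of uniform on [lo, hi]
                    within += pop_share * income_share * gini_k
                return between + within


            counts = [100, 264, 145]
            brackets = [(0, 500), (500, 1000), (1000, 2000)]
            print(round(gini_midpoint(counts, brackets), 3))  # about 0.268
            print(round(gini_uniform(counts, brackets), 3))   # about 0.313
            ```

            On these numbers the uniform assumption adds roughly 0.045 to the midpoint figure, which is the within-bracket inequality that a point-value approach leaves out.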
