  • Concentration measures for the other members of a set

    A question about HHIs was asked, and then closed, on Stack Overflow (SO): https://stackoverflow.com/questions/...lude-each-firm If you think Statalist is severe, try other sites. For once, I was not a voter to close.

    Already at a loss? HHI will be recognised by some as connoting Herfindahl and Hirschman (or vice versa) and more importantly the idea of measuring concentration of anything divided into proportional shares. At its simplest the measure (index) is just the sum of those proportions squared. To think how this behaves, consider extreme cases. If everything is in one category, there is just one positive proportion 1 and the sum of squared proportions is 1. If amounts or counts (sales, people, birds, bees, whatever) are divided equally among k categories, then the measure is k * (1/k)^2 = 1/k which tends towards 0 for arbitrarily large k. From that you can see that the reciprocal of this measure has an interpretation as an equivalent number of equally common categories and that the complement (1 minus the measure) measures the opposite of concentration (diversity, or whatever else you want to call that).
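
    In symbols, the argument above can be sketched as follows. The notation (p_i for the proportion in category i, k for the number of categories, H for the measure) is my own shorthand and not standard across fields.

    ```latex
    % p_i is the proportion in category i, with p_i >= 0 and \sum_i p_i = 1
    \[
    H = \sum_{i=1}^{k} p_i^2
    \]
    % extreme case 1: everything is in one category
    \[
    p_1 = 1 \implies H = 1
    \]
    % extreme case 2: k equally common categories
    \[
    p_i = \frac{1}{k} \text{ for all } i \implies H = k \left(\frac{1}{k}\right)^2 = \frac{1}{k}
    \]
    % reciprocal (equivalent number of equally common categories) and complement
    \[
    \frac{1}{H} = k \text{ in the equal case}, \qquad 1 - H = 1 - \sum_{i=1}^{k} p_i^2
    \]
    ```

    As a purely illustrative case with made-up shares 0.5, 0.3 and 0.2: H = 0.25 + 0.09 + 0.04 = 0.38, so 1/H is about 2.63 equally common categories and 1 - H = 0.62.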

    This idea goes back at least about a century and has independently been invented (discovered, if you will) in several sciences, including mainstream statistics, despite frequent and unfortunate disciplinary myopia leading many to presume that the idea was first thought up in one's own field and so should be named for people in that field who earlier made a fuss about it. (There is a regrettable note by Hirschman claiming priority over Herfindahl, regardless of the fact that he was far from first either. I.J. Good once wrote that any competent statistician would take 2 seconds to come up with the formula, which seems to me a little exaggerated the other way, although I am not a statistician.)

    All that said, the question on SO was about calculating this measure (multiplied by 10000 for some bizarre reason) for sales for the other firms in the same market, i.e. excluding in turn each firm in the same market. Long-term readers of this forum are likely to recognise this kind of problem as a party piece for rangestat (SSC) written by Robert Picard and friends. Here is toy data, extending the SO example, some code and some results reproducing the hand calculations of the OP.

    The check using entropyetc (SSC) is not much of a check insofar as I wrote that too, but it flags another discussion of this territory, and use of the name Simpson, which is pretty much standard in ecology (the same Simpson as is named in Simpson's paradox).

    Code:
    clear
    input str1 market firm sales
    A 1 10
    A 2 20
    A 3 50
    B 1 5
    B 2 15
    B 4 80
    end
    
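    mata mata clear
    
    mata :
    
    // sum of squared proportions; missing values are ignored
    real scalar matchprob(real colvector p) {
        p = select(p, (p :< .))        // keep nonmissing values only
        if (rows(p) == 0) return(.)    // nothing left: return missing
        p = p / sum(p)                 // scale amounts to proportions
        return(sum(p:^2))
    }
    
    end
    
    * numeric identifier of market, as interval() needs a numeric key
    egen id = group(market), label
    
    * all firms in the same market
    rangestat (matchprob) sales, int(id 0 0)
    rename matchprob1 standard
    * all other firms in the same market
    rangestat (matchprob) sales, int(id 0 0) excludeself
    rename matchprob1 others
    format standard others %6.5f
    list, sepby(market)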
    
         +-------------------------------------------------+
         | market   firm   sales   id   standard    others |
         |-------------------------------------------------|
      1. |      A      1      10    A    0.46875   0.59184 |
      2. |      A      2      20    A    0.46875   0.72222 |
      3. |      A      3      50    A    0.46875   0.55556 |
         |-------------------------------------------------|
      4. |      B      1       5    B    0.66500   0.73407 |
      5. |      B      2      15    B    0.66500   0.88927 |
      6. |      B      4      80    B    0.66500   0.62500 |
         +-------------------------------------------------+
    
    
    entropyetc firm [w=sales] , by(market)
    (analytic weights assumed)
    
    ----------------------------------------------------------------------
        Group |  Shannon H      exp(H)     Simpson   1/Simpson     dissim.
    ----------+-----------------------------------------------------------
            A |      0.900       2.460       0.469       2.133       0.375
            B |      0.613       1.846       0.665       1.504       0.550
    ----------------------------------------------------------------------
    Last edited by Nick Cox; 09 Aug 2018, 04:43.

  • #2
    You can also obtain the same H and Simpson measure by using divcat (available on SSC). Here the Simpson measure appears as 1 - GV, where GV is the generalized variance, also known as the Blau index (Blau, 1977); the Hirschman-Herfindahl Index (HHI) is its complement, 1 - GV. Proceed as follows:

    Code:
    clear
    input str1 market firm sales
    A 1 10
    A 2 20
    A 3 50
    B 1 5
    B 2 15
    B 4 80
    end
    
    egen id = group(market), label
    
    bys market: divcat firm [aw=sales], base(e)
    The result will be

    Code:
    Measures of Diversity by market
    
    -------------------------------------------------------------------------
                     | categs      GV     NGV       H      NH      RQ       n
    -----------------+-------------------------------------------------------
                   A |      3   0.531   0.797   0.900   0.819   0.828       3
                   B |      3   0.335   0.503   0.613   0.558   0.598       3
    -------------------------------------------------------------------------
    Note: Entropy (H) is calculated using the logarithm to base e

    Reference:

    Blau, P. M. (1977). Inequality and Heterogeneity. New York: Free Press.
    Last edited by Dirk Enzmann; 09 Aug 2018, 05:51.

    • #3
      Direct calculation also seems to serve well.
      Code:
      bys market: egen SumSales=total(sales)
      bys market: egen SumSqrSale=total(sales^2)
      
      gen HHI = SumSqrSale/(SumSales)^2
      gen CHHI = (SumSqrSale-sales^2)/(SumSales-sales)^2
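
      Why the last line gives the measure for the other firms can be sketched in a line of algebra. The notation is mine: x_i for the sales of firm i, and S and Q for the market totals of sales and of squared sales. After firm j is dropped, the remaining shares are x_i / (S - x_j).

      ```latex
      % S and Q are the market totals held in the two egen variables
      \[
      S = \sum_{i} x_i, \qquad Q = \sum_{i} x_i^2
      \]
      % measure for all firms in the market
      \[
      H = \sum_{i} \left(\frac{x_i}{S}\right)^2 = \frac{Q}{S^2}
      \]
      % measure for the other firms, leaving out firm j
      \[
      H_{-j} = \sum_{i \neq j} \left(\frac{x_i}{S - x_j}\right)^2 = \frac{Q - x_j^2}{(S - x_j)^2}
      \]
      ```

      For firm 1 in market A of the toy data in #1, S = 80 and Q = 100 + 400 + 2500 = 3000, so H_{-1} = (3000 - 100) / (80 - 10)^2 = 2900 / 4900, or about 0.59184, which matches the rangestat result.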

      • #4
        Romalpa Akzo Yes, excellent. Not robust to missings, but that is a detail.
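
        One way to handle missings, offered only as a minimal sketch: it assumes that the behaviour wanted is that of the rangestat code in #1, where a firm with missing sales still gets the measure for the other (nonmissing) firms in its market. The missing value for firm 3 in market A and the variable own are hypothetical, introduced purely for illustration.

        ```stata
        clear
        input str1 market firm sales
        A 1 10
        A 2 20
        A 3 .
        B 1 5
        B 2 15
        B 4 80
        end

        * egen's total() ignores missing values, so the sums use nonmissing sales only
        bysort market: egen SumSales = total(sales)
        bysort market: egen SumSqrSale = total(sales^2)

        * own contribution, taken as 0 when own sales are missing (an assumption)
        gen double own = cond(missing(sales), 0, sales)

        * division by zero yields missing, e.g. for a firm alone in its market
        gen HHI = SumSqrSale / SumSales^2
        gen CHHI = (SumSqrSale - own^2) / (SumSales - own)^2

        list, sepby(market)
        ```

        Whether a firm with missing sales should get a result at all is a matter of taste; adding if !missing(sales) to the last two generate commands would return missing instead.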
