  • Matrix operations: foreach, if

    Hi everyone,

    I am writing my master's thesis and I am relatively new to Stata, and especially to Mata.
    My problem is the following: I have observations on population shares and on the county each person lives in.

    Now I need to create a matrix to compute several diversity indices (cultural diversity, linguistic diversity, genetic diversity).

    Creating that matrix for a single county is not a problem. I did it as follows:

    mkmat share if county==1, matrix(share1)

    But because I am examining many counties and many years, I would rather not type that command separately for every single county.
    I have also tried foreach and forvalues, but instead of one matrix per county I got a single matrix for all counties together, and with that combined matrix I cannot compute my diversity indices correctly.

    I hope my problem is clear.


    Best,

    Hans


  • #2
    Hans, you may have more luck with this question in the General forum on Stata than in the Mata forum.

    Also, I don't see any reason to use matrices at all for your task. Why not operate on the data directly? Of course your index may be something very special and complicated, but other indices are often computed without any matrices at all: Gini, Atkinson, etc.
    Stata is flexible: it can store the data in one matrix per county, in one matrix for all counties, or reuse a single matrix for county after county (usually the preferred way if your computations are independent between counties and you don't need to store the results).
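
    As a minimal sketch of working on the data directly: the variable names county and share are assumed from the mkmat command in the first post, and the values are toy numbers for illustration only. It computes a plain sum of squared shares per county, which is the simple concentration index and not necessarily the index Hans needs.

```stata
* Toy data for illustration only; variable names are assumed
clear
input county double share
1 0.5
1 0.5
2 0.25
2 0.25
2 0.5
end

* Sum of squared shares within each county, no matrices involved
bysort county: egen double hhi = total(share^2)

* Flag one row per county and show its index value
egen byte first = tag(county)
list county hhi if first
```

    The result sits in an ordinary variable, so it extends to many counties and years by adding the year to the bysort list.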

    Best, Sergiy Radyakin



    • #3
      Hey Sergiy, maybe I should describe my problem in more detail. I think I have to use Mata, or rather, Mata makes life much easier, because my data has the following format: the county each person lives in, the origin, and the population share (P).
      County Origin Populationshare
      1 A 0.5
      1 B 0.5
      1 C 0
      1 D 0
      2 A 0.25
      2 B 0.25
      2 C 0.5
      2 D 0
      3 A 0.25
      3 B 0.25
      3 C 0.25
      3 D 0.25
      4 A 0
      4 B 0.25
      4 C 0
      4 D 0.75
      The index I would like to compute is the Herfindahl-Hirschman index. It is computed in the following way:

      si and sj are the population shares. dij is a measure of distance between different populations, for example cultural diversity, linguistic diversity, etc.
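
      The formula itself is missing from the post as preserved here. Judging only from the description of the symbols and from the product si*sj*dij mentioned further down, one form consistent with the text is the following. This is an assumption, not necessarily the exact expression that was posted:

```latex
H = \sum_{i} \sum_{j} s_i \, s_j \, d_{ij},
\qquad d_{ii} = 0, \quad d_{ij} = d_{ji}
```

      Under these assumptions the double sum equals twice the sum over unordered pairs i < j, which is why no distance for AA is needed and each pair such as AB only has to be handled once.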
      First, I need a matrix of the population shares for each county. I managed to create the matrix for one county with

      mkmat share if county==1, matrix(share1)

      but it is a lot of work to repeat this for every county, so I am looking for a way to do it in a loop. I need this matrix for each county because I then need a diagonal matrix of the population shares for each county to compute my diversity index.
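
      One way to loop the mkmat command over counties, as a sketch rather than a definitive solution: levelsof collects the distinct county codes and foreach builds one matrix per code. The variable names county, origin and share and the matrix names share1, S1, and so on are assumptions; the data are the rows from the table above.

```stata
* Data from the table above; variable names are assumed
clear
input county str1 origin double share
1 A 0.5
1 B 0.5
1 C 0
1 D 0
2 A 0.25
2 B 0.25
2 C 0.5
2 D 0
3 A 0.25
3 B 0.25
3 C 0.25
3 D 0.25
4 A 0
4 B 0.25
4 C 0
4 D 0.75
end

* Collect the distinct county codes in a local macro
levelsof county, local(counties)

* One column vector and one diagonal matrix per county
foreach c of local counties {
    mkmat share if county == `c', matrix(share`c')
    matrix S`c' = diag(share`c')
}

matrix list share2
matrix list S2
```

      This assumes county is a numeric code, so that the code can be appended to the matrix name. With several years, a second loop over the year values and an additional condition in the if qualifier would follow the same pattern.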

      In addition to that I have a second question.

      The diversity index is the same as above and the data regarding population share, county and origin is the same as well.

      Now, I have got a distance measure in this format:

      Origin1 Origin2 Distance
      A B 0.3
      A C 0.5
      A D 0.9
      B C 0.2
      B D 0.1
      C D 0.8

      Now I need to compute the product si*sj*dij.
      To be honest, I am not quite sure how to achieve this. My first thought was to compute the product si*sj with the matrix from above and then multiply it by dij. But I am not sure whether it works that way, or whether I first have to multiply si by dij in a certain format. My next problem is that I do not have a distance measure for AA, which would be equal to 0; having one would make life much easier. So I only need to multiply sA*sB by dAB. I do not have to multiply sA*sA by a distance, and I also do not need to compute sB*sA*dAB, because it is the same as sA*sB*dAB, which I have already computed.
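
      A possible way to get the products without matrices, sketched under assumptions: the distance table stays in its long format, joinby attaches each distance row to the share of its first origin, and merge adds the share of the second origin within the same county. All variable names (origin1, origin2, distance, s_i, s_j, term, pairsum) are chosen here for illustration. Because the distance table lists each unordered pair once and has no AA rows, the result is the sum over pairs i < j.

```stata
* Distance table from the post; variable names are assumed
clear
input str1 origin1 str1 origin2 double distance
A B 0.3
A C 0.5
A D 0.9
B C 0.2
B D 0.1
C D 0.8
end
tempfile dist
save `dist'

* Share table from the post
clear
input county str1 origin double share
1 A 0.5
1 B 0.5
1 C 0
1 D 0
2 A 0.25
2 B 0.25
2 C 0.5
2 D 0
3 A 0.25
3 B 0.25
3 C 0.25
3 D 0.25
4 A 0
4 B 0.25
4 C 0
4 D 0.75
end

* Second copy of the shares, keyed by origin2, for the partner group j
preserve
rename (origin share) (origin2 s_j)
tempfile partner
save `partner'
restore

* Attach every distance row whose first origin matches
rename (origin share) (origin1 s_i)
joinby origin1 using `dist'

* Add the partner share for the same county
merge m:1 county origin2 using `partner', keep(match) nogenerate

* One term per pair, then the sum of the terms per county
generate double term = s_i * s_j * distance
collapse (sum) pairsum = term, by(county)
list
```

      If the index is defined as the full double sum over all i and j, doubling pairsum gives that value, since the AA terms are zero and the AB and BA terms are equal.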


      Can someone help me with either of the two questions?
      I hope my problem is clear.

      Hans








      • #4
        Hans wrote: "My first thought was to compute ..."
        My first thought was to Google it:

        HH-index:
        https://ideas.repec.org/c/boc/bocode/s457512.html

        Also see: https://ideas.repec.org/c/boc/bocode/s365801.html
        from Nick Cox in light of the following discussion: http://www.stata.com/statalist/archi.../msg00429.html

        See the following 3-line solution posted by Austin Nichols
        http://www.stata.com/statalist/archi.../msg00261.html

        Haven't seen this released, Google or contact the author if interested:
        http://www.stata.com/meeting/2italian/Dessy.pdf

        You might be programming some variation. I admit I don't have time to follow your clarifications now, but even if your formula for the index differs from the ones implemented above, see whether you can still use the same techniques. Don't reinvent the wheel; use one when it is available.



        • #5
          Hey Sergiy,

          thanks for the links. I had already found them, and the HH-index computed there is a simpler version of the one I need.
          But I think I have found another way to compute the index; I only need to get my variables into the right format. For that reason I am going to post in the Stata forum.

          Thank you very much for your help!
