Matrix operations: foreach, if

Tobias Scherl

Join Date: Jul 2015

Posts: 29
#1

07 Jul 2015, 10:02

Hi everyone,

I am writing my master's thesis and am relatively new to Stata, and especially to Mata.
My problem is the following: my data contain population shares and the county each person lives in.

Now I need to create a matrix to compute several diversity indices (cultural diversity, linguistic diversity, genetic diversity).

For a single county, building that matrix is not a problem. I did it as follows:

mkmat share if county==1, matrix(share1)

But because I am examining many counties and many years, I would rather not write that command separately for every county.
I have also tried foreach and forvalues, but instead of one matrix per county I got a single matrix for all counties together, and with that matrix I cannot compute my diversity indices correctly.

I hope I have made my problem clear.

Best,

Hans
Sergiy Radyakin

Join Date: Apr 2014

Posts: 1867
#2

07 Jul 2015, 10:20

Hans, you may have more luck with your question in the General forum on Stata than in the Mata forum.

Also, I don't see any reason to use matrices at all for your task. Why not operate on the data directly? Of course, your index may be something very special and complicated, but other indices are often computed without any matrices at all: Gini, Atkinson, etc.
Stata is flexible: it can store data in one matrix per county, in one matrix for all counties, or reuse one matrix for county after county (usually the preferred way if your computations are independent between counties and you don't need to store the results).
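
As an illustration of working on the data directly, here is a minimal sketch for the plain Herfindahl index (the sum of squared shares within each county), which is simpler than the distance-weighted index discussed in this thread. The variable names county and share follow the mkmat call in the first post; the four rows per county are made up for the example.

```stata
* Illustrative data only: two counties with four population shares each
clear
input county share
1 0.5
1 0.5
1 0
1 0
2 0.25
2 0.25
2 0.25
2 0.25
end

* Plain Herfindahl index per county: sum of squared shares, no matrices needed
egen hhi = total(share^2), by(county)

* Show one line per county
tabulate county, summarize(hhi) means
```

Because egen works within by-groups, the same line covers any number of counties (or county-year cells, with by(county year)) without a loop.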

Best, Sergiy Radyakin
Tobias Scherl

Join Date: Jul 2015

Posts: 29
#3

08 Jul 2015, 03:15

Hey Sergiy, maybe I should describe my problem in more detail. I think I have to use Mata, or rather, Mata makes life much easier, because my data have the following format: the county each person lives in, the origin, and the population share (P).

County  Origin  Populationshare
1       A       0.5
1       B       0.5
1       C       0
1       D       0
2       A       0.25
2       B       0.25
2       C       0.5
2       D       0
3       A       0.25
3       B       0.25
3       C       0.25
3       D       0.25
4       A       0
4       B       0.25
4       C       0
4       D       0.75

The index I would like to compute is the Herfindahl-Hirschman index. It is computed in the following way:

s_i and s_j are the population shares. d_ij is a measure of distance between different populations, for example cultural diversity, linguistic diversity, etc.
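
Assuming the index is the usual distance-weighted form (an assumption, based only on the product s_i*s_j*d_ij described further down in this post), it can be written as:

```latex
\[
  H \;=\; \sum_{i} \sum_{j} s_i \, s_j \, d_{ij}
\]
```

If d_ij is symmetric and d_ii = 0, this double sum is exactly twice the sum over unordered pairs i < j, so the two versions differ only by a factor of 2.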
Now, I first need a matrix of the population shares for each county. I managed to build such a matrix for one county with

mkmat share if county==1, matrix(share1)

but it is a lot of work to run this command for every county, so I am looking for a way to do it in a loop. I need this matrix for each county because I then need a diagonal matrix of the population shares for each county to compute my diversity index.
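
One possible way to run this in a loop is sketched below. It assumes the variables are called county, origin and share (share as in the mkmat call above; the other two names are guesses) and that county is numeric. The matrix names share1, share2, ... and D1, D2, ... are only illustrative.

```stata
* Example data from this post
clear
input county str1 origin share
1 "A" 0.5
1 "B" 0.5
1 "C" 0
1 "D" 0
2 "A" 0.25
2 "B" 0.25
2 "C" 0.5
2 "D" 0
3 "A" 0.25
3 "B" 0.25
3 "C" 0.25
3 "D" 0.25
4 "A" 0
4 "B" 0.25
4 "C" 0
4 "D" 0.75
end

* Fix the row order so every county's vector lists origins A, B, C, D
sort county origin

* Collect the distinct county codes, then build one matrix per county
levelsof county, local(counties)
foreach c of local counties {
    * Column vector of shares for this county only
    mkmat share if county == `c', matrix(share`c')
    * Diagonal matrix with the same shares on the diagonal
    matrix D`c' = diag(share`c')
}

matrix list share2
matrix list D2
```

The key point is that the if qualifier and the matrix name both use the loop macro `c', so each pass creates a separate matrix instead of overwriting or pooling.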

In addition to that I have a second question.

The diversity index is the same as above and the data regarding population share, county and origin is the same as well.

Now, I have got a distance measure in this format:

Origin1  Origin2  Distance
A        B        0.3
A        C        0.5
A        D        0.9
B        C        0.2
B        D        0.1
C        D        0.8

Now I need to compute the product s_i*s_j*d_ij.
To be honest, I am not quite sure how to achieve this. My first thought was to compute the product s_i*s_j with the matrix from above and then multiply it by d_ij. But I am not sure whether it works that way, or whether I first have to multiply s_i by d_ij in a certain format. My next problem is that I have no distance measure for AA; it would be equal to 0, but having it would make life much easier. So I only need to multiply s_A*s_B by d_AB; I don't have to multiply s_A*s_A by a distance measure, and I do not need to compute s_B*s_A*d_AB because I have already done that before (it is the same as s_A*s_B*d_AB).
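
A minimal Mata sketch of this computation follows, under several assumptions: the variable names county, origin and share are guesses, the distance matrix is typed in by hand from the table above in the origin order A, B, C, D with zeros on the diagonal (d_AA = 0), and every county has exactly four rows. The names D, S and H are illustrative.

```stata
* Example share data from this post
clear
input county str1 origin share
1 "A" 0.5
1 "B" 0.5
1 "C" 0
1 "D" 0
2 "A" 0.25
2 "B" 0.25
2 "C" 0.5
2 "D" 0
3 "A" 0.25
3 "B" 0.25
3 "C" 0.25
3 "D" 0.25
4 "A" 0
4 "B" 0.25
4 "C" 0
4 "D" 0.75
end
sort county origin

mata:
// Symmetric distance matrix, rows and columns ordered A, B, C, D
D = (0, .3, .5, .9 \ .3, 0, .2, .1 \ .5, .2, 0, .8 \ .9, .1, .8, 0)

// One row per county, one column per origin (relies on the sort above)
S = colshape(st_data(., "share"), 4)

// Row k of (S*D):*S holds s_i * sum_j(d_ij*s_j) for county k; its row sum
// is the full double sum, and half of it is the sum over pairs i < j
H = 0.5 :* rowsum((S * D) :* S)
H
end
```

With a zero diagonal the s_A*s_A terms drop out on their own, and the factor 0.5 removes the double counting of s_A*s_B*d_AB and s_B*s_A*d_AB. For county 1 only the pair A, B contributes, giving 0.5*0.5*0.3 = 0.075.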

Can someone help me with one of the two questions?
I hope you understand my problem.

Hans
Sergiy Radyakin

Join Date: Apr 2014

Posts: 1867
#4

08 Jul 2015, 07:34

My first thought was to compute ...

My first thought was to Google it:

HH-index:
https://ideas.repec.org/c/boc/bocode/s457512.html

Also see: https://ideas.repec.org/c/boc/bocode/s365801.html
from Nick Cox in light of the following discussion: http://www.stata.com/statalist/archi.../msg00429.html

See the following 3-line solution posted by Austin Nichols
http://www.stata.com/statalist/archi.../msg00261.html

Haven't seen this released, Google or contact the author if interested:
http://www.stata.com/meeting/2italian/Dessy.pdf

You might be programming some variation. I admit I don't have time to follow your clarifications now, but even if your formula for the index differs from the ones implemented above, see if you can still use the same techniques. Don't reinvent the wheel; use one when available.
Tobias Scherl

Join Date: Jul 2015

Posts: 29
#5

09 Jul 2015, 05:32

Hey Sergiy,

thanks for the links. I had already found them, and the HH-index computed there is a simpler version of the one I need.
But I think I've found another way to compute the index; I only need to get my variables into the right format. For that reason I am going to post in the Stata forum.

Thank you very much for your help!