Dear Statalist,
I use Stata 11.2 on a Windows device and have a dataset similar to this one:
I want to create a variable that ranks the observations of another variable (v2) in descending order, with a separate ranking within each group of observations that share the same value of a third variable (v1).
The crucial point is that I want ties to share the same rank number, while the next distinct value should receive the immediately following number, without skipping rank numbers as one usually would (and as Stata does) in this case. After ranking with egen's rank() function (with the by(v1) and track options) and sorting with gsort v1 -v2,
I get this result:
Instead, the observations should look like this:
My first approach was to solve the problem with the help of a fourth variable, v4, initialized to 1 for all observations. For every observation whose v3 is larger than that of the previous observation, I wanted to replace v4 with the previous value of v4 plus 1 (condition 1). For every observation where this condition doesn't hold, I wanted to replace v4 with the previous value of v4, provided that the previous value is larger and the observation has the same value of the group variable v1 (condition 2). Therefore, I wrote this code:
I hoped that condition 1 would hold for observation 2 (v4[_n-1]+1=2). After observation 2 has changed, condition 2 would hold for the next observation, so it becomes 2 as well. Thereafter, condition 1 should be fulfilled for observation 4, so it becomes v4[_n-1]+1=3, and so on.
That didn't work, apparently because Stata doesn't run the two commands one observation at a time; it runs command 1 for all observations and then command 2 for all observations. Since condition 1 depends on previous observations, the results for later observations change if command 2 hasn't yet been applied to the preceding observations, which leaves me with these values:
I tried to solve this problem with the following foreach command:
Unfortunately, the results didn't change, so I assume that Stata still performs command 1 for all observations before starting with command 2.
I would like to know how I can tell Stata to perform both commands in immediate succession for one observation before continuing with the next one. Alternatively, I'd also be grateful for hints on any different way to solve the ranking problem.
I hope I have expressed my issue clearly enough, and I apologize for any mistakes in my English.
Kind regards
Ingo
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 v1 int v2
"A" 150
"A" 200
"A" 100
"A" 150
"A" 100
"B"  80
"B"  70
"B"  50
"B"  80
"B"  80
"B"  90
"B"  40
"B"  60
"C" 400
"C" 100
"C"  50
"C"  50
"C" 400
"C" 100
"C" 150
end
Code:
egen v3 = rank(-v2), by(v1) track
Code:
gsort v1 -v2
Code:
* Result of egen v3 = rank(-v2), by(v1) track followed by gsort v1 -v2
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 v1 int v2 float v3
"A" 200 1
"A" 150 2
"A" 150 2
"A" 100 4
"A" 100 4
"B"  90 1
"B"  80 2
"B"  80 2
"B"  80 2
"B"  70 5
"B"  60 6
"B"  50 7
"B"  40 8
"C" 400 1
"C" 400 1
"C" 150 3
"C" 100 4
"C" 100 4
"C"  50 6
"C"  50 6
end
Code:
* Desired result: ties share a rank and no rank numbers are skipped
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 v1 int v2 byte v3
"A" 200 1
"A" 150 2
"A" 150 2
"A" 100 3
"A" 100 3
"B"  90 1
"B"  80 2
"B"  80 2
"B"  80 2
"B"  70 3
"B"  60 4
"B"  50 5
"B"  40 6
"C" 400 1
"C" 400 1
"C" 150 2
"C" 100 3
"C" 100 3
"C"  50 4
"C"  50 4
end
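The target ranking shown above (ties share a rank, and the next distinct value gets the next integer) is often called a dense rank. The following is only a minimal sketch of one possible way to get it, not a definitive solution: it reuses the egen call from this post and then counts, within each v1 group, how many times v3 changes. The variable name v5 is hypothetical, and the sketch assumes v2 has no missing values.

```stata
* Example data from this post
clear
input str1 v1 int v2
"A" 150
"A" 200
"A" 100
"A" 150
"A" 100
"B"  80
"B"  70
"B"  50
"B"  80
"B"  80
"B"  90
"B"  40
"B"  60
"C" 400
"C" 100
"C"  50
"C"  50
"C" 400
"C" 100
"C" 150
end

* Rank with gaps, exactly as in the post (highest v2 gets rank 1)
egen v3 = rank(-v2), by(v1) track

* Within each v1 group, sorted by v3, add 1 whenever v3 differs from the
* previous observation. For the first observation of a group, v3[_n-1]
* is missing, so the comparison is true and the running sum starts at 1.
* v5 is a hypothetical name for the gap-free rank.
bysort v1 (v3): gen v5 = sum(v3 != v3[_n-1])

list v1 v2 v3 v5, sepby(v1)
```

Because bysort v1 (v3) sorts ascending on v3, the observations end up in descending order of v2 within each group, so no separate gsort is needed for this step.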
Code:
* First attempt: build the gap-free rank in v4 from v3
gen v4=1
replace v4=v4[_n-1]+1 if v3>v3[_n-1] & _n>1
replace v4=v4[_n-1] if v4<v4[_n-1] & v3[_n]==v3[_n-1] & _n>1
Code:
* Values of v4 after the two replace commands (not the desired result)
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 v1 int v2 float(v3 v4)
"A" 200 1 1
"A" 150 2 2
"A" 150 2 2
"A" 100 4 2
"A" 100 4 2
"B"  90 1 1
"B"  80 2 2
"B"  80 2 2
"B"  80 2 2
"B"  70 5 2
"B"  60 6 3
"B"  50 7 4
"B"  40 8 5
"C" 400 1 1
"C" 400 1 1
"C" 150 3 2
"C" 100 4 3
"C" 100 4 3
"C"  50 6 2
"C"  50 6 2
end
Code:
* Second attempt: the same two replace commands inside a foreach loop
foreach x in v2 {
    replace v4=v4[_n-1]+1 if v3>v3[_n-1] & _n>1
    replace v4=v4[_n-1] if v4<v4[_n-1] & v3[_n]==v3[_n-1] & _n>1
}
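On the order-of-execution point, one possible reading is that the loop above runs only once (over the single item v2) and still issues two separate replace commands, each of which passes over the whole dataset. A single replace, by contrast, works through the observations in order, so a value it has just assigned to v4 in one observation is available as v4[_n-1] in the next. The sketch below folds both conditions into one replace with cond(); it is an illustration under these assumptions, not a definitive fix: the data are already in the order produced by gsort v1 -v2, v3 holds the egen rank with gaps, and only groups A and B from this post are used to keep it short.

```stata
* Groups A and B from this post, already sorted by v1 and descending v2
clear
input str1 v1 int v2 float v3
"A" 200 1
"A" 150 2
"A" 150 2
"A" 100 4
"A" 100 4
"B"  90 1
"B"  80 2
"B"  80 2
"B"  80 2
"B"  70 5
"B"  60 6
"B"  50 7
"B"  40 8
end

gen v4 = 1

* One replace handles every case for an observation before moving on:
*   new v1 group       -> restart at 1
*   v3 larger than before -> previous v4 plus 1
*   otherwise (a tie)  -> copy the previous v4
replace v4 = cond(v1 != v1[_n-1], 1, cond(v3 > v3[_n-1], v4[_n-1] + 1, v4[_n-1])) if _n > 1

list v1 v2 v3 v4, sepby(v1)
```

Checking v1 explicitly matters here because the original conditions compare only v3, so without it the count would not restart at the first observation of a new group.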