How to sort the values of a variable within one observation while retaining the variable name?

AnneLilly

Join Date: Apr 2014
Posts: 3

How to sort the values of a variable within one observation while retaining the variable name?

04 Apr 2014, 13:59

Hello,

I am working with a data set that looks like this:

ID	Item1	Item2	Item3	Item4	Item5	Item6	Item7	Item8	Item9
1	20.2	14.6	37.0	82.7	16.0	82.1	71.1	76.2	78.0
2	12.1	6.7	35.0	76.7	5.0	90.2	93.3	76.7	75.0
3	72.8	22.0	45.6	91.8	25.6	96.1	88.1	96.0	85.8

Each row represents one school's response to a survey, with each column representing that school's response to each survey question.

I need to generate new variables that show the survey question that received the highest score and the value of that survey question for each school. Furthermore, I need to generate new variables that show the survey question that received the second highest score and the value of that survey question for each school. To clarify, I'd like to be able to create the table below:

ID	Item1	Item2	Item3	Item4	Item5	Item6	Item7	Item8	Item9	Highest Item	Highest Item Value	2nd Highest Item	2nd Highest Item Value	Lowest Item	Lowest Item Value	2nd Lowest Item	2nd Lowest Item Value
1	20.2	14.6	37.0	82.7	16.0	82.1	71.1	76.2	78.0	Item4	82.7	Item6	82.1	Item2	14.6	Item5	16.0
2	12.1	6.7	35.0	76.7	5.0	90.2	93.3	76.7	75.0	Item7	93.3	Item6	90.2	Item5	5.0	Item2	6.7
3	72.8	22.0	45.6	91.8	25.6	96.1	88.1	96.0	85.8	Item6	96.1	Item8	96.0	Item2	22.0	Item5	25.6

I searched through previous Statalist postings and on some other websites to try to find the best way to go about doing this. One source suggested using reshape then sort then reshape again. The resulting data set would have all of the items for each school organized from lowest score to highest score, but I would not have any way of identifying which item corresponds to each score.

Thank you for any guidance you can provide!
Best,
Anne

Tags: None

Joe Canner

Join Date: Mar 2014
Posts: 580

04 Apr 2014, 14:41

Anne,

Using two reshape commands is right, but you need a bit more than that:

Code:

reshape long Item, i(ID) j(item_no)

bysort ID (Item): gen lowest=Item[1]
bysort ID (Item): gen which_is_lowest="Item"+string(item_no[1])

bysort ID (Item): gen second_lowest=Item[2]
bysort ID (Item): gen which_is_2nd_lowest="Item"+string(item_no[2])

bysort ID (Item): gen highest=Item[_N]
bysort ID (Item): gen which_is_highest="Item"+string(item_no[_N])

bysort ID (Item): gen second_highest=Item[_N-1]
bysort ID (Item): gen which_is_2nd_highest="Item"+string(item_no[_N-1])

reshape wide Item, i(ID) j(item_no)

This can be shortened and generalized to:

Code:

reshape long Item, i(ID) j(item_no)

forvalues x=1/9 {
  bysort ID (Item): gen rank`x'=Item[`x']
  bysort ID (Item): gen which`x'="Item"+string(item_no[`x'])
}

reshape wide Item, i(ID) j(item_no)

rank1 is the lowest, rank2 is the second lowest, rank8 is the second highest, and rank9 is the highest. The rest can be dropped if desired.

Regards,
Joe

Comment

Nick Cox

Join Date: Mar 2014

Posts: 35207
#3

05 Apr 2014, 03:33

search rowsort (SJ)
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35207
#4

07 Apr 2014, 03:45

To expand on my earlier remark (typed on a mobile while away from base):

There is a review of working rowwise in

SJ-9-1 pr0046 . . . . . . . . . . . . . . . . . . . Speaking Stata: Rowwise
(help rowsort, rowranks if installed) . . . . . . . . . . . N. J. Cox
Q1/09 SJ 9(1):137--157
shows how to exploit functions, egen functions, and Mata
for working rowwise; rowsort and rowranks are introduced

which is accessible at http://www.stata-journal.com/sjpdf.h...iclenum=pr0046
Comment
AnneLilly

Join Date: Apr 2014

Posts: 3
#5

07 Apr 2014, 13:59

Thank you, Joe and Nick. Your posts were helpful!

I have two additional questions:

1. When I ran the code to tag the lowest and second lowest items, the tagging worked well. However, when I ran this code for the highest and next highest items:

bysort ID (Item): gen highest=Item[_N] bysort ID (Item): gen which_is_highest="Item"+string(item_no[_N]) bysort ID (Item): gen second_highest=Item[_N-1] bysort ID (Item): gen which_is_2nd_highest="Item"+string(item_no[_N-1]) ...the "highest" variable ended up being missing for all observations. For all of my respondents, there are some survey questions (variables) with missing data. I know Stata typically considers missing data to be larger than any nonmissing data, but is there a command I could use in this case to get Stata to exclude the missing values from its designation of the highest item?

2. I read up on the rowranks and rowsort commands in The Stata Journal article that you referenced. With rowranks, in the case of a tie, is there a way to get Stata to assign the rank to one variable, assigning a blank rank to all of the other variables with the same value?

Basically, I would want the ranks to look like this:
Observation Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10

Obs1 81 90 90 70 90 65 100 95 81 98

Rank 3 4 2 1 7 5 6

Thank you again for your help!
Anne
Comment
Joe Canner

Join Date: Mar 2014

Posts: 580
#6

07 Apr 2014, 15:33

Anne,

After I posted, I wondered if that issue would bite. I think you can solve it by dropping missing values after the first reshape:

Code:

reshape long Item, i(ID) j(item_no) drop if mi(Item) .... reshape wide Item, i(ID) j(item_no)

This will eliminate missings from consideration as possible maximums, but when the second reshape happens, the missings will be restored. rowsort has it's own way of dealing with this issue. Regards, Joe
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35207
#7

08 Apr 2014, 07:11

rowranks (SJ) does not support the idiosyncratic ranking you want, which consists of ranking only distinct marks.
Comment
AnneLilly

Join Date: Apr 2014

Posts: 3
#8

08 Apr 2014, 12:13

Thank you!
Comment

Observation	Var1	Var2	Var3	Var4	Var5	Var6	Var7	Var8	Var9	Var10
Obs1	81	90	90	70	90	65	100	95	81	98
Rank	3	4		2		1	7	5		6

Announcement

How to sort the values of a variable within one observation while retaining the variable name?

Comment

Comment

Comment

Comment

Comment

Comment

Comment