partly transpose the dataset

Claudia Meurs

Join Date: Dec 2023

Posts: 3
#1

partly transpose the dataset

28 Feb 2025, 04:24

I have a database like this:
id sub_id fol_id A B

1 1 1 15 6

2 1 1 20 6

2 2 1 25 7

3 1 1 25 6

3 2 1 15 7

3 2 2 20 7

3 2 3 25 7

4 1 1 15 6

5 1 1 15 5

5 1 2 15 6

duplicates of the combination of the variables id and sub_id are currently seperate rows in the datase and they can be distinqued by fol_id. I would like these to be in seperate columns. In this example the combinations: 'id=3 and sub-id=2' and 'id=5 and sub_id=1'

so the datase would be changed into:
id sub_id fol_id1 A1 B1 fol_id2 A2 B2 fol_id3 A3 B3

1 1 1 15 6

2 1 1 20 6

2 2 1 25 7

3 1 1 25 6

3 2 1 15 7 2 20 7 3 25 7

4 1 1 15 6

5 1 1 15 5 2 15 6

Could anyone give me tips how to do this?
Tags: None

Nick Cox

Join Date: Mar 2014
Posts: 35239

28 Feb 2025, 04:45

Thanks for your data example, which can translated easily to a dataex example. (See FAQ Advice #12.)

You can get what you ask for as a reshape wide, although as is true of almost all reshape wide applications, there is a question of why you want this and whether you are better off afterwards.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id sub_id fol_id a b)
1 1 1 15 6
2 1 1 20 6
2 2 1 25 7
3 1 1 25 6
3 2 1 15 7
3 2 2 20 7
3 2 3 25 7
4 1 1 15 6
5 1 1 15 5
5 1 2 15 6
end

reshape wide a b, i(id sub_id) j(fol_id)

list  

     +-------------------------------------------+
     | id   sub_id   a1   b1   a2   b2   a3   b3 |
     |-------------------------------------------|
  1. |  1        1   15    6    .    .    .    . |
  2. |  2        1   20    6    .    .    .    . |
  3. |  2        2   25    7    .    .    .    . |
  4. |  3        1   25    6    .    .    .    . |
  5. |  3        2   15    7   20    7   25    7 |
     |-------------------------------------------|
  6. |  4        1   15    6    .    .    .    . |
  7. |  5        1   15    5   15    6    .    . |
     +-------------------------------------------+

Here fol_id just becomes a variable name suffix and it is redundant given that.

If you really need those extra variables, this works too with your data example.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id sub_id fol_id a b)
1 1 1 15 6
2 1 1 20 6
2 2 1 25 7
3 1 1 25 6
3 2 1 15 7
3 2 2 20 7
3 2 3 25 7
4 1 1 15 6
5 1 1 15 5
5 1 2 15 6
end

clonevar which=fol_id 

reshape wide a b fol_id, i(id sub_id) j(which)

list

Announcement

partly transpose the dataset

Comment

id	sub_id	fol_id1	A1	B1	fol_id2	A2	B2	fol_id3	A3	B3
1	1	1	15	6
2	1	1	20	6
2	2	1	25	7
3	1	1	25	6
3	2	1	15	7	2	20	7	3	25	7
4	1	1	15	6
5	1	1	15	5	2	15	6