Removing duplicate IDs by creating additional variables

Cora Touchstone

Join Date: Jan 2022

Posts: 25
#1

26 Jan 2022, 13:41

Hi All,

I would like to change this dataset such that each ID only appears one time, with additional columns representing "Father" and "Mother." For example, ID 4001 appears twice, where "PTYPE" (parent type) has two entries -- F and M. How can I create new columns "Father" and "Mother" where there is a 1 or 0 entry? After this, how can I compress the data, such that each ID only appears one time? I would like to do the same thing with the variable "PNP," so please ignore that column for now.

Many thanks in advance,
Cora

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(ID68 PN) byte GID float(ID68P PNP) str2 PTYPE float ID
4  1 2 4 906 "F" 4001
4  1 2 4 907 "M" 4001
4  2 2 4 908 "F" 4002
4  2 2 4 909 "M" 4002
4  3 3 4   1 "F" 4003
4  3 3 4   2 "M" 4003
4  4 3 4   1 "F" 4004
4  4 3 4   2 "M" 4004
4  5 3 4   1 "F" 4005
4  5 3 4   2 "M" 4005
4  6 3 4   1 "F" 4006
4  6 3 4   2 "M" 4006
4  7 3 4   1 "F" 4007
4  7 3 4   2 "M" 4007
4  8 3 4   1 "F" 4008
4  8 3 4   2 "M" 4008
4 30 4 4   5 "M" 4030
end
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

26 Jan 2022, 13:47

I'm a bit confused by your description of what you want to do, but I think what you want is:

Code:

reshape wide PNP, i(ID) j(PTYPE) string

If that's not it, please post back with a mock-up of what you want the result to actually look like.
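
A minimal sketch of how this could play out on the example data from #1, assuming that data is already in memory. The variable names Father and Mother come from the question. Building them with egen before the reshape is only one possible reading of the request and is not confirmed anywhere in the thread.

```stata
* Sketch only: assumes the -dataex- example from #1 has been run.
* Father and Mother are illustrative names taken from the question.

* 1 if the ID has a row of that parent type, 0 otherwise.
* The result is constant within ID, so it survives the reshape.
bysort ID: egen byte Father = max(PTYPE == "F")
bysort ID: egen byte Mother = max(PTYPE == "M")

* One row per ID; PNP is split into PNPF (father) and PNPM (mother).
reshape wide PNP, i(ID) j(PTYPE) string

* Confirm that ID now uniquely identifies observations.
isid ID
list ID Father Mother PNPF PNPM, noobs
```

An ID with only one parent record, such as 4030 in the example, should end up with a missing value in the PNP variable for the absent parent and a 0 in the matching indicator.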
Cora Touchstone

Join Date: Jan 2022

Posts: 25
#3

28 Jan 2022, 14:18

Many thanks! Reshaping was what I needed.