  • Removing duplicate IDs by creating additional variables

    Hi All,

    I would like to restructure this dataset so that each ID appears only once, with additional columns representing "Father" and "Mother". For example, ID 4001 appears twice because "PTYPE" (parent type) has two entries, F and M. How can I create new columns "Father" and "Mother" that each hold a 1 or 0? After that, how can I collapse the data so that each ID appears only once? I would like to do the same thing with the variable "PNP", so please ignore that column for now.

    Many thanks in advance,
    Cora


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(ID68 PN) byte GID float(ID68P PNP) str2 PTYPE float ID
    4  1 2 4 906 "F" 4001
    4  1 2 4 907 "M" 4001
    4  2 2 4 908 "F" 4002
    4  2 2 4 909 "M" 4002
    4  3 3 4   1 "F" 4003
    4  3 3 4   2 "M" 4003
    4  4 3 4   1 "F" 4004
    4  4 3 4   2 "M" 4004
    4  5 3 4   1 "F" 4005
    4  5 3 4   2 "M" 4005
    4  6 3 4   1 "F" 4006
    4  6 3 4   2 "M" 4006
    4  7 3 4   1 "F" 4007
    4  7 3 4   2 "M" 4007
    4  8 3 4   1 "F" 4008
    4  8 3 4   2 "M" 4008
    4 30 4 4   5 "M" 4030
    end

  • #2
    I'm a bit confused by your description of what you want to do, but I think what you want is:
    Code:
    reshape wide PNP, i(ID) j(PTYPE) string
    If that's not it, please post back with a mock-up of what you want the result to actually look like.
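
In case the 0/1 "Father" and "Mother" columns from #1 are still wanted after the reshape, here is a minimal sketch, not a definitive solution. It uses three rows from the example data in #1 and assumes that PNP is never missing in the long data, so a non-missing PNPF or PNPM means a record of that parent type existed. The names Father and Mother are taken from the question; PNPF and PNPM are the names -reshape- builds from the stub PNP and the values of PTYPE.

```stata
* Three rows from the -dataex- example in #1
clear
input float(ID68 PN) byte GID float(ID68P PNP) str2 PTYPE float ID
4  1 2 4 906 "F" 4001
4  1 2 4 907 "M" 4001
4 30 4 4   5 "M" 4030
end

* One observation per ID; PNP is spread into PNPF and PNPM
reshape wide PNP, i(ID) j(PTYPE) string

* Assumption: PNP is never missing in the long data, so a missing
* PNPF/PNPM means no record of that parent type existed for the ID
generate byte Father = !missing(PNPF)
generate byte Mother = !missing(PNPM)

* Stops with an error if ID does not uniquely identify observations
isid ID
list ID PNPF PNPM Father Mother
```

ID 4030 has only an "M" record in the example, so it should end up with Father equal to 0 and Mother equal to 1.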


    • #3
      Many thanks! Reshaping was what I needed.
