Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Delete duplicates in a specific situation

    Hi all,

    I am trying to merge two datasets and Stata doesn't allow me because I have duplicate individuals. I'll go straight to the issue:

    My data looks like this:
    PersonID YearRole Role1_dummy Role2_dummy
    A 2015 0 1
    A 2013 1 0
    A 2014 1 0
    B 2013 1 0
    B 2017 0 1
    Note how person A has two role 1 in different years. In this case, I would only like to keep the most recent one (2014).

    What I want my data to look like is the following:
    PersonID Year Role1_dummy Role2_dummy
    A 2013 1 0
    A 2014 1 0
    A 2015 0 1
    A 2016 0 0
    A 2017 0 0
    B 2013 1 0
    B 2014 0 0
    B 2015 0 0
    B 2016 0 0
    B 2017 0 1
    As you can see, here I have 1 observation per person and year. When a person doesn't have either Role1 or Role 2 in a given year (see person A in 2016-2017), my 2 dummies (Role1_dummy and Role2_dummy) will have values of 0.


    How could I solve this?

    Thank you so much.

    Best,

    Carla


  • #2
    You don't properly present your datasets using dataex, and in fact you don't even give code as the FAQ asks you to, but if I understand the issue, this seems like problem that a many to 1 merge would solve.

    Comment


    • #3
      This seems to produce what you ask for.
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str1 personid int yearrole byte(role1_dummy role2_dummy)
      "A" 2015 0 1
      "A" 2013 1 0
      "A" 2014 1 0
      "B" 2013 1 0
      "B" 2017 0 1
      end
      rename yearrole year
      encode personid, generate(pid)
      drop personid
      order pid
      xtset pid year
      tsfill, full
      replace role1_dummy = 0 if missing(role1_dummy)
      replace role2_dummy = 0 if missing(role2_dummy)
      list, clean abbreviate(16)
      Code:
      . list, clean abbreviate(16)
      
             pid   year   role1_dummy   role2_dummy  
        1.     A   2013             1             0  
        2.     A   2014             1             0  
        3.     A   2015             0             1  
        4.     A   2016             0             0  
        5.     A   2017             0             0  
        6.     B   2013             1             0  
        7.     B   2014             0             0  
        8.     B   2015             0             0  
        9.     B   2016             0             0  
       10.     B   2017             0             1

      Comment

      Working...
      X