Dear All,
I have a dataset with varying numbers of duplicates (sorted by Applicant Name), i.e. some people appear 5 times, some 3, some only once, etc. I used gen dup = cond(_N==1,0,_n) to number the duplicates and have sorted them by the date they appear. Now I want to keep only the first and last appearance of each person, so I can subtract the first date from the last and get the duration between appearances.
How can I keep only the first and last duplicates? This would be much easier if everyone appeared the same number of times, but they don't.
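To make the goal concrete, here is a minimal sketch of one approach that might work. It assumes the name variable is called applicant_name and the date variable is called app_date, stored as a numeric Stata date; both names are placeholders, not the actual variable names in my data.

```stata
* Hypothetical variable names: applicant_name, app_date (numeric Stata date)
* Within each applicant, sort by date; _n is the position within the group
* and _N is the group size, so this keeps the earliest and latest rows.
* Applicants who appear only once keep their single row.
bysort applicant_name (app_date): keep if _n == 1 | _n == _N

* Duration between first and last appearance, in the units of app_date.
* Applicants with a single appearance get a duration of 0.
bysort applicant_name (app_date): gen duration = app_date[_N] - app_date[1]
```

Because the condition uses _n and _N within each by-group, it should not matter that the number of appearances differs from person to person.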
Any help appreciated!