Duplicates drop with prioritization on one value of a variable

Adrij Chakraborty

Join Date: Oct 2023

Posts: 16
#1

06 Mar 2024, 09:34

Hi, I am new to Statalist, so let me know if I have missed out on any submission protocols that make it easier for people to understand and respond to my query.

I am trying to remove duplicate observations with 'duplicates drop'. I know that the command keeps only the first observation in each group of duplicates. Here, however, I want to retain one particular observation that is not the first occurrence in its group and drop the others. Is that possible with 'duplicates drop'?

I'm using Stata 18.

Last edited by Adrij Chakraborty; 06 Mar 2024, 09:41.
Ken Chui

Join Date: Aug 2014

Posts: 1058
#2

06 Mar 2024, 09:41

You can create an indicator variable that flags observations where erwstat is not 11, sort on it within the duplicate groups, and then apply duplicates drop. Observations with erwstat equal to 11 receive a 0 in the not_11 indicator, so they sort to the top of their group and are the ones retained:

Code:

gen not_11 = (erwstat != 11)
gsort persnr_siab begorig endorig not_11
duplicates drop persnr_siab begorig endorig, force
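
For anyone who wants to try the idea on a small example first, here is a minimal sketch. The toy data values are made up purely for illustration; only the variable names come from the code above. It also uses a different but equivalent route, bysort with keep instead of gsort with duplicates drop, so treat it as one possible variant rather than the method recommended in this thread.

```stata
* Toy data: all values are hypothetical, for illustration only
clear
input persnr_siab begorig endorig erwstat
1 100 200 5
1 100 200 11
1 100 200 7
2 300 400 3
2 300 400 3
3 500 600 11
end

* Indicator: 0 for the preferred value (erwstat == 11), 1 otherwise
generate byte not_11 = (erwstat != 11)

* Within each duplicate group, order by not_11 so that any erwstat == 11
* observation comes first, then keep only the first observation per group
bysort persnr_siab begorig endorig (not_11): keep if _n == 1

* Check that the key variables now uniquely identify observations
isid persnr_siab begorig endorig

list, noobs
```

If a group contains no observation with erwstat equal to 11, or more than one, which of the tied observations survives is not determined by this sort, so add further variables inside the parentheses if that choice matters.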

Last edited by Ken Chui; 06 Mar 2024, 09:43.
Adrij Chakraborty

Join Date: Oct 2023

Posts: 16
#3

06 Mar 2024, 10:06

Thanks Ken, it worked wonderfully.