  • Duplicates drop with prioritization on one value of a variable

    Hi, I am new to Statalist, so let me know if I have missed any posting conventions that would make it easier for people to understand and respond to my query.

    I am trying to remove duplicates with 'duplicates drop'. I know that the command keeps only the first observation in each group of duplicates. Here, though, I want to retain one particular observation that is not the first occurrence in its group and drop the others. Is that possible with 'duplicates drop'?

    I'm using Stata 18.
    Last edited by Adrij Chakraborty; 06 Mar 2024, 10:41.

  • #2
    You can create an indicator that flags observations where erwstat is not 11, sort on it within the duplicate keys, and then apply duplicates drop. Observations with erwstat equal to 11 get a 0 in the "not_11" indicator, so they sort to the top of their group and are the ones retained:

    Code:
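    * Indicator: 0 if erwstat is 11 (the row to keep), 1 otherwise
    gen not_11 = (erwstat != 11)
    * Within each duplicate group, sort the erwstat == 11 row first
    gsort persnr_siab begorig endorig not_11
    * duplicates drop keeps the first observation in each group
    duplicates drop persnr_siab begorig endorig, force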
    Last edited by Ken Chui; 06 Mar 2024, 10:43.
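
For readers who would rather make the retained row explicit, one possible alternative is to order each group with bysort and keep its first row, instead of relying on duplicates drop. This is only a sketch: the toy data below are made up for illustration, and only the variable names come from the code in #2.

```stata
* Toy data: values are invented for illustration only
clear
input long persnr_siab int begorig int endorig int erwstat
1 100 200 5
1 100 200 11
1 100 200 7
2 150 250 3
2 150 250 3
3 300 400 11
end

* Indicator: 0 if erwstat is 11 (preferred row), 1 otherwise
generate byte not_11 = (erwstat != 11)

* Group on the duplicate keys, order by not_11 within each group,
* and keep only the first row of each group
bysort persnr_siab begorig endorig (not_11): keep if _n == 1

list, sepby(persnr_siab)
```

If a group has no row with erwstat equal to 11, or has several, the row that survives is arbitrary among the tied rows unless further variables are added inside the parentheses to break the tie.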

    • #3
      Thanks Ken, it worked wonderfully.
