  • How to delete rows in Stata where a value appears fewer than 2 times

    Hi, how do I delete all rows whose value appears only once?
    For example, in the table below, in the hhid column, 3 and 14 each appear twice, but 20 and 23 appear only once. Is there a command that can help me with that?
    Attached Files

  • #2
    Well, it is easy enough to remove one of the duplicates, but look at the other variables. For the two observations with hhid == 3, the values of the other variables differ. The same is true for the two observations with hhid == 14, or 24, or, apparently, pretty much all of them. So what do you want to do with those other variables? Which observation of each pair do you want to keep? Or do you want to combine the values of the other variables in some way? What would the result you want look like?



    • #3
      I think OP wants to remove single values, not duplicates. So, if I understand correctly, OP would like to remove 20 and 23 because they appear only once. I don't have access to Stata at the moment, but maybe something along these lines:

      Code:
      bysort hhid: drop if _N == 1
      Or if that doesn't work I think something like this should:

      Code:
      bysort hhid: gen count = _N
      drop if count == 1
      If I understand correctly...



      • #4
        Ah, I think Daniel Schaefer is right, and I misread the original post. Sorry for any confusion I have caused.



        • #5
          Yep, that is absolutely what I want. Thank you very much for all your help.



          • #6
            Code:
            bysort hhid: drop if _N == 1
            should work too.
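
            For anyone who wants to try this without the attached data, here is a minimal sketch on a toy dataset. The hhid values mirror the ones mentioned in #1; the variable value and its numbers are made up purely for illustration.

            ```stata
            clear
            input hhid value
            3 10
            3 12
            14 7
            14 9
            20 5
            23 8
            end

            * Under bysort, _N is the number of observations in the current hhid group,
            * so _N == 1 flags households that appear only once
            bysort hhid: drop if _N == 1

            list, sepby(hhid)
            ```

            Only the hhid 3 and hhid 14 rows should remain. An alternative that should give the same result is duplicates tag hhid, generate(dup) followed by drop if dup == 0, since dup counts the extra copies of each hhid.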
