How to delete rows in Stata where a value appears fewer than 2 times

Tin Cheng

Join Date: Mar 2024

Posts: 11
#1

31 Mar 2024, 16:06

Hi, how can I delete all the rows in which the number appears only once?
For example, in the table below, in the hhid column, 3 and 14 appear twice, but 20 and 23 appear only once. Is there any command that can help me with that?
Attached Files
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

31 Mar 2024, 17:48

Well, it is easy enough to remove one of the duplicates, but look at the other variables. For the two observations with hhid == 3, the values of the other variables differ. The same is true for the two observations with hhid == 14, or 24, or, apparently, pretty much all of them. So what do you want to do with those other variables? Which observation of each pair do you want to keep? Or do you want to combine the values of the other variables in some way? What would the result you want look like?
Daniel Schaefer

Join Date: Mar 2020

Posts: 806
#3

31 Mar 2024, 17:55

I think OP wants to remove single values, not duplicates. So, if I understand correctly, OP would like to remove 20 and 23 because they appear only once. I don't have access to Stata at the moment, but maybe something along these lines:

Code:

bysort hhid: drop if _N == 1

Or if that doesn't work I think something like this should:

Code:

bysort hhid: gen count = _N
drop if count == 1

If I understand correctly...
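To make the behaviour concrete, here is a minimal sketch on made-up data. The hhid values mirror the ones mentioned in the question, but the variable `value` and its contents are hypothetical, since the attached table is not reproduced here. Within a `bysort hhid:` group, `_N` is the number of observations in that group, so `_N == 1` picks out the hhid values that appear only once.

```stata
clear
* Hypothetical toy data: hhid 3 and 14 appear twice, 20 and 23 once
input hhid value
3  10
3  11
14 20
14 21
20 30
23 40
end

* Sort by hhid, then drop every group that has exactly one observation
bysort hhid: drop if _N == 1

* Only the observations for hhid 3 and 14 should remain
assert _N == 4
list, sepby(hhid)
```

The two-step variant with `gen count = _N` should give the same result, but it leaves the extra variable `count` in the dataset, which may or may not be wanted.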
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#4

31 Mar 2024, 18:13

Ah, I think Daniel Schaefer is right, and I misread the original post. Sorry for any confusion I have caused.
Tin Cheng

Join Date: Mar 2024

Posts: 11
#5

01 Apr 2024, 04:24

Yep, that is absolutely what I want. Thank you very much for all your help.
Nick Cox

Join Date: Mar 2014

Posts: 35211
#6

01 Apr 2024, 04:38

Code:

bysort hhid: drop if _N == 1

should work too.