  • Removing Cases Under Threshold Value

    Hello,

    I'm new to Stata so forgive my ignorance. I have searched the forums for this answer but cannot find it.

    I am cleaning a data set with Stata. I have a categorical variable "cal". I want to remove any categories that have fewer than 15 observations. In other words, if 85 respondents selected "guitar" but only 4 respondents selected "flute", I'd want to remove all cases that responded "flute" but keep the cases that selected "guitar".

    How do I do this?

    Thanks in advance.

  • #2
    I have tried the code:

    drop if cal < 15

    But that obviously tests the variable's value, not the frequency within a category. How do I specify that the 15 refers to a count or frequency?


    • #3
      Perhaps this will start you on a useful path.
      Code:
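      * store the size of each cal group in every observation of that group
      by cal, sort: generate count = _N
      * drop all observations belonging to groups with fewer than 15 cases
      drop if count < 15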
      The by prefix causes the command to be run separately for each group of observations with a common value of cal. While _N is usually the number of observations in a dataset, with the by prefix it becomes the number of observations in the group.
      Last edited by William Lisowski; 20 Jun 2020, 20:23.
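
      Because drop if can itself be combined with the by prefix, the same idea can be sketched without creating a helper variable. The example below is only an illustration: it assumes Stata's bundled auto dataset is available, and rep78 is a stand-in for cal, not the poster's data.

      ```stata
      * Illustration only: rep78 in the shipped auto dataset stands in for cal
      sysuse auto, clear

      * Category sizes before dropping anything (missing shown as its own group)
      tabulate rep78, missing

      * Within each rep78 group, _N is the group size, so this removes
      * every observation belonging to a group with fewer than 15 cases
      bysort rep78: drop if _N < 15

      * Category sizes afterwards: only groups with 15 or more cases remain
      tabulate rep78, missing
      ```

      Note that observations with a missing value of the grouping variable form a group of their own, so they are dropped too if there are fewer than 15 of them.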


      • #4
        Took care of it. Thanks William!
