  • drop the top 1 percentile with regards to only the positive observations

    I want to drop the top 1 percent of the observations, computed with regard to the positive observations only. I am new to Stata, so this might be too simple for this forum, but I could not find an answer here. This is what I tried:

    set seed 630550
    drop if hour89mo > 55000
    summarize hour89mo if hour89mo > 0, detail
    keep if inrange( hour89mo , r(p1) , r(p99))

    However, this syntax does not keep the values that are zero.

    Thanks in advance




  • #2
    Your syntax does not keep the values that are zero because you are excluding them in the -summarize- command.
    Code:
    summarize hour89mo if hour89mo >= 0, detail



    • #3
      Welcome to Statalist.

      The output of help inrange() tells us that the second and third arguments are the lowest and highest values of the range. The highest value you want is the 99th percentile, which summarize returns in r(p99) when it is run with the detail option, but the lowest value should be 0, not the first percentile returned in r(p1). Try
      Code:
      keep if inrange( hour89mo , 0 , r(p99) )
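
      A side note on how inrange() treats missing values, since it matters here. As I read help inrange(), a missing first argument always gives 0, and a missing upper bound is treated as no upper limit. A minimal sketch you can paste into the Command window (the numbers are made up for illustration):

      ```stata
      * ordinary case: 5 lies between 0 and 10
      display inrange(5, 0, 10)
      * the value being tested is missing: never in range,
      * so keep if inrange() also drops missing observations
      display inrange(., 0, 10)
      * the upper bound is missing: treated as "no upper limit",
      * so nothing would be trimmed at the top
      display inrange(5, 0, .)
      ```

      So if r(p99) happens to be missing, for example because summarize was run without the detail option, keep if inrange( hour89mo , 0 , r(p99) ) would keep every nonnegative value rather than trimming the top.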



      • #4
        Thanks for the replies

        I've changed the syntax to
        keep if inrange( hour89mo , 0 , r(p99) )

        But does this drop the top 1 percent of only the positive values, excluding the zeros? In other words, does it drop the highest 1 percent computed with regard to the positive values only? That is what I want to accomplish.

        Thanks again



        • #5
          Needless to say, perhaps, dropping extreme values, and moreover singling out one variable from which to drop them, is often regarded as far from the best approach.
          Best regards,

          Marcos



          • #6
            Cross-posted on https://stackoverflow.com/questions/...e-observations -- except that the question there seems to be different and refers to dummy variables.

            1. We ask that you tell us about cross-posting.

            2. Far from the question being too simple, it is far from clear what it is.



            • #7
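              Code:
              summarize hour89mo if hour89mo > 0, detail
              keep if inrange( hour89mo , 0 , r(p99) )
              does what you need, i.e. it excludes the highest 1 percent computed with regard to the positive values only. Note that r(p99) is only returned when summarize is run with the detail option.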
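
              If you want to check the logic on toy data first, here is a self-contained sketch. The simulated hour89mo values and the scalar name cutoff are made up for illustration; only the summarize and drop steps correspond to your problem. It uses drop rather than keep if inrange(), which is my own choice: keep if inrange( hour89mo , 0 , r(p99) ) would also discard any missing or negative values, whereas this version removes only the values above the cutoff.

              ```stata
              clear
              set seed 630550
              set obs 1000
              * hypothetical stand-in for hour89mo: roughly 30% zeros, the rest positive
              generate double hour89mo = cond(runiform() < 0.3, 0, exp(rnormal(7, 1)))

              * percentiles are computed on the positive values only;
              * the detail option is what makes summarize return r(p99)
              summarize hour89mo if hour89mo > 0, detail
              scalar cutoff = r(p99)

              * see how many observations would be affected before dropping anything
              count if hour89mo > scalar(cutoff) & !missing(hour89mo)

              * drop only values above the cutoff; zeros and missing values are left alone
              * (missing counts as larger than any number in Stata, hence the extra condition)
              drop if hour89mo > scalar(cutoff) & !missing(hour89mo)
              ```

              Saving r(p99) in a scalar straight away is a precaution, since any later r-class command overwrites the stored results.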
