  • Drop the values of a variable with if condition

    Hello everyone,

    I have a variable named "diversity" (and a bunch of other variables), and I would like to drop the values of "diversity" that are less than 0, without dropping the values of the other variables.

    I tried the command below, but I get an "invalid syntax" error.

    command:
    drop diversity if diversity<0

    Could someone please advise on how to perform this?

    Thanks,
    Ama

  • #2
    You can set the values to missing. That effectively drops the value of the variable for that observation while allowing you to keep the other variables for that observation.
    Code:
    replace diversity = . if diversity < 0
    Alternatively, consider keeping the original variable intact in your dataset and creating a cleaned-up version of the variable for use:
    Code:
    generate double diversity_clean = diversity if diversity >= 0
    Or you can just add the -if- condition to each command that uses the variable, for example:
    Code:
    tabulate diversity if diversity >= 0
    but you'll need to remember to do this.


    • #3
      Thank you very much for the help.


      • #4
        @Ama Perera Hi Ama, #2 is right. One more point: I think you have mistaken how the -drop- command is used. If you want to drop the values of the variable "diversity" when diversity is less than 0, the right code is:

        Code:
        drop if diversity<0

        Kind regards.
        Raymond
        Last edited by Raymond Zhang; 25 Jan 2021, 01:15.
        Best regards.

        Raymond Zhang
        Stata 17.0,MP


        • #5
          In Stata, variables are the columns and observations are the rows. One cannot drop a row/observation for only some variables: if we drop a row, the row is gone for all variables/columns. So what OP literally asked for is physically impossible.

          Joseph's solution sets certain values of the variable to missing. What Raymond shows is completely different: it drops the whole row/observation for all the variables/columns.
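
          To make the contrast concrete, here is a minimal sketch on a toy dataset. The variables "id" and "income" and all the values are made up for illustration and are not from OP's data; it assumes the code is run from a do-file or typed line by line.

          ```stata
          * Toy data: id and income are hypothetical, only diversity is from the thread
          clear
          input id diversity income
          1  0.4 100
          2 -0.2 200
          3  0.7 300
          end

          * Approach from #2: set negative values to missing
          preserve
          replace diversity = . if diversity < 0
          * All three observations remain; one now has a missing diversity
          count
          count if missing(diversity)
          list
          restore

          * Approach from #4: drop the whole observation
          drop if diversity < 0
          * Only two observations remain; id and income for row 2 are gone as well
          count
          list
          ```

          With -replace-, the row with id 2 keeps its income value and only diversity becomes missing. With -drop if-, that row disappears for every variable.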


          • #6
            @Joro Kolev Thank you for the clarification. Yes, indeed: my solution does not do what OP needs, because it also drops the observations for all the other variables. Here I only wanted to explain why OP's code gets the "invalid syntax" error. Just a reminder.

            Best.
            Raymond
