  • Using summary results such as r(mean) in the logical expression

    Hello, Statalist users,

    Using Stata 12.1, I am trying to use summary results in the following logical expression, which generates the error message "invalid syntax, r(198)".

    gen miss = missing(var)
    sum miss
    drop miss if (r(mean) == 0 & r(sd) == 0)

    How should I fix these lines of code? Any help would be highly appreciated.

    Thanks in advance.

  • #2
    You need to reference the values correctly. In this case you need to reference the value as a local macro. Other times results can be returned as scalars or matrices.
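
    A minimal sketch of the ways such results can be referenced, assuming Stata's shipped auto dataset and its variable rep78 (neither is part of the original question). After summarize, return list shows what was left behind in r(); each result can be used directly in an expression or copied into a local macro first:

    ```stata
    * Load an example dataset shipped with Stata (an assumption for illustration)
    sysuse auto, clear

    * summarize leaves its results in r()
    summarize rep78

    * List everything currently stored in r()
    return list

    * Use a saved result directly in an expression
    display r(mean)

    * Or copy it into a local macro and refer to the macro
    local m = r(mean)
    display `m'
    ```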


    • #3
      I see more than one error in your syntax. First, you try to summarise the percent of missing values of var; this is fine. However,
      1. You are trying to drop a variable based on some condition. This is illegal, since you can't drop a variable for some observations but keep it for others. You can, however, drop observations subject to a condition, or drop variables; read more in the help file for drop.
      2. Since I think what you would like to do is drop some observations, your condition is problematic. You are asking to drop an observation if the value of some variable is equal to the mean and the sd. The chance of finding an observation that satisfies this condition is almost nil, not only for a dummy variable but in any other situation. For a dummy variable (0,1) you won't find any observation with a value equal to the mean (and/or the sd), so even if you write the syntax correctly you won't get any result from it.


      • #4
        Following on Oded's analysis:

        The first statement is loose: nothing in the code yields percent of missings directly, although it is a step towards that.

        I agree with point 1. This is actually the problem here. You can drop variables, or drop observations using if or in, but not both at the same time. This is a matter of the syntax of drop and has nothing at all to do with using saved results.
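
        A small sketch of the two legal forms of drop, assuming the auto dataset shipped with Stata and its variables make and rep78 (not part of the original question):

        ```stata
        * Example data shipped with Stata (an assumption for illustration)
        sysuse auto, clear

        * Form 1: drop a variable (no if or in allowed)
        drop make

        * Form 2: drop observations that satisfy a condition (no variable list)
        drop if missing(rep78)

        * Mixing the two forms, as in "drop rep78 if missing(rep78)", is a syntax error
        ```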

        I guess differently on point 2. The intent seems otherwise. The function missing() returns 1 or 0 according as its arguments are or are not missing. The mean of the results is 0 if and only if all values are non-missing. In that case the sd of the results is also 0, so that condition is superfluous. So, the intent appears to be to write

        Code:
        if r(mean) == 0 drop miss

        presumably because if none of the values is missing then there is no need to have an indicator variable for missing.
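
        To see that logic at work, here is a sketch that assumes the auto dataset shipped with Stata, where price has no missing values and rep78 has some. The variable names miss_price and miss_rep78 are made up for the illustration:

        ```stata
        * Example data shipped with Stata (an assumption for illustration)
        sysuse auto, clear

        * price has no missing values, so the indicator has mean 0 and is dropped
        generate miss_price = missing(price)
        summarize miss_price
        if r(mean) == 0 drop miss_price

        * rep78 has some missing values, so the mean is positive and the indicator stays
        generate miss_rep78 = missing(rep78)
        summarize miss_rep78
        if r(mean) == 0 drop miss_rep78
        ```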

        There are better ways to proceed here. For example, you could count missings and then create an indicator variable if and only if there are some missings:

        Code:
        count if missing(var)
        if r(N) > 0 gen miss = missing(var)

        However, I am speculating.
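
        If the same check is wanted for several variables, the counting approach can be wrapped in a loop. This is only a sketch, assuming the auto dataset shipped with Stata; the variable list and the miss_ prefix are made up for the illustration:

        ```stata
        * Example data shipped with Stata (an assumption for illustration)
        sysuse auto, clear

        foreach v of varlist price rep78 {
            * count leaves the number of matching observations in r(N)
            count if missing(`v')
            * create an indicator only if this variable has some missing values
            if r(N) > 0 generate miss_`v' = missing(`v')
        }
        ```

        Here only miss_rep78 should be created, because price has no missing values in that dataset.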

        Note also that findname (SJ) lets you find variables with none missing, some missing and all missing values.





        • #5
          First, thank you very much for all the help!

          In fact, the code provided by Nick works perfectly for me! Thank you so much for sharing your knowledge, Nick!

          Noelle
