  • Issue with Missing Value Check for Different Variable Types

    Hello,

    I'm facing an issue with tagging observations with missing values in a dataset containing both numerical and string variables. Variables:

    • _304753 (double)
    • _304754 (double)
    • _304755 (long)
    • _306872 (str14)
    • _309592 (double)
    • _900420 (long)
    • _900565 (long)
    I need to tag observations as "void" if any of these variables have missing values. Here is my code:

    Code Used for Transformation

    Code:
    gen void = 0
    local bob _304753 _304754 _304755 _309592 _900420 _900565
    foreach var in `bob' {
        replace void = 1 if missing(`var')
    }
    replace void = 1 if _306872 == ""
    Diagnostics Attempted

    • Using missing() function:
    Code:
    replace void = 1 if missing(_304753) & missing(_304754) & missing(_304755) & _306872 == "" & missing(_309592) & missing(_900420) & missing(_900565)
    Result: No missing values considered missing.
    • Using == . for numerical variables:
    Code:
    replace void = 1 if _304753 == . & _304754 == . & _304755 == . & _306872 == "" & _309592 == . & _900420 == . & _900565 == .
    Result: Syntax error.
    I need to correctly tag observations where all specified variables are missing to remove duplicates and keep valid information for analysis. Any suggestions?

    Thank you!

    Best regards,
    Sami

  • #2
    I can't tell what you are trying to do here. Your first attempted code, using a loop, will mark void = 1 if any of those variables has a missing value. The other two will mark void = 1 only if all of those variables have missing values. And on my setup, using a made-up data set that meets your description, all three of them run without any error messages.

    I will also note that there is no need to exclude _306872 from the loop in the first one: it will work just fine even though _306872 is string and the others are numeric. Stata's -missing()- function accepts both types and treats "" as missing for a string.
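
    A minimal sketch of that point, using a made-up three-observation data set rather than the real one (the variables x, y, and s are hypothetical stand-ins for the numeric and string variables above):

    ```stata
    * Toy data: x and y are numeric, s is a string (hypothetical names)
    clear
    input double x long y str14 s
    1 2 "a"
    . 3 "b"
    4 5 ""
    end

    * One loop covers both types: missing() is true for ., .a-.z, and ""
    gen void = 0
    foreach var of varlist x y s {
        replace void = 1 if missing(`var')
    }
    list
    ```

    Here void should end up 0 for the first observation and 1 for the other two, because each of those has at least one missing value.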

    Also, for the second and third attempts, it is not necessary to start with -gen void = 0- and then -replace void = 1...- You can just do
    Code:
    gen void = (_304753 == . & _304754 == . & _304755 == . & _306872 == "" ///
        & _309592 == . & _900420 == . & _900565 == .)
    (and analogously for the version using the -missing()- function.)

    So I think you just need to be clear on whether you want void = 1 when any of the variables is missing or when all of them are missing and then just go with one of these existing attempts.
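
    To make the any-versus-all distinction concrete, here is a small sketch on made-up data (x, y, and s are hypothetical stand-ins for your variables, and void_any and void_all are names chosen only for illustration). It relies on -missing()- accepting several arguments of either type and returning 1 if any of them is missing:

    ```stata
    * Toy data: x and y are numeric, s is a string (hypothetical names)
    clear
    input double x long y str14 s
    1 2 "a"
    . 3 "b"
    . . ""
    end

    * "Any" version: a single call with several arguments
    gen void_any = missing(x, y, s)

    * "All" version: every variable must be missing
    gen void_all = missing(x) & missing(y) & missing(s)
    list
    ```

    In this toy data the second observation is flagged by void_any but not by void_all, and only the third is flagged by both.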


    • #3
      See also

      Code:
      help egen
      for its rowmiss() and rownonmiss() functions.
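
      A rough sketch of how that could look, on made-up data (x, y, and s are hypothetical stand-ins for the variables in #1, and the new variable names are only illustrative). As far as I know, rowmiss() takes string variables directly, while rownonmiss() needs the strok option to accept them:

      ```stata
      * Toy data: x and y are numeric, s is a string (hypothetical names)
      clear
      input double x long y str14 s
      1 2 "a"
      . 3 "b"
      . . ""
      end

      * Count missing and nonmissing values across each row
      egen nmiss = rowmiss(x y s)
      egen nnonmiss = rownonmiss(x y s), strok

      * Tag rows with at least one missing value, or with nothing nonmissing
      gen void_any = nmiss > 0
      gen void_all = nnonmiss == 0
      list
      ```

      Counting first and tagging afterwards keeps the any/all choice in one visible place, which helps when the variable list is long.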
