Using assert with missing observations

Cynthia Inglesias

Join Date: Aug 2015

Posts: 86
#1


29 May 2016, 09:33

Dear all

Suppose I have a dataset with 5 variables. var1 and var2 have 100 observations, while var3, var4 and var5 look like this:

Code:

var3  var4  var5
  54    12     .
  56    15   167
  89    17   190
  34    18   198
   .     .     .
   .     .     .
   .     .     .
   .     .     .

I am trying to use assert to see if a condition is satisfied, but I do not get the desired result. I have tried both with and without !missing():

Code:

assert var4 < var4[_n+1] & var5 > var5[_n+1]
assert var4 < !mi(var4[_n+1]) & !mi(var5) > !mi(var5[_n+1])

The assertion ought to be true, but I suspect the missing values are interfering.

How can I get the right answer in the above example?
Cyrus Levy

Join Date: Nov 2014

Posts: 99
#2

29 May 2016, 09:55

You can try using if conditions. Something like this:

assert var4 < var4[_n+1] if var4!=.

If the assertion is true, this produces no output. That's how the command is designed.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35437
#3

29 May 2016, 09:57

The function missing() or mi() returns 1 or 0 depending on whether its argument is missing or not missing; its negation therefore returns 0 or 1 respectively. The results of those evaluations, 0 or 1, are being used in your syntax.

I guess your problem is not so much "missing values interfering", whatever that means to you, but rather misunderstanding the syntax radically. You seem to be thinking that !mi() is a wrapper that instructs Stata to ignore the missing values. Not so; as said: it's the job of mi() to report whether its argument is missing.
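
A minimal sketch of that point, using literal values rather than the data in the question (the numbers are illustrative only). It shows that !mi() yields 0 or 1, so a comparison against it compares with 0 or 1 and not with the underlying value:

```stata
* missing() reports whether its argument is missing: displays 1
display missing(.)

* The negation flips that report: displays 0
display !missing(.)

* For a nonmissing argument the negation is 1: displays 1
display !missing(18)

* So this compares 12 with 1, not with 15: displays 0
display 12 < !missing(15)
```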

You may want something more like

Code:

assert var4 < var4[_n+1] & var5 > var5[_n+1] if !missing(var4, var5)

as it's the job of the if qualifier to specify whatever you want to use, or to ignore.

However, I am guessing everywhere, as nowhere do you tell us precisely what you are looking for, just testing "a condition".

Several of your posts show similar problems!

Last edited by Nick Cox; 29 May 2016, 10:38.
Comment
Cynthia Inglesias

Join Date: Aug 2015

Posts: 86
#4

29 May 2016, 10:14

Thank you both for taking the time to reply. Nick, you are correct that I had misunderstood the syntax of assert, and I now see your point. But we all learn by doing.

Apologies if I have not been clear. The condition that I am trying to test is that each observation of var4 is smaller than the one immediately after, while at the same time the opposite is true for var5 (each observation is larger than the one immediately after).
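
The condition described above could be sketched as follows, as one possible reading rather than a definitive answer. The data values are hypothetical and chosen so that the condition holds. The sketch assumes that a pair should be compared only when both the current and the next observation are nonmissing; var4[_n+1] and var5[_n+1] are missing in the last nonmissing row, and Stata treats missing as larger than any number, so that pair is excluded explicitly.

```stata
* Toy data (hypothetical values, not the data from the question)
clear
input var4 var5
12 198
15 190
17 167
 .   .
 .   .
end

* Compare each observation with the next only where both are nonmissing.
* The /// continuation works in a do-file, not at the interactive prompt.
assert var4 < var4[_n+1] & var5 > var5[_n+1] ///
    if !missing(var4, var5, var4[_n+1], var5[_n+1])
```

With these values the command is silent, which is how assert signals a true assertion. Dropping the two subscripted arguments from missing() would make the third row fail, because 167 > . is false.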

Last edited by Cynthia Inglesias; 29 May 2016, 10:16.
Comment
Cynthia Inglesias

Join Date: Aug 2015

Posts: 86
#5

29 May 2016, 10:36

On a related question: if the assertion is false, can Stata repeat a block of code as many times as needed until the assertion becomes true? For example, I calculate var5, then check whether each observation is smaller than the next. If this is false, I drop the relevant observation and recalculate var5. Then I check the assertion again; if it is true the process stops, and if not it repeats.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35437
#6

29 May 2016, 10:40

Stata can do that, but not with this code. Here the assert condition is tested for every observation. You need something else, and it sounds like a loop over observations. You wouldn't use assert but an if command.
Comment
Cynthia Inglesias

Join Date: Aug 2015

Posts: 86
#7

29 May 2016, 10:55

Thanks for the pointers, Nick. It sounds complicated, but I will give it a try as I need to do this. I will share the solution here if I manage to pull this off.
Comment
Cyrus Levy

Join Date: Nov 2014

Posts: 99
#8

29 May 2016, 11:14

You can fine-tune something like this in a do-file:

local i = 0
while `i' == 0 {
    gen var5 = whatever-it-is
    capture assert var5 > var5[_n+1] if var5!=.
    if _rc==9 {
        keep if var5 > var5[_n+1]
        drop var5
    }
    else local i = 1
}
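
A runnable variant of the same idea, offered as a sketch under stated assumptions and not as the method intended in the thread. The variable x, its values, and the rule "drop each observation that is not larger than its successor" are all hypothetical, and the recalculation step is left out because the thread does not say what it is. Pairs whose next value is missing are excluded, since x[_n+1] is missing in the last observation and missing compares as larger than any number.

```stata
* Toy data (hypothetical)
clear
input x
5
3
4
2
1
end

local done = 0
while !`done' {
    * Test only pairs in which both values are nonmissing
    capture assert x > x[_n+1] if !missing(x, x[_n+1])
    if _rc == 9 {
        * Return code 9 means the assertion is false:
        * drop the offending observations and test again
        drop if x <= x[_n+1] & !missing(x, x[_n+1])
    }
    else local done = 1
}

list
```

Here the first pass drops the observation with x equal to 3, and the second pass finds the remaining values 5, 4, 2, 1 strictly decreasing, so the loop stops with four observations.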
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35437
#9

29 May 2016, 11:22

Note that Cyrus's code is not a loop over observations, but recomputes the whole of the variable every time around the loop. Although it's rare of me to recommend a loop over observations, on the information that Cynthia has provided she needs something more like this:

Code:

local Nm1 = _N - 1
forval j = 1/`Nm1' {
    if var5[`j'] > var5[`j'+1] {
        drop in <wherever>
        replace var5 = <whatever> in <wherever>
    }
}

This is necessarily vague and incomplete as I don't understand either which is the "relevant observation" (this one or the next?) or what "recalculate" means.

It's entirely possible that there is a much better way of approaching the problem.
Comment
Cyrus Levy

Join Date: Nov 2014

Posts: 99
#10

29 May 2016, 11:25

Agreed: Probably there is a much better way of approaching the problem. It is hard to guess with incomplete information.
Comment
Cynthia Inglesias

Join Date: Aug 2015

Posts: 86
#11

30 May 2016, 08:04

Thank you both very much for your suggestions. The examples I provide here are simple versions of what I am trying to do. The aim is for me to get the general picture so I can then apply these in more specific and complicated environments.

That said, Cyrus's approach is simple and appears to do the trick. But Nick, you are (as usual) right that this is not very efficient. For few variables and observations it will be OK, I think. I am now trying to do the same by looping over observations, as my goal is to learn how to do it efficiently too.
Comment
