Logic for the behavior of inrange(z,a,b)

Malcolm Wardlaw

24 Oct 2017, 13:10

I have a question about the logic behind the behavior of inrange(z,a,b) when a, b, or z is missing. My question is just out of curiosity.

The manual states the following:

The following ordered rules apply:
z > . returns 0.
a > . and b = . returns 1.
a > . returns 1 if z < b; otherwise, it returns 0.
b > . returns 1 if a < z; otherwise, it returns 0.
Otherwise, 1 is returned if a < z < b.
If the arguments are strings, "." is interpreted as "".

This seems like very unusual behavior. Why does it follow these rules? Is it a consequence of keeping the implementation simple and fast, or was it designed to produce these outcomes in support of the Boolean conditions in some common workflow? If it's the latter, what are those conditions or that workflow?
Clyde Schechter

24 Oct 2017, 14:52

Well, only the people at StataCorp know for sure. But here's my guess. As you know, missing values in Stata are treated as being greater than any non-missing value by the <, <=, >, >= operators and the -sort- command. But as a result, you can't test for whether a variable is, say, >= 2 by saying:

Code:

whatever if z >= 2

if you don't want to include missing values of z. So, before -inrange()- came along, we always had to code this sort of thing as:

Code:

whatever if z >= 2 & !missing(z)

or some equivalent to that. That is cumbersome because, in practice, people who want to act on values of z >= 2 usually don't want to act on missing values of z as well. The logic of -inrange()- gives you the convenience of expressing the desired range of values without having to tack on an extra clause about missings. See:

Code:

. clear

. input float z

             z
  1. 0
  2. 1
  3. 2
  4. 3
  5. 4
  6. .
  7. end

. count if z >= 2
  4

. count if inrange(z, 2, .)
  3
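
The same point can be checked observation by observation. The following is a minimal sketch, assuming the same six-observation dataset as above; it only confirms that -inrange(z, 2, .)- agrees with the explicit -z >= 2 & !missing(z)- test and is not meant as anything more than an illustration.

```stata
clear
input float z
0
1
2
3
4
.
end

* Both forms should count the three nonmissing values that are >= 2
count if z >= 2 & !missing(z)
assert r(N) == 3
count if inrange(z, 2, .)
assert r(N) == 3

* Row-by-row equivalence, including the observation where z is missing
assert inrange(z, 2, .) == (z >= 2 & !missing(z))
```

If the two expressions ever disagreed, -assert- would stop with "assertion is false".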
Robert Picard

24 Oct 2017, 15:47

Something was lost in the copying of the help inrange() entry. Here's what it should look like:

inrange(z,a,b)
Description: 1 if it is known that a ≤ z ≤ b; otherwise, 0

The following ordered rules apply:
z ≥ . returns 0.
a ≥ . and b = . returns 1.
a ≥ . returns 1 if z ≤ b; otherwise, it returns 0.
b ≥ . returns 1 if a ≤ z; otherwise, it returns 0.
Otherwise, 1 is returned if a ≤ z ≤ b.

In terms of notation, . indicates a Stata system missing value. Stata has an additional 26 extended missing values ordered as:

Code:

all nonmissing numbers < . < .a < .b < ... < .z

so the help entry uses ≥ . to indicate any missing value.

The inrange() function follows the rules for real intervals specified by endpoints, where a missing bound leaves that side of the interval open (unbounded). The first ordered rule states that if z is missing, it is not a number, so the function returns false whatever the bounds. If z is not missing, the second rule states that if both bounds are missing, the function returns true. The next two rules cover intervals that are unbounded on the left or on the right. The final rule restates what you would expect for a bounded interval.
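
As an illustration of the ordered rules, here is a small sketch with one or two calls per rule. The numeric values are arbitrary choices for the example, and the comment above each call states the result the help entry's rules imply.

```stata
* Rule 1: a missing z is never in range -> 0 in both cases
display inrange(., 1, 3)
display inrange(., ., .)

* Rule 2: both bounds missing, z nonmissing -> 1
display inrange(2, ., .)

* Rule 3: lower bound missing, test reduces to z <= b -> 1, then 0
display inrange(2, ., 3)
display inrange(4, ., 3)

* Rule 4: upper bound missing, test reduces to a <= z -> 1, then 0
display inrange(2, 1, .)
display inrange(0, 1, .)

* Rule 5: ordinary bounded interval, endpoints included -> 1, 1, then 0
display inrange(1, 1, 3)
display inrange(3, 1, 3)
display inrange(3.5, 1, 3)
```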

Careful readers will note that the second "ordered rule" should be a ≥ . and b ≥ . returns 1. Here's an example:

Code:

. dis inrange(1, .c, .b)
1
Chris Boulis

21 Mar 2020, 19:17

Thanks Robert Picard. I found that helpful.
Simon Turner

23 Nov 2020, 13:22

Yes, I was just bitten by this strange logic.
I had simulated a bunch of data with confidence intervals and wanted to check coverage, so I applied
covered = inrange(true value, lower CI, upper CI)
It wasn't until some time later that I realised that one of the statistical methods wasn't converging all the time, so I had a few instances where the CIs were missing.
Unfortunately for me, that meant that covered = 1 if there were no confidence intervals (rather than 0 as I would have expected).
e.g. inrange(2,.,.) returns 1 (true)
Won't make that mistake again!
I still think that it's bizarre that it didn't give either a 0 or a missing for that situation!
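
One way to guard against this is to require nonmissing limits explicitly. The sketch below is only an illustration: the variable names truth, lo, and hi and the three data rows are hypothetical, standing in for the true value and the lower and upper confidence limits.

```stata
clear
input float(truth lo hi)
2 1   3
2 2.5 3
2 .   .
end

* Unguarded: the row with missing limits is counted as covered
generate byte covered_naive = inrange(truth, lo, hi)

* Guarded: both limits must be nonmissing for the row to count
generate byte covered = inrange(truth, lo, hi) & !missing(lo, hi)

list

* Third row: missing limits give 1 unguarded and 0 guarded
assert covered_naive[3] == 1
assert covered[3] == 0
```

If a missing result is preferred to 0 when the limits are missing, -generate byte covered = inrange(truth, lo, hi) if !missing(lo, hi)- leaves those rows missing instead.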
Leonardo Guizzetti

23 Nov 2020, 14:11

The subtle logic here is that if -a- and -b- both evaluate to missing (.), then the function simply tests whether -z- is a real (nonmissing) number.