recode to missing

Euslaner

Join Date: Apr 2014

Posts: 189
#1

04 Jun 2018, 09:43

I have done this before, but now I can't.

I typed:

recode mip (-1 = .)

but Stata (15) tells me:

recode only allows numeric variables

But the manual includes this example:

recode x (9=.a) (8=.), gen(z)

What is going wrong? Thanks for any help.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

04 Jun 2018, 10:14

The message tells you what the problem is: -recode- only works with numeric variables, not string variables. -recode- has no problem with recoding to missing. But it won't work with mip because mip is a string variable. That's what it's telling you.
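
A quick way to confirm the storage type before recoding, sketched here on a made-up toy data set (only the variable name mip comes from this thread; the values are hypothetical):

```stata
* Hypothetical toy data: mip stored as a string variable
clear
input str2 mip
"1"
"-1"
"2"
end

* describe reports the storage type; str# means the variable is string
describe mip

* confirm exits with an error if mip is not numeric; capture traps that error
capture confirm numeric variable mip
if _rc != 0 {
    display "mip is not numeric, so recode will refuse it"
}
```

If -describe- shows a str# storage type, the -recode- error is exactly what you should expect.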
Nick Cox

Join Date: Mar 2014

Posts: 35698
#3

04 Jun 2018, 10:45

It follows that mip is likely to be a candidate for destring. You're thinking it is, or should be, numeric, but somehow it's string.

Code:

tab mip if missing(real(mip))

should show values that led to a mis-reading at some point.
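
As a rough sketch of the whole sequence, assuming a single stray non-numeric entry is what made mip string (the toy values below, including "NA", are invented for illustration):

```stata
* Hypothetical toy data: one non-numeric entry forces mip to be string
clear
input str2 mip
"1"
"-1"
"NA"
"3"
end

* Show the values that real() cannot convert to a number
tabulate mip if missing(real(mip))

* Convert to numeric; force turns non-numeric entries into system missing,
* so use it only after inspecting the offending values above
destring mip, replace force

* recode now accepts the variable
recode mip (-1 = .)
```

Note that force discards the non-numeric entries silently, so it is worth deciding what each of them should become before using it.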
Euslaner

Join Date: Apr 2014

Posts: 189
#4

04 Jun 2018, 10:45

Thanks. I had no reason to expect it to be string, so I didn't check.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#5

04 Jun 2018, 11:47

Let me digress from this particular question per se to a more general principle. It is common on Statalist to see posts where Stata's error message asserts that the data set is not suitable for the command. Perhaps the most frequent example is where somebody wants to -merge 1:1-, or -merge m:1-, or -merge 1:m-, and Stata refuses because the specified merge key variable(s) does (do) not uniquely identify the observations as required. But there are plenty of others.
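
For the -merge- case, one way to check the key yourself before merging is sketched below. The data set and the variable names id, x, and dup are hypothetical, chosen only to show a key with one duplicated value:

```stata
* Hypothetical toy data: id = 2 appears twice
clear
input id x
1 10
2 20
2 30
3 40
end

* isid exits with an error when id does not uniquely identify observations;
* capture noisily shows the message without stopping a do-file
capture noisily isid id

* duplicates report counts how many observations share each key value
duplicates report id

* Tag observations whose key value is duplicated so they can be inspected
duplicates tag id, generate(dup)
list if dup > 0
```

If -isid- complains here, -merge 1:1- on the same key will complain too, and the listing shows which observations to look at.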

Some of these questions may arise because the language of the error message is unfamiliar to a new user. "uniquely identify" is somewhat jargonish and its meaning may not be obvious if you haven't heard it before. But it often becomes clear in the thread that the user simply doesn't entertain the idea that the data set is the source of the problem. Now, there are certainly situations where Stata's messages are misleading or confusing. But I have to say that when Stata complains about the data, Stata is usually right.

Moral of the story: if Stata complains about the data not meeting the requirements of a command, start by assuming that Stata has it right and check your data. Stata rarely gets this kind of thing wrong. Data sets, by contrast, frequently contain surprises. Even the most carefully curated data sets obtained from the most reliable sources have a high probability of containing outright errors, or having at least some instances of things not being quite as advertised/expected. It's just the way of the world. And particularly in a situation where a command has worked previously and now doesn't, look first to the data as the source of the problem.

Be skeptical of your data. No matter how well you think you know your data set, if Stata demurs, check it out: we generally don't know our data sets as well as we think/hope/would like.