problem with the variable _merge after joinby

Ylenia Curci

Join Date: Sep 2017

Posts: 72
#1

27 May 2021, 16:08

Hello, I am using Stata 15 and I am having an issue with joinby. After I join the two datasets, I have 14 observations marked "only in using data". Then I type list id if _merge == 2 and I get a list of 14 ids. So far so good.

But then, if I browse one of the ids in that list, its _merge value says "only in master data".

How on earth is this possible????

Thank you for your help,

Ylenia
Ken Chui

Join Date: Aug 2014

Posts: 1057
#2

27 May 2021, 20:30

Make sure that when you are browsing, you are looking up the case under the variable called "id" and not in the first column of the browser. That first column is the row number, which has nothing to do with the id.

A way to check is, instead of manually browsing, to take a few ids and type something like this (2, 5, 17, 48 are made-up ids):

Code:

browse id _merge if inlist(id, 2, 5, 17, 48)

Last edited by Ken Chui; 27 May 2021, 20:33.
Ylenia Curci

Join Date: Sep 2017

Posts: 72
#3

28 May 2021, 00:49

Hi Ken,
Actually, I am not browsing the whole dataset; I wrote br if id=="xxx", so I am sure I am looking at the right observation. Moreover, the id is a name, not a number, so it is pretty much impossible to make the mistake you suggested.
Any other ideas?
Andrew Musau

Join Date: Oct 2014

Posts: 10027
#4

28 May 2021, 02:19

I would update Stata first and see if the issue persists.

Code:

update all

Also, browse while conditioning on a specific value of the variable _merge and check whether the inconsistency still shows up. If you have very long numeric identifiers, it may be a precision issue.

Code:

browse if _merge==2
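On the precision point, here is a minimal sketch of how a long numeric identifier can fail an equality test when it is stored as a float. The variable names id_float and id_long and the value 123456789 are made up for illustration; nothing here comes from the original poster's data.

```stata
* Hypothetical toy data: one observation, same identifier stored two ways
clear
set obs 1

* float has about 7 digits of precision, so a 9-digit id gets rounded
gen float id_float = 123456789

* long stores this integer exactly
gen long id_long = 123456789

* show what was actually stored in each variable
format id_float id_long %12.0f
list id_float id_long, noobs

* the float copy no longer equals the value that was typed in
count if id_float == 123456789
count if id_long == 123456789
```

If the counts differ, the usual remedy is to store such identifiers as long, double, or string before joining.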
Ylenia Curci

Join Date: Sep 2017

Posts: 72
#5

28 May 2021, 04:13

Yes, there was an inconsistency, a bloody extra space after the end of the id! I think the forum really needs a facepalm emoticon.

Thank you.
Clyde Schechter

Join Date: Apr 2014

Posts: 29914
#6

28 May 2021, 21:13

Everybody who works with string variables needs to be familiar with the trim() and itrim() functions. When I create a data set one part of my do file is almost always:

Code:

ds, has(type string)
foreach v in `r(varlist)' {
    replace `v' = trim(itrim(`v'))
}

precisely to avoid such problems. Of course, you have to be sure that padded blanks are not actually meaningful distinctions in your data, but that is rarely the case, and relying on them would be a really bad data practice anyway.
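Before overwriting anything, it can help to see which values would be affected. The following is a minimal sketch under assumptions: the string variable is called id, and the three toy names and the helper variable padded are made up for illustration. It only flags leading or trailing blanks with trim(); itrim() would additionally collapse repeated internal blanks.

```stata
* Hypothetical toy data: the second id gets one trailing blank
clear
input str10 id
"anna"
"anna"
"marco"
end
replace id = id + " " in 2

* flag ids that differ from their trimmed version
gen byte padded = id != trim(id)
list id padded if padded, noobs

* an exact string comparison misses the padded copy
count if id == "anna"

* after trimming, both copies match
replace id = trim(id)
count if id == "anna"
```

Running the same check on the key variable in both files before joinby should reveal any ids that would otherwise end up unmatched.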