reclink perfect match error

Hyejun Kim

Join Date: Mar 2024

Posts: 1
#1


01 Mar 2024, 03:29

Hi,

I'm using reclink to match the names between two big datasets (no other variables; just one variable that is composed of one word).

Here's the command:
reclink v1 using usingfile, idm(idm) idu(idu) gen(score)

While it works well in general (for >100,000 names), I found that there are some observations where the score is 1 (i.e., perfect match) but the matched/matching names are quite different. Examples include the following:

word in master file: ssssssssssid
word in using file: hookless

word in master file: booooooo
word in using file: nohood

Does anyone know what's happening here? Or, if you know any other ways that work well with this type of one-word-matching, I welcome your suggestions!
Cortnie Shupe

Join Date: Dec 2015

Posts: 11
#2

12 Dec 2024, 14:16

Hi Hyejun, did you ever figure out an answer to your question? I am using dtalink by Keith Kranker (https://ideas.repec.org/c/boc/bocode/s458504.html), which is similar to reclink, and I am having the same issue. I am linking two datasets, and even though I use the bestmatch option, I have many instances in which three records from the using data are flagged as matches to the same record in the master data. All three have a high match score, although two of them have no matching variables in common with the master record: I can clearly see one good match and two extra records with the maximum score that do not match at all. I am curious whether you found a solution that may apply to my issue as well.
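
For the reclink case in the first post, one way to see how widespread the problem is would be to flag every pair that gets a (near-)perfect score even though the two strings are not identical. The lines below are only a minimal sketch and not a fix. They assume that reclink has just been run exactly as in post #1, that the using-file copy of v1 was stored under reclink's default "U" prefix as Uv1, and that a cutoff of 0.999 is an acceptable stand-in for "score of 1". The variable name suspect is made up for illustration.

```stata
* Assumption: data in memory are the output of
*   reclink v1 using usingfile, idm(idm) idu(idu) gen(score)
* so v1 is the master name, Uv1 the matched using name, score the match score.

* Flag pairs scored as (near-)perfect whose strings actually differ.
* Missing values are excluded explicitly because Stata treats a missing
* number as larger than any nonmissing number.
generate byte suspect = (score > 0.999) & !missing(score) ///
    & (v1 != Uv1) & !missing(Uv1)

* How many suspicious "perfect" matches are there?
count if suspect

* Inspect them side by side.
list idm idu v1 Uv1 score if suspect, noobs
```

Looking at the flagged pairs together may show whether they share a pattern, such as the long runs of a repeated character in the two examples above. That could help narrow down the cause or make a reproducible example for the package author.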