  • reclink perfect match error

    Hi,

    I'm using reclink to match names between two big datasets. There are no other variables to match on, just one string variable that contains a single word.

    Here's the command:
    reclink v1 using usingfile, idm(idm) idu(idu) gen(score)

    While it works well in general (for >100,000 names), I found that there are some observations where the score is 1 (i.e., perfect match) but the matched/matching names are quite different. Examples include the following:

    word in master file: ssssssssssid
    word in using file: hookless

    word in master file: booooooo
    word in using file: nohood

    Does anyone know what's happening here? If you know of other approaches that work well for this kind of one-word matching, I'd welcome your suggestions!

  • #2
    Hi Hyejun, did you ever figure out an answer to your question? I am using dtalink by Keith Kranker (https://ideas.repec.org/c/boc/bocode/s458504.html), which is similar to reclink, and I am having the same issue. I am linking two datasets, and even though I use the bestmatch option, I have many instances in which three records from the using data are flagged as matches to the same record in the master data with a high match score, although two of these records have no matching variables in common with the master record. I can clearly see one good match and two extra records with the maximum score that do not match at all. I am curious whether you found a solution that may apply to my issue as well.
