Hi,
I'm using reclink to match names between two large datasets. There are no other variables; the only matching variable is a name consisting of a single word.
Here's the command:
reclink v1 using usingfile, idm(idm) idu(idu) gen(score)
While it works well in general (for >100,000 names), I found some observations where the score is 1 (i.e., a perfect match) even though the two matched names are quite different. Examples include the following:
word in master file: ssssssssssid
word in using file: hookless
word in master file: booooooo
word in using file: nohood
Does anyone know what's happening here? Or, if you know of other methods that work well for this kind of one-word matching, I welcome your suggestions!