  • Probabilistic record linkage

    I am trying to run a probabilistic record linkage between two datasets with no common record identifier. I am using the command reclink2. The command is the following:

    reclink2 Firmname add1 pobox unit bldg floor using Amadeus_TOTAL, idmaster( stn_AccountHolderName ) idusing( stn_Nombreempresa ) gen( scorematching )

    However, the following error message appears:

    Going through 7543 observation to assess fuzzy matches, each .=5% complete
    2VALORISEAMEL invalid name
    post: above message corresponds to expression 1, variable stn_AccountHolderName
    r(198);


    The value "2VALORISEAMEL" is what stn_AccountHolderName contains in one of the observations. But stn_AccountHolderName is a string variable (as required by reclink2):

    d stn_AccountHolderName

                    Storage   Display    Value
    Variable name     type    format     label      Variable label
    ---------------------------------------------------------------------
    stn_AccountHo~e   str170  %170s                 official name


    I do not know how to solve this. Thank you very much for your help.

  • #2
    reclink2 is from the Stata Journal (FAQ Advice #12). I do not use this command, but the error message suggests that the values of stn_AccountHolderName need to be valid Stata names. See

    Code:
    help naming_conventions
    You could work around this by passing the variable through -strtoname()-, but length limits may still bite you: Stata names hold at most 32 characters, and your variable is str170.

    Code:
    replace stn_AccountHolderName = strtoname(stn_AccountHolderName)
    See

    Code:
    help strtoname()
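
    Before overwriting the original variable, it may be worth checking what -strtoname()- does to your data. The following is only a sketch, not something tested with reclink2: name_id is a hypothetical variable name, and it assumes the 32-character limit on Stata names is what truncates the converted values.

    ```stata
    * A leading digit is not allowed in a Stata name, so strtoname()
    * prefixes such values with an underscore
    display strtoname("2VALORISEAMEL")

    * Keep the original names intact and store the converted version
    * in a new variable (name_id is a hypothetical name)
    generate str32 name_id = strtoname(stn_AccountHolderName)

    * How many original names are longer than 32 characters
    * and would therefore be truncated?
    count if strlen(stn_AccountHolderName) > 32

    * Did truncation or character replacement create duplicates?
    duplicates report name_id
    ```

    If the count is nonzero or duplicates appear, the converted values no longer identify firms uniquely, so they may not be safe to pass to idmaster().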
