
  • match with reclink2

    Dear Statalisters:

    I am trying to use reclink2 to match a database against an exact copy of itself. The goal is to identify schools that have a similar name and are located at the same address. Obviously every school matches itself perfectly (it is the same observation), but with the option npairs(2) I was hoping to also see other schools with a similar name and location, so that I can identify potential duplicates. I suspect that several people have added observations to this geodatabase.

    These are the lines that I am using. The result contains only perfect matches, even though I am using the option npairs(2).

    What may be the problem?

    clear
    cd "C:\Users\hsantos\Desktop\BID\Haití\GIS Haití\Imperfect match"
    use "Base GIS"
    gen idm=_n
    sort idm
    reclink2 eco_nom eco_locali eco_rue using "Base GIS copy.dta", gen(myscore) idm(idm) idu(objectid) manytoone npairs(2)

    Added: objectid= identifier from Base GIS copy.dta myscore = matching score
    Observations: Master N = 15920 Base GIS copy.dta N= 15920
    Unique Master Cases: matched = 15920 (exact = 15920), unmatched = 0

    Best,

    Humberto

  • #2
    Hi Humberto,

    The problem is that reclink2 deletes all perfect matches, i.e. you can no longer use those observations for imperfect matches.
    I had a similar problem and solved it the following way:
    The idea is to loop through all schools.

    1) Open your data, generate a school identifier running from 1 to N, and save it (BASIC data).
    2) Select the first school (school 1), drop this school, and save the data. This will be your first USING data.
    3) Open the BASIC data again; this time keep school 1 and drop all the other schools. This is your MASTER data.
    4) Run the reclink2 command and save your results.

    Loop through all schools, repeating steps 2) to 4) for each one, and append the reclink2 results.
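
    The steps above could be sketched roughly as follows. This is only an untested outline, not the exact code I used: the names school_id, school_id_u, "basic.dta" and "using_i.dta" are made up for illustration, the matching variables are taken from Humberto's post, and I assume reclink2 accepts the spelled-out option names idmaster() and idusing(). With about 16,000 schools the loop will be slow, so try it on a handful of schools first.

    ```stata
    * Step 1: BASIC data with a school identifier from 1 to N
    use "Base GIS", clear
    gen long school_id = _n
    save "basic.dta", replace        // hypothetical file name
    local N = _N

    tempfile results
    forvalues i = 1/`N' {
        * Step 2: USING data = every school except school i
        use "basic.dta", clear
        drop if school_id == `i'
        rename school_id school_id_u
        keep school_id_u eco_nom eco_locali eco_rue
        save "using_i.dta", replace  // hypothetical file name, overwritten each pass

        * Step 3: MASTER data = school i only
        use "basic.dta", clear
        keep if school_id == `i'

        * Step 4: fuzzy match school i against all other schools
        reclink2 eco_nom eco_locali eco_rue using "using_i.dta", ///
            gen(myscore) idmaster(school_id) idusing(school_id_u) ///
            manytoone npairs(2)

        * Append the results collected in earlier passes, then store them
        if `i' > 1 append using `results'
        save `results', replace
    }

    use `results', clear
    gsort -myscore
    ```

    Because school i is never in the USING data, it cannot match itself, so any high score points to a different record that may be a duplicate.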

    I hope this helps.
    Best,
    Christoph.



    • #3
      -matchit- (available from SSC) can be used for exactly this kind of problem.
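
      A minimal sketch of how this might look for the school data in #1. The variable names id_m, id_u, text_m and text_u and the file name "matchit_candidates.dta" are hypothetical; I assume the three eco_* variables are strings, and that the default similarity method and threshold are acceptable as a starting point.

      ```stata
      * ssc install matchit
      * ssc install freqindex       // matchit relies on freqindex

      * Build one text field per school and save it as the using file
      use "Base GIS", clear
      gen long id_u = _n
      gen text_u = eco_nom + " " + eco_locali + " " + eco_rue   // assumes string variables
      keep id_u text_u
      save "matchit_candidates.dta", replace   // hypothetical file name

      * The same records, renamed, serve as the master data
      rename (id_u text_u) (id_m text_m)
      matchit id_m text_m using "matchit_candidates.dta", idusing(id_u) txtusing(text_u)

      * Keep each pair once and discard every school's match with itself
      drop if id_m >= id_u
      gsort -similscore
      ```

      The data left in memory hold one row per candidate pair with its similarity score, so the likeliest duplicates appear at the top.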
      Someone has asked me the above question in a private message, but I guess the answer may interest other Statalist users. Moreover, other people may have a
