  • Reclink error message

    Hi all,

    Can anyone help me with the error message I get when I try to link two datasets using reclink? The two variables I try to link by are:


    airlinemaster (which is the airline name, for example in the master dataset it could show up as "Air Canada" and in the using dataset it could show up as "Air Canada LTD.")

    and

    codemaster (which is a two- or three-character abbreviation, for example "CA" for "Air Canada" in the master dataset and "CCA" for "Air Canada LTD." in the using dataset).


    my code is this:

    reclink airlinemaster codemaster using `wikipedia', gen(myscore) idm(id_master) idu(id_wiki) _merge(merged) minbigram(.80)

    when I run it I get this:

    reclink airlinemaster codemaster using `wikipedia', gen(myscore) idm(id_master) idu(id_wiki) _merge(merged)
    > minbigram(.80)

    9 perfect matches found

    Going through 1682 observation to assess fuzzy matches, each .=5% complete
    ...........) required
    r(100);


    Thanks,
    Marilyn.

  • #2
    Somebody else encountered the same problem recently in a post on this forum. Nobody responded with an answer.

    -reclink- is a user written program, and its author does not, as far as I know, participate in the forum.

    I used to use -reclink- often, but in recent years have had less need for it, and I've never encountered this problem myself. There are no missing right parentheses in your command line, so presumably this is happening somewhere internal to -reclink-. A quick check of the ado file finds no point at which -reclink- itself specifically returns an error code of 100. So -reclink- is probably generating something that leads to unbalanced parentheses being passed on to some command that -reclink- itself calls. The possibilities are nearly endless.

    Suggestion: set trace on (probably with a trace depth of 1) and re-run the command. If that doesn't make the problem obvious, post Stata's output in a code block (see FAQ if you're not sure how to do that) and maybe someone will figure it out.
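
    A minimal sketch of that suggestion, reusing the command from the original post. The log file name is a placeholder and not something from this thread. Logging is optional, but it makes a long trace easier to search and to post.

    ```stata
    * Hypothetical log file name; the trace output can be long.
    log using reclink_trace.log, text replace

    * Show only the top level of -reclink-'s own code.
    set tracedepth 1
    set trace on

    * `wikipedia' is the local macro from the original post, so this must run
    * in the same do-file session that defines it.
    reclink airlinemaster codemaster using `wikipedia', gen(myscore) idm(id_master) idu(id_wiki) _merge(merged) minbigram(.80)

    * If the command stops with an error, these two lines will not run;
    * type them by hand afterwards.
    set trace off
    log close
    ```

    The last traced lines before the error message are usually the ones worth posting.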



    • #3
      The likely cause of the problem is a quotation mark or parenthesis in one of the airline names. You can use regular expressions to try to find the problematic observation, or -set trace on- and wait for the program to exit again, at which point the problem observation should be clear.
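
      One way to search for such observations, as a rough sketch: it assumes the variable and ID names from the original post, and the flag variable name suspect is made up for illustration. The same check would need to be run on the using dataset as well.

      ```stata
      * Flag names containing a parenthesis or a comma (regular expression)
      * or a double quote (char(34) is the double-quote character).
      generate byte suspect = regexm(airlinemaster, "[(),]") | (strpos(airlinemaster, char(34)) > 0)

      * Inspect the flagged rows.
      list id_master airlinemaster if suspect
      ```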



      • #4
        I realize this is an older post, but I had a similar issue (also involving airline data as it turns out). For me, the error occurred because one airline was listed as "ATLANTIC COAST AIRLINES D/B/A UNITED EXPRESS (STE" in the master data set. To find the issue, I set trace on and waited until the error arose. Then I could correct that specific error in the data.

        That may allow others to get around this problem.

        Cheers



        • #5
          Hi all,

          Can anyone help me with the error message I get when I try to merge two datasets using reclink?
          Error message appended below.
          -----------------------------------------------------------------------------------------------------------------
          Going through 6895 observation to assess fuzzy matches, each .=5% complete
          "....option PVT not allowed"

          Kindly Help!

          Regards
          Sandeep.S



          • #6
            I don't know how to answer your question completely, but you would be well advised to follow the advice given in the earlier posts of this thread. Run it with trace set on to see where the error arises and which command is complaining that it has been given an illegal PVT option. There is a good chance that one of the values of your linking variables contains ", PVT" or something like that. Stata reads whatever follows a comma in a command as options, so a stray comma in a name can make the text after it look like an option.



            • #7
              I solved a similar error by removing all parentheses from the string variables used for matching, in both the master data and the using data. See the code example below for how to remove the parentheses:

              replace varname = subinstr(varname , "(", "", .)
              replace varname = subinstr(varname , ")", "", .)


              Try with other special characters if you still encounter problems.
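
              A sketch of how that could be extended to several characters at once. The variable names are taken from the original post, and the list of characters is only a guess at likely troublemakers. It would need to be run on both the master and the using data before calling -reclink-.

              ```stata
              * Variable names are assumptions; substitute your own linking variables.
              foreach v of varlist airlinemaster codemaster {
                  * Strip parentheses and commas.
                  foreach ch in "(" ")" "," {
                      replace `v' = subinstr(`v', "`ch'", "", .)
                  }
                  * Strip double quotes; char(34) is the double-quote character.
                  replace `v' = subinstr(`v', char(34), "", .)
                  * Collapse repeated blanks left behind and trim the ends.
                  replace `v' = trim(itrim(`v'))
              }
              ```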

              Best,
              Kristoffer



              • #8
                Thanks Kristoffer! That works for me.



                • #9
                  I know that this is a really old post. For anyone who runs into the same problem: Kristoffer's method does work, but another option is to use reclink2 instead of reclink. It's another user-written command available from SSC. It is basically an extension of reclink, so without the npairs() and manytoone options it should give the same output as reclink. The syntax is identical for both commands. I encountered this problem with reclink, but not with reclink2 (both on the exact same data).
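
                  As a rough sketch, this is the command from the first post with only the command name changed, on the assumption that the options carry over unchanged as described above. -search- is used here to locate the package rather than assuming a particular install source.

                  ```stata
                  * List where reclink2 is distributed, then install it from the result shown.
                  search reclink2, all

                  * Same variables and options as the original -reclink- call.
                  reclink2 airlinemaster codemaster using `wikipedia', gen(myscore) idm(id_master) idu(id_wiki) _merge(merged) minbigram(.80)
                  ```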

