I am looking for help on how to calculate the Levenshtein distance between all possible combinations of words from two columns (STRDIST is a module that calculates the Levenshtein distance). I would like the output in two formats. The first is a list of all possible pairs with their Levenshtein distances (three columns). The second is an N x N matrix, with the words of one variable as rows, the words of the other as columns, and the Levenshtein distances in the cells. Both outputs should be exported to Excel or CSV. My data look like this:
Thank you very much
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str80 word1 str69 word2
"11-VITAMIN" "11-VITAMIN"
"3 B" "3 B"
"3 B FORTE" "3 B FORTE"
"3 BEE VITAMINS" "3 BEE VITAMINS"
"3-VITABEE" "3-VITABEE"
"3-VITADON" "3-VITADON"
"3.TRIGYNO, FOR EXPORT TO VIETNAM" "3.TRIGYNO, FOR EXPORT TO VIETNAM"
"693 MALBAC (MB)" "693 MALBAC (MB)"
"9 VITAMINS" "9 VITAMINS"
"A-CLAV 1000" "A-CLAV 1000"
"A-CLAV 375" "A-CLAV 375"
"A-CLAV 625" "A-CLAV 625"
"A-CNOTREN" "A-CNOTREN"
"A-CNOTREN®" "A-CNOTREN®"
"A-ROXIME 125" "A-ROXIME 125"
"A-ROXIME 250" "A-ROXIME 250"
"A-ROXIME 500" "A-ROXIME 500"
"A-TUSSIN TABLETS" "A-TUSSIN TABLETS"
"A MAGSIL TABLETS" "A MAGSIL TABLETS"
"A MOXI T.O. 500" "A MOXI T.O. 500"
end
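For reference, here is a rough sketch of one way the two outputs might be produced. It is not a tested solution. It assumes the data above are in memory, that -strdist- is installed from SSC, and that -cross- is an acceptable way to form all pairs. The variable names lev and j and the file names lev_pairs.csv and lev_matrix.csv are made up for illustration.

```stata
* Assumption: -strdist- is installed (ssc install strdist)

* 1. Save the distinct word2 values to a temporary file
preserve
keep word2
duplicates drop
tempfile w2
save `w2'
restore

* 2. Form every word1 x word2 combination
keep word1
duplicates drop
cross using `w2'

* 3. Levenshtein distance for each pair (lev is a hypothetical name)
strdist word1 word2, generate(lev)

* Output 1: long format with three columns
export delimited word1 word2 lev using "lev_pairs.csv", replace

* Output 2: matrix with one row per word1 and one column per word2.
* -reshape- needs a numeric j, so the columns are numbered lev1, lev2, ...
* in the sorted order of word2 rather than named after the words.
egen long j = group(word2)
drop word2
reshape wide lev, i(word1) j(j)
export delimited using "lev_matrix.csv", replace
```

With the 20 distinct words in each column above, the long file would hold 400 pairs. For large word lists the number of pairs grows as N x N, so memory may become the limiting factor.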