Dear Statalisters,
I am trying to calculate a pairwise Jaccard similarity measure and am having trouble figuring out how to do so. My data are in the following format: the first variable, assignee_id, identifies the firm, and the other variables (law_1 to law_5) are dummy variables for the legal partners, with a 1 indicating that the firm has worked with that legal partner. I want to calculate a pairwise similarity measure between firms based on how similar they are in their use of legal partners. I have tried a few different approaches without getting anywhere, so your help with the syntax would be much appreciated. I've attached a data example below.
Thanks for your help
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str32 assignee_id byte(law_1 law_2 law_3 law_4 law_5)
"00d92f99f43508d37de79da7051b43c7" 1 0 0 0 0
"00e5262f320cbda9f15490debbe80858" 0 1 0 1 0
"031b354668d5ceefc7b4bb3ba57664d4" 0 0 0 1 1
"03810188291c60318b5b0da566c266fb" 0 1 0 1 0
"054d563b447b317f56d940f5e3dd7b39" 1 0 0 0 0
"05695a60b69eb9a0f6e781debe23e9cc" 1 0 0 0 0
"062af6b4d9f7708cfd5e659cd13a3726" 1 0 1 0 0
"081507e638fca84980f88a3c3f5cd1fa" 0 0 0 0 0
"099c2e138f83bf0366539bddfda6b2e2" 0 0 1 0 0
"09fc005ad2872886a676a2f4197ce018" 0 0 0 0 0
"0a00649f54947198768fa954f8756563" 0 0 0 0 0
"0a21a0cbd50fe6558b13d773effc9eb1" 0 0 1 0 0
"0a302a7b505844998614e26c7c26d4a0" 0 1 0 1 1
"0a4642a77d52197c97f5d592966b68d7" 0 1 1 1 1
"0a74e8eea755f3ab33162a52dc87bb5d" 1 0 0 0 0
"0bb9626cc72bbfaf9ae174a022ceb086" 0 1 0 0 0
"0c65f80fcfe79b0c4732a7ebc645da8c" 0 1 0 0 0
"0ceb8b624ea012dea6d0c3705d4f547e" 0 1 1 0 0
"0d5c37ddbc9800bfc84774afe4b36faa" 1 0 0 0 1
"0d5fb33b90b1825b0003a1573d7477fe" 0 0 0 0 0
"0d6c6c25cf34819e50fd97318db9b699" 0 1 0 0 0
"0ee26da954c6572b783432f619a301e3" 1 0 0 0 0
"0f4a6ddb6c4a854440e1123924820706" 0 0 0 0 1
"0fa5a08e051f6bb467854f4bbb913a46" 0 0 1 1 0
"1005528d1a3c548b2403fba94f0927f5" 1 0 0 1 0
"107da3bb737c53c0d39645f72ede8b86" 0 0 0 0 1
"10b108b4ee97bab2304d092590c0bf7c" 0 0 0 0 1
"11127e943b93352979514b124179eb94" 1 0 0 1 0
"11f00a94b4fe1138e00af83137db2fac" 0 0 1 0 0
"134d75dd2f4984f02db90d441336fd2e" 0 1 1 0 0
end
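To make the target concrete: for two firms, the Jaccard similarity is the number of legal partners both have used divided by the number that at least one of them has used. One possible way to get this in Stata is sketched below. It is only a sketch, assuming the example data are already in memory and that there are exactly five law_ dummies. The names id_i, id_j, assignee_id_j, lawj_*, n_both, n_either and jaccard are made up for illustration and are not part of the data. It pairs every firm with every other firm using cross, which may be too slow or too large for a dataset with many firms.

```stata
* Sketch only: assumes the example data above are already in memory.
* id_i, id_j, assignee_id_j, lawj_*, n_both, n_either and jaccard are
* illustrative names, not part of the original data.
gen long id_i = _n
tempfile firms
save `firms'

* Make a renamed copy so the two sides of each pair do not clash.
rename (id_i assignee_id) (id_j assignee_id_j)
rename law_* lawj_*

* Form every combination of the renamed copy with the saved original.
cross using `firms'
keep if id_i < id_j          // keep each unordered pair once, drop self-pairs

* Count legal partners used by both firms and by at least one of them.
gen byte n_both = 0
gen byte n_either = 0
forvalues k = 1/5 {
    replace n_both   = n_both   + (law_`k' == 1 & lawj_`k' == 1)
    replace n_either = n_either + (law_`k' == 1 | lawj_`k' == 1)
}

* Jaccard = intersection / union; left missing when neither firm
* has used any legal partner, since the ratio is undefined there.
gen double jaccard = n_both / n_either if n_either > 0
```

The result has one row per pair of firms, identified by assignee_id and assignee_id_j. How to treat pairs in which neither firm has any legal partner (here left missing) is a judgment call that depends on the analysis.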