Hi Marc,
I'm a little confused by your question. Why do you need to include the year and GVKEY in the fuzzy match? Do you think there might be typos in these variables?
If that is not the case, I suggest the following strategy:
1. Keep just the unique GVKEY and name pairs from both files, and join them by GVKEY
2. Run matchit using the column syntax
3. Drop the "bad" matches (manual inspection is recommended)
4. Merge the resulting file back with the master (or smaller) file
5. Merge back with the using (or larger) file, adding the year as an additional merge condition.
I think this is the fastest way to do it. The code below implements this strategy, assuming your files are named master.dta and using.dta:
Code:
// unique name/GVKEY pairs from the master file
tempfile masterunique
use master.dta, clear
keep CFO_Name Acq_ID_Compustat
rename Acq_ID_Compustat GVKEY
duplicates drop
save `masterunique'

// unique name/GVKEY pairs from the using file, paired within GVKEY
use using.dta, clear
keep GVKEY Director_Name
duplicates drop
joinby GVKEY using `masterunique'

matchit Director_Name CFO_Name, score(minsimple)
// matchit Director_Name CFO_Name, w(log) g(similwgt) score(minsimple) // if not too big use this one

// use this to check the data
gsort GVKEY -similscore
br if similscore>.2
keep if similscore>.7 // check first if this threshold makes sense to your data

// merge back with master file
gen Acq_ID_Compustat=GVKEY
joinby Acq_ID_Compustat CFO_Name using master.dta

// merge back with using file (BUT ONLY ON THE MATCHING YEARS)
gen YEAR=Deal_Announced
joinby GVKEY Director_Name YEAR using using.dta
save megamerge.dta