Using the example data below, what is the most efficient way to produce a dataframe that contains just one row per unique entry of the four "place" variables? I have only used -collapse- with numeric variables before, so I'm not sure whether it can handle string variables like these. If it can't, is there some way to flag duplicates and then drop observations based on that new flag variable?
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id survey_wave) str9 place1 str13 place_clean1 str10 place2 str15 place_clean2
1 1 "harvard"   "Harvard"       ""           ""
1 2 "harvard"   "Harvard"       ""           ""
1 3 "harvard"   "Harvard"       ""           ""
1 4 ""          ""              ""           ""
1 5 "aiello"    "Aiello's"      "ming"       "Ming Restaurant"
2 1 "uchicago"  "U of Chicago"  ""           ""
2 3 "uchicago"  "U of Chicago"  ""           ""
2 4 "uchicago"  "U of Chicago"  ""           ""
2 4 "uchicago"  "U of Chicago"  ""           ""
2 5 "uchicago"  "U of Chicago"  ""           ""
3 1 "aiellos"   "Aiello's"      ""           ""
3 1 "aiello"    "Aiello's"      ""           ""
3 2 "none"      ""              ""           ""
3 2 "our place" "Our Place"     "sheetz"     "Sheetz"
3 3 "our place" "Our Place"     "sheetz"     "Sheetz"
3 5 "our place" "Our Place"     ""           ""
4 1 "uchicago"  "U of Chicago"  "jiffy lube" "Jiffy Lube"
4 1 "aiello"    "Aiello's"      "jiffy lube" "Jiffy Lube"
4 2 "truck co"  "Truck Company" "jiffy lube" "Jiffy Lube"
4 2 "truck co"  "Truck Company" "jiffy lube" "Jiffy Lube"
4 3 "truck co"  "Truck Company" ""           ""
end
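
For reference, the flag-and-drop idea could be sketched roughly as follows. This is only a sketch under assumptions: it reads "one row per unique entry" as one row per distinct combination of the four place variables, it assumes id and survey_wave can be discarded, and dup_flag is a hypothetical variable name. It relies on the built-in -duplicates- command, which works on string variables.
Code:
* Optional: flag observations whose place combination occurs more than once
* (dup_flag = number of other observations sharing the same combination; 0 if unique)
duplicates tag place1 place_clean1 place2 place_clean2, generate(dup_flag)
* Keep the first observation of each distinct combination; -force- is required
* whenever a varlist is given, because the remaining variables may differ
duplicates drop place1 place_clean1 place2 place_clean2, force
* Retain only the four place variables (this also discards dup_flag)
keep place1 place_clean1 place2 place_clean2

Note that the all-blank combination and variant spellings such as "aiellos" versus "aiello" count as distinct entries under this approach, so they would each survive as their own row.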