Hi Statalist, long time lurker, first time poster.
I am currently trying to clean my data. I began with only full names, which were entered by the people themselves, so there is no consistent format: some entered a full name, some used initials, and so on.
I have gotten as far as what you see below using this code:
Code:
generate contact_peron_first = substr(contact_person, 1, strpos(contact_person, " ") - 1)
generate contact_person_last = substr(contact_person, strpos(contact_person, " ") + 1, .)
replace contact_person_last = strtrim(contact_person_last)
generate contact_person_last_1 = substr(contact_person_last, 1, strpos(contact_person_last, " ") - 1)
generate contact_person_last_2 = substr(contact_person_last, strpos(contact_person_last, " ") + 1, .)
Code:
input str45 contact_person str14 contact_peron_first str16 contact_person_last_1 str17 contact_person_last_2
Full Name First Last_1 Last_2 Last_3
"K. V Sarathkumar" "K." "V" "Sarathkumar"
"Katungal Padmanabhan Sasidharan" "Katungal" "Padmanabhan" "Sasidharan"
"Katungal Padmanabhan Sasidharan" "Katungal" "Padmanabhan" "Sasidharan"
"Katungal Padmanabhan Sasidharan" "Katungal" "Padmanabhan" "Sasidharan"
"Katungal Padmanabhan Sasidharan" "Katungal" "Padmanabhan" "Sasidharan"
"K S Shan" "K" "S" "Shan"
"K S Shan" "K" "S" "Shan"
"K S Shan" "K" "S" "Shan"
"Brij Mohan Sharma" "Brij" "Mohan" "Sharma"
"Rajeev K Sivadas" "Rajeev" "K" "Sivadas"
"Mr. Tim Sunil" "Mr." "Tim" "Sunil"
"C. S. Suresh" "C." "S." "Suresh"
end
What I really want is this:
Code:
input str45 contact_person str14 contact_peron_first str16 contact_person_last_1 str17 contact_person_last_2
Full Name First Last_1 Last_2 Last_3
"K. V Sarathkumar" "K.V" "Sarathkumar"
"Mr. Tim Sunil" "Tim" "Sunil"
"K S Shan" "KS" "Shan"
The problem is that there is no consistent way in which people used initials, honorifics, etc. At the most basic level, I would just want the initials joined together as the first name, and the rest as the last name. Any advice?
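One possible starting point for the rule described above, as a minimal sketch rather than a tested solution. It assumes that an initial is a single ASCII letter with an optional trailing period, that honorifics come from a short hand-written list (Mrs, Mr, Ms, Dr, Prof, all guesses to be extended), and that a name with no leading initials should keep its first word as the first name. The variables `work`, `first`, `last`, `isinit` and `nofirst` are hypothetical names, not part of the data above.

```stata
clear
input str45 contact_person
"K. V Sarathkumar"
"Mr. Tim Sunil"
"K S Shan"
end

* Working copy: trim the ends and collapse repeated internal blanks
generate str45 work = stritrim(strtrim(contact_person))

* Drop one leading honorific (assumed list, case-sensitive; extend as needed)
replace work = strtrim(regexr(work, "^(Mrs|Mr|Ms|Dr|Prof)\.? ", ""))

* Peel leading initials off one at a time and glue them together
generate str20 first = ""
local more 1
while `more' {
    * Flag rows whose first word is an initial and is not the only word left
    generate byte isinit = regexm(word(work, 1), "^[A-Za-z]\.?$") & wordcount(work) > 1
    replace first = first + word(work, 1) if isinit
    replace work  = strtrim(substr(work, strlen(word(work, 1)) + 1, .)) if isinit
    * Stop once no row has a leading initial left
    count if isinit
    local more = r(N) > 0
    drop isinit
}

* No leading initials found: fall back to the first word as the first name
generate byte nofirst = first == "" & wordcount(work) > 1
replace first = word(work, 1) if nofirst
replace work  = strtrim(substr(work, strlen(first) + 1, .)) if nofirst
drop nofirst

rename work last
list contact_person first last
```

On the three example rows this should give "K.V" / "Sarathkumar", "Tim" / "Sunil" and "KS" / "Shan". A name such as "Rajeev K Sivadas" would end up with "Rajeev" as the first name and "K Sivadas" as the last name, so middle initials would still need a separate rule.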