I have two datasets that I am trying to merge. In Dataset 1 I have participant ID's (variable name is Participant_ID) that are a concatenation of first initial, last initial and date of birth in the format of two digit month, two digit day, and 4 digit year without any slashes or dashes in between; see the first dataex below. For example, someone born on January 1st, 2001 with initials AA would be AA01012001 in this dataset. However, for dataset 1 I do NOT have separate variables for first intial, last initial, and date of birth.
In Dataset 2, I have separate variables for First_Initial, Last_Initial and DOB (date of birth). The format of the data in DOB is not consistent. Some entries have month as 1 digit while other entries have month as 2 digits. For example, some might have January as just a "1" while others may have it as "01" (see the first and second observations in the dataex for Dataset 2 below). The same thing is true for the day of the month. To create a participant ID, I will need to concatenate the three variables in a way that the format matches the format of dataset 1.
Due to the inconsistencies in how the dates of birth are entered in dataset 2, I am stumped regarding how to generate a Participant_ID variable from dataset 2 that will allow me to merge with dataset 1 and I appreciate any advice on this matter.
Dataset 1 example:
Dataset 2:
Best,
Alyssa Beavers
In Dataset 2, I have separate variables for First_Initial, Last_Initial and DOB (date of birth). The format of the data in DOB is not consistent. Some entries have month as 1 digit while other entries have month as 2 digits. For example, some might have January as just a "1" while others may have it as "01" (see the first and second observations in the dataex for Dataset 2 below). The same thing is true for the day of the month. To create a participant ID, I will need to concatenate the three variables in a way that the format matches the format of dataset 1.
Due to the inconsistencies in how the dates of birth are entered in dataset 2, I am stumped regarding how to generate a Participant_ID variable from dataset 2 that will allow me to merge with dataset 1 and I appreciate any advice on this matter.
Dataset 1 example:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str10 Participant_ID "AW01032009" "BB01142013" "WZ05082012" "ZZ01072011" "CW12052011" end
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str10 DOB str13 First_Initial str12 Last_Initial "01/03/2009" "A" "W" "1/14/2013" "B" "B" "05/8/2012" "W" "Z" "1/7/2011" "Z" "Z" "12/05/2011" "C" "W" end
Alyssa Beavers
Comment