Hello,
I merged two datasets on several key variables, and about 85% of the cases matched. Values for some of those variables are incorrect in the using dataset (_merge==2), so I am trying to do a second merge after dropping the matched cases (_merge==3), this time using only the smaller subset of variables `year month day hour', which are accurate. On those four variables there are duplicates among the _merge==1 cases, but not among the _merge==2 cases or among the _merge==1/_merge==2 pairs. So I am trying to create a count indicator that takes the same value for both members of a pair. Based on the example data below (where the merge indicator is stored as merge1), this new count variable would equal:
1
2
3
3
4
4
5
5
6
7
8
9
You can see that it takes the same value for duplicates across merge values (e.g., rows 3 and 4) but not within merge==1 groups (e.g., rows 10 and 11).
I think the start of the command will look something like "bys incidentyear incidentmonth incidentday incidenthour (merge1): " but I can't figure out the rest of the command. Do I use egen group? Or just gen?
Once I create the indicator, I will merge again with it to hopefully merge all the remaining cases from the using dataset.
Thank you very much for your help!
Code:
input merge1 year month day hour
1 2019 1 19 10
1 2019 1 19 15
1 2019 1 20 0
2 2019 1 20 0
1 2019 1 20 14
2 2019 1 20 14
1 2019 1 20 21
2 2019 1 20 21
1 2019 1 21 21
1 2019 1 23 7
1 2019 1 23 7
1 2019 1 23 13
1 2019 1 25 3
end
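One possible way to build the indicator described above, offered as a sketch rather than a definitive answer: number the observations within each year-month-day-hour cell separately for each value of merge1, then let egen's group() function assign one id per combination of cell and sequence number. The variable names seq and pairid are made up for illustration. The sketch assumes merge1 takes only the values 1 and 2 at this point, and that which of two exact duplicates receives which sequence number does not matter.

```stata
clear
input merge1 year month day hour
1 2019 1 19 10
1 2019 1 19 15
1 2019 1 20 0
2 2019 1 20 0
1 2019 1 20 14
2 2019 1 20 14
1 2019 1 20 21
2 2019 1 20 21
1 2019 1 21 21
1 2019 1 23 7
1 2019 1 23 7
1 2019 1 23 13
1 2019 1 25 3
end

* seq (hypothetical name): running number within each date-hour cell,
* counted separately for merge1==1 and merge1==2.
* Ties among exact duplicates are ordered arbitrarily.
bysort year month day hour merge1: generate seq = _n

* pairid (hypothetical name): one id per (date-hour cell, seq) combination,
* so a merge1==1 row and a merge1==2 row in the same cell share an id,
* while duplicates within merge1==1 get different ids.
egen pairid = group(year month day hour seq)

sort pairid merge1
list, sepby(pairid)
```

With the example data, the master/using rows for the same date and hour should end up with the same pairid, while the two merge1==1 rows for 23 January, hour 7 should get different ones. For the second merge itself, it may be simpler to create seq within each dataset separately and merge on `year month day hour seq', since pairid is only defined once both sets of rows are in the same file.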