I'm looking for help with observations that are duplicates on only two variables.
I started with duplicates list and then duplicates drop to remove duplicates, but when I tried to reshape the longitudinal data from long to wide form, Stata gave me this error message:
values of variable year not unique within id
Your data are currently long. You are performing a reshape wide. You specified i(id) and j(year). There are observations within
i(id) with the same value of j(year). In the long data, variables i() and j() together must uniquely identify the observations.
I ran reshape error, which produced a long list of observations that are duplicates on id and year only, but I can't figure out how to remove them. The dataset is large, so the full list isn't displayed and I can't drop them individually. I would also like to inspect these duplicates to see how they differ on the other variables, since the original duplicates drop did not remove them.
Any suggestions?
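For anyone reading along, here is a minimal sketch of one way such duplicates might be counted and inspected. It assumes the variables are named id and year, as in the error message; dup is a hypothetical name for a new variable. Running duplicates drop with no variable list removes only observations that are identical on every variable, which would explain why these remained.

```stata
* Assumption: the identifier variables are id and year (taken from the error message).
* dup is a hypothetical new variable name, used here only for illustration.

* Count how many observations share the same id-year combination
duplicates report id year

* Flag them: dup = 0 for unique id-year pairs, > 0 for duplicated ones
duplicates tag id year, generate(dup)

* View the flagged observations side by side to compare the other variables
sort id year
list if dup > 0, sepby(id year)
```

Once it is clear which observation in each group should be kept, duplicates drop id year, force would remove the extras, but it keeps only the first observation in each id-year group, so it is probably safer to inspect the differences first.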