Hello good people,
I have a data set from a survey. Sampling was done by clusters: each randomly sampled cluster contains 10 households, which were again chosen at random. The problem is that while the clusters are uniquely numbered from 1 to the last, the households are not. They are numbered 1-10 within each cluster, so the dataset has as many households numbered 1 as there are clusters (each cluster has a household numbered 1). This makes it hard to uniquely identify households and even harder to merge different sections of the dataset. Can anyone give me some starting advice that I can build on to create a unique identifier for each household (hh)?
Thanks
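
To make the question concrete, here is a minimal sketch of the kind of identifier being asked for, written in Python only for illustration since the post does not say which software is used. The field names `cluster`, `hh` and `hh_id` and the helper `make_hh_id` are hypothetical, and the sample rows are made up. The idea is that the pair (cluster number, household number within the cluster) is already unique, so the two can be combined into a single number.

```python
# Hypothetical records: cluster number plus household number within the cluster.
rows = [
    {"cluster": 1, "hh": 1},
    {"cluster": 1, "hh": 2},
    {"cluster": 2, "hh": 1},
    {"cluster": 2, "hh": 10},
]


def make_hh_id(cluster, hh, width=2):
    """Combine cluster and within-cluster household numbers into one ID.

    width is the number of digits reserved for the household number;
    2 is enough when households are numbered 1-10 within each cluster.
    """
    if not 1 <= hh < 10 ** width:
        raise ValueError("household number does not fit in the reserved digits")
    return cluster * 10 ** width + hh


# Attach the combined ID to every record.
for row in rows:
    row["hh_id"] = make_hh_id(row["cluster"], row["hh"])

hh_ids = [row["hh_id"] for row in rows]

# The combined ID must be unique, otherwise merges would mismatch households.
assert len(hh_ids) == len(set(hh_ids))
```

If the same ID is built in every section of the dataset, the sections could then be merged on it. Merging directly on the pair (cluster, household) should work equally well without creating a new variable.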