Dear Stata users,
I'm trying to build a balanced panel dataset covering three waves (2013, 2016, and 2019) from the Integrated Household Panel Dataset (LSMS data).
The problem is that households split over time, so a new hhid is assigned in every wave.
I don't know how to handle the changing hhid when merging: the unique ID has 7 digits in the form 0000-000, and the first 4 digits are renumbered in every wave.
For y2 (2013) -> y3 (2016), my idea is to merge on the original HHID, but for y3 (2016) -> y4 (2019), how can I merge the data?

For example, in the case above, the y2_hhid of this household was 0005-0001. It split into two households in y3 (A: 0003-001, B: 0003-003).
In y4, household A from y3 became three households (A: 0002-001, C: 0002-005, D: 0002-006), while household B was maintained (B: 0003-003).
When merging the y3 and y4 household data, which criterion should I use: hhid, HHID, or a newly generated variable? Could anyone share their ideas?
It would be really helpful for future work. This dataset is publicly available, so please refer to the attachment.
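To make the question concrete, here is a minimal sketch of the kind of merge I have in mind, using toy data that mirrors the example. It assumes that the y4 file carries a variable holding the y3 ID of the household each y4 household came from. I have not confirmed that this is how the LSMS files are organized, and the names y3_hhid, y4_hhid, and hh_label are placeholders for illustration only.

```stata
* Toy y3 data: households A and B from the example.
clear
input str8 y3_hhid str1 hh_label
"0003-001" "A"
"0003-003" "B"
end
tempfile y3
save `y3'

* Toy y4 data. ASSUMPTION: each y4 household records the y3_hhid
* of the household it originated from (placeholder variable names).
clear
input str8 y4_hhid str8 y3_hhid
"0002-001" "0003-001"
"0002-005" "0003-001"
"0002-006" "0003-001"
"0003-003" "0003-003"
end

* Several y4 households can trace back to one y3 household,
* so the merge is many-to-one from the y4 side.
merge m:1 y3_hhid using `y3'
tab _merge
list, sepby(y3_hhid)
```

If something like this is possible, each y3 household that split would appear more than once in y4, so I would still need a rule for which y4 household to keep in order to get a balanced panel.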
Best regards,
Suyeon Ro