Hi! As a novice Stata user, I am so glad that I found this forum, and I sincerely appreciate any input or help in advance. I am using longitudinal (panel) data with 12 waves. I want to randomly choose one member of each household and use that person's 12 waves of data for my longitudinal analyses, even though the survey design can include more than one person from the same household. I tried to randomly select one member per household with the syntax below and got the results shown after it.
<SYNTAX>
set seed 12345
gen random = uniform()
bysort hhid (random) : gen byte select = _n == 1
sort hhidpn wave
**hhidpn is a unique id for participants, hhid is the household id, and pn is the person number**
<RESULTS AS simplified examples>
hhidpn wave hhid pn select
3010 1 3 10 0
3010 2 3 10 0
3010 3 3 10 0
3010 4 3 10 1
3010 5 3 10 0
3010 6 3 10 0
3010 7 3 10 0
3010 8 3 10 0
3010 9 3 10 0
3010 10 3 10 0
3010 11 3 10 0
3010 12 3 10 0
3020 1 3 20 0
3020 2 3 20 0
3020 3 3 20 0
3020 4 3 20 0
3020 5 3 20 0
3020 6 3 20 0
3020 7 3 20 0
3020 8 3 20 0
3020 9 3 20 0
3020 10 3 20 0
3020 11 3 20 0
3020 12 3 20 0
So, in this simplified example, both 3010 and 3020 (hhidpn) are from the same household (hhid 3), but only hhidpn 3010 has been selected (select = 1, and only on its wave 4 row). I would like to use all of 3010's variables from all 12 waves in my analyses.
In this case, how can I keep and use all 12 waves of data for only the randomly selected hhidpn (such as 3010) in my longitudinal data set?
It may be a simple question, but I am still confused even after looking through previous posts in the forum. Any advice would be appreciated!
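For reference, here is a minimal sketch of one approach that might do this. It assumes the random draw should be made once per person rather than once per row, so that the flag is the same on all 12 of a person's rows. The toy data only mirrors the simplified example above, and the helper names first and u are hypothetical, not part of the original data.

```stata
* Sketch only: toy data mirroring the example (household 3, persons 3010 and 3020)
clear
set seed 12345
set obs 24
gen long hhid = 3
gen int pn = cond(_n <= 12, 10, 20)
gen long hhidpn = hhid*1000 + pn
gen byte wave = mod(_n - 1, 12) + 1

* Draw ONE random number per person: tag a single row per hhidpn and draw there
egen byte first = tag(hhidpn)
gen double u = runiform() if first

* Copy that draw to every row of the same person (missing values sort last)
bysort hhidpn (u): replace u = u[1]

* Within each household, flag every row of the person with the smallest draw
bysort hhid (u hhidpn wave): gen byte select = hhidpn == hhidpn[1]

* Keep all waves of the selected person only
keep if select
drop first u
sort hhidpn wave
```

On real data the first block (clear through gen byte wave) would be skipped, and any select or random variable left over from an earlier attempt would need to be dropped first, since gen will not overwrite an existing variable.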