How to randomly select one individual from a household dataset?

Jasmine Zhao

Join Date: Mar 2024

Posts: 3
#1

06 Mar 2024, 00:45

I have a household dataset containing the household ID, the parents' IDs, and each child's ID. I would like to randomly select one child from each household. How can I do this?
The dataset looks like this:

hhid pid_m pid_f pid_c1 pid_c2 pid_c3 pid_c4 pid_c5 pid_c6

I want to randomly select one pid_c* from each family.
PS: Some pid_c* values are missing when a family has fewer children. How can I do this while taking the missing values into account?

Thank you!

Last edited by Jasmine Zhao; 06 Mar 2024, 01:14.
Joseph Coveney

Join Date: Apr 2014

Posts: 4352
#2

06 Mar 2024, 02:29

Originally posted by Jasmine Zhao

. . . I want to select one pid_c* from each family randomly . . . taking the missing values into account

Assuming that the first line of code below works, then you could do something like the following.

Code:

reshape long pid_c, i(hhid) j(seq)
drop if missing(pid_c)
generate double randu = runiform()
isid hhid randu, sort
by hhid: keep if _n == 1

Otherwise, you could modify the code to include pid_m and pid_f as part of a compound primary key with hhid, as in the following.

Code:

reshape long pid_c, i(hhid pid_m pid_f) j(seq)
drop if missing(pid_c)
generate double randu = runiform()
isid hhid randu, sort
by hhid: keep if _n == 1
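
For anyone who wants to try the approach before running it on real data, here is a minimal sketch on a made-up dataset. The IDs, the three child columns, and the seed value are all hypothetical and chosen only for illustration; the steps themselves are the ones from the first snippet.

```stata
* Toy data (hypothetical IDs) in the same wide layout as the question,
* cut down to three child columns; "." marks a missing child ID
clear
input hhid pid_m pid_f pid_c1 pid_c2 pid_c3
1 101 102 103 104 105
2 201 202 203 .   .
3 301 302 303 304 .
end

* Arbitrary seed, used here only so that the sketch is reproducible
set seed 20240306

* One row per household-child slot, then drop the empty slots
reshape long pid_c, i(hhid) j(seq)
drop if missing(pid_c)

* Random order within household; keep the first child in that order
generate double randu = runiform()
isid hhid randu, sort
by hhid: keep if _n == 1

* Each household should now appear exactly once
isid hhid
list hhid pid_m pid_f seq pid_c, noobs
```

Because the empty child slots are dropped before the random numbers are drawn, a household with a single child always keeps that child, and every non-missing child in a larger household has the same chance of being kept.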
Jasmine Zhao

Join Date: Mar 2024

Posts: 3
#3

06 Mar 2024, 02:52

Good idea to reshape the dataset!
Thank you! I'll try the code.
Joseph Coveney

Join Date: Apr 2014

Posts: 4352
#4

07 Mar 2024, 01:47

Originally posted by Jasmine Zhao

Thank you! I'll try the code.

You're welcome. Don't forget to set the random-number generator seed, something that I neglected to include in the code snippets above.
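
A minimal sketch of why the seed matters, assuming nothing beyond built-in commands. The seed value 12345 and the variable names u1 and u2 are arbitrary and used only for illustration.

```stata
* Five empty observations to hold the draws
clear
set obs 5

* Same seed before each call, so both variables get the same draws
set seed 12345
generate double u1 = runiform()

set seed 12345
generate double u2 = runiform()

* Passes only if the two sets of draws are identical
assert u1 == u2
```

In the snippets above, the set seed line would go somewhere before generate double randu = runiform(). The draws are assigned row by row, so the observations also need to be in the same order each time for the same child to be selected on a rerun.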
Jasmine Zhao

Join Date: Mar 2024

Posts: 3
#5

12 Mar 2024, 10:57

Originally posted by Joseph Coveney

You're welcome. Don't forget to set the random-number generator seed, something that I neglected to include in the code snippets above.

Sorry, I just saw this message! I have set the seed.