  • How to randomly select one individual from a household dataset?

    I have a household dataset with the household ID, the parents' IDs, and each child's ID. I want to randomly select one child from each household. How can I do this?
    The dataset is like this:

    hhid pid_m pid_f pid_c1 pid_c2 pid_c3 pid_c4 pid_c5 pid_c6

    And I want to select one pid_c* from each family randomly.
    PS: Some pid_c* values are missing when the family has fewer children. How can I do this while taking the missing values into account?

    Thank you!
    Last edited by Jasmine Zhao; 06 Mar 2024, 01:14.

  • #2
    Originally posted by Jasmine Zhao
    . . . I want to select one pid_c* from each family randomly . . . taking the missing values into account
    Assuming that the first line of code below works (that is, hhid uniquely identifies the observations), you could do something like the following.
    Code:
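    * Reshape to long format: one row per child slot within each household
    reshape long pid_c, i(hhid) j(seq)
    * Drop child slots that are empty
    drop if missing(pid_c)
    * Assign each child a uniform random number
    generate double randu = runiform()
    * Confirm there are no ties within a household, and sort by hhid and randu
    isid hhid randu, sort
    * Keep the child with the smallest random number in each household
    by hhid: keep if _n == 1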
    Otherwise, you could modify the code to include pid_m and pid_f as part of a compound primary key with hhid, as in the following.
    Code:
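    * Reshape to long format, identifying rows by household and both parents
    reshape long pid_c, i(hhid pid_m pid_f) j(seq)
    * Drop child slots that are empty
    drop if missing(pid_c)
    * Assign each child a uniform random number
    generate double randu = runiform()
    * Confirm there are no ties within a household, and sort by hhid and randu
    isid hhid randu, sort
    * Keep the child with the smallest random number in each household
    by hhid: keep if _n == 1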


    • #3
      Good idea to reshape the dataset!
      Thank you! I'll try the code.


      • #4
        Originally posted by Jasmine Zhao
        Thank you! I'll try the code.
        You're welcome. Don't forget to set the random-number generator seed, something that I neglected to include in the code snippets above.
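        For illustration, a minimal sketch of the first snippet with the seed set up front might look like the following. The seed value 20240306 is an arbitrary placeholder and not something taken from the posts above; any fixed integer should make the draw reproducible, provided the data are in the same sort order each time the code is run.

        ```stata
        * Hypothetical seed value chosen only for illustration
        set seed 20240306
        * Reshape to long format: one row per child slot within each household
        reshape long pid_c, i(hhid) j(seq)
        * Drop child slots that are empty
        drop if missing(pid_c)
        * Assign each child a uniform random number
        generate double randu = runiform()
        * Sort by hhid and randu, then keep the first child in each household
        isid hhid randu, sort
        by hhid: keep if _n == 1
        ```

        After reshape long, the data are sorted by hhid and seq, so the same seed assigns the same random numbers to the same children on every run.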


        • #5
          Originally posted by Joseph Coveney
          You're welcome. Don't forget to set the random-number generator seed, something that I neglected to include in the code snippets above.
          Sorry, I just saw this message! I have set the seed.
