  • Drop survey families where parents are not in specific survey rounds

    Hi all,

    I would like to drop any family (as shown by their famid) when the "Head" and "Spouse" do not have data in the years 1984 and 1989. So, I would like to drop all given members with the same famid if there is no data for the Head and Spouse in those years. I would like to keep families when this is not the case. What is an efficient way to do this? I have tried using egen and _N, but am not sure how to specify just the years 1984 and 1989. Below is an example of a family I would like to drop, which includes a Head, Spouse, and three Children.

    Thank you!
    Cora

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float ID int year float(famid Head Spouse Child)
    6004 1985 6 . 1 .
    6004 1986 6 . 1 .
    6004 1993 6 . 1 .
    6030 1985 6 . . 1
    6030 1986 6 . . 1
    6030 1993 6 . . 1
    6031 1985 6 . . 1
    6031 1986 6 . . 1
    6031 1993 6 . . 1
    6170 1985 6 1 . .
    6170 1986 6 1 . .
    6170 1993 6 1 . .
    6171 1985 6 . . 1
    6171 1986 6 . . 1
    6171 1993 6 . . 1
    end

  • #2
    Code:
    g keepers = cond(Head==1 & Spouse==1 & (year==1984 | year==1989), 1, 0, .)
    And, in the future, when you provide data, make sure the conditions you are interested in are included.


    • #3
      Your example data isn't very helpful because nobody in the example has any data at all for 1984 or 1989. I'm also uncertain how you want to handle a situation where a Head or Spouse, but not both, has data in those years. I'll assume you want to exclude them. I believe this works:

      Code:
      * For each round, flag families that have a Head row and a Spouse row
      foreach y in 1984 1989 {
          by famid, sort: egen byte head`y' = max(year == `y' & !missing(Head))
          by famid: egen byte spouse`y' = max(year == `y' & !missing(Spouse))
      }

      keep if head1984 & spouse1984 & head1989 & spouse1989


      • #4
        Thank you for the quick response!

        I assume I should drop any observations when "keepers" equals 0? The only problem is, keepers is "0" for this family example below (which I would like to keep).

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float ID int year float(famid Head Spouse Child keepers)
        1959001 1984 1959 1 . . 0
        1959001 1985 1959 1 . . 0
        1959001 1986 1959 1 . . 0
        1959001 1987 1959 1 . . 0
        1959001 1988 1959 1 . . 0
        1959001 1989 1959 1 . . 0
        1959002 1984 1959 . 1 . 0
        1959002 1985 1959 . 1 . 0
        1959002 1986 1959 . 1 . 0
        1959002 1987 1959 . 1 . 0
        1959002 1988 1959 . 1 . 0
        1959002 1989 1959 . 1 . 0
        1959005 1984 1959 . . 1 0
        1959005 1985 1959 . . 1 0
        1959005 1986 1959 . . 1 0
        1959005 1987 1959 . . 1 0
        1959005 1988 1959 . . 1 0
        1959005 1989 1959 . . 1 0
        end

        Is there anything else to be done?

        Thanks again!
        Cora


        • #5
          Hi Clyde Schechter,

          Thanks for the reply! I can provide an example of both below (I would like to keep famid==1951 and drop famid==6). Additionally, I would like to drop families where a Head or Spouse has data for 1984 and 1989, but not both. Thanks again!

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float ID int year float(famid Head Spouse Child)
             6004 1985    6 . 1 .
             6004 1986    6 . 1 .
             6004 1993    6 . 1 .
             6030 1985    6 . . 1
             6030 1986    6 . . 1
             6030 1993    6 . . 1
             6031 1985    6 . . 1
             6031 1986    6 . . 1
             6031 1993    6 . . 1
             6170 1985    6 1 . .
             6170 1986    6 1 . .
             6170 1993    6 1 . .
             6171 1985    6 . . 1
             6171 1986    6 . . 1
             6171 1993    6 . . 1
          1959001 1984 1959 1 . .
          1959001 1985 1959 1 . .
          1959001 1986 1959 1 . .
          1959001 1987 1959 1 . .
          1959001 1988 1959 1 . .
          1959001 1989 1959 1 . .
          1959002 1984 1959 . 1 .
          1959002 1985 1959 . 1 .
          1959002 1986 1959 . 1 .
          1959002 1987 1959 . 1 .
          1959002 1988 1959 . 1 .
          1959002 1989 1959 . 1 .
          1959005 1984 1959 . . 1
          1959005 1985 1959 . . 1
          1959005 1986 1959 . . 1
          1959005 1987 1959 . . 1
          1959005 1988 1959 . . 1
          1959005 1989 1959 . . 1
          end


          • #6
            First, my response in #3 crossed with #2.

            #5 clears up my uncertainty about handling cases where Head or Spouse data, but not both, are present--in the opposite direction of what I had assumed. The following should handle all cases correctly:

            Code:
            by famid, sort: egen byte OK84 = max(cond(year == 1984, !missing(Head) | !missing(Spouse), 0))
            by famid: egen byte OK89 = max(cond(year == 1989, !missing(Head) | !missing(Spouse), 0))
            keep if OK84 & OK89
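
To see how this kind of rule behaves, here is a minimal sketch on made-up toy data (three invented families, not Cora's file), using the 1984 and 1989 rounds named in the original question. It relies on the fact that, within a by-group, egen's max() of a 0/1 expression equals 1 if any row satisfies the expression; the expression is written without cond() but should be equivalent.

```stata
* Made-up toy data (hypothetical IDs and families): one row per person-year.
clear
input float ID int year float(famid Head Spouse Child)
11 1984 1 1 . .
11 1989 1 1 . .
12 1984 1 . 1 .
12 1989 1 . 1 .
21 1984 2 1 . .
22 1989 2 . 1 .
23 1984 2 . . 1
31 1985 3 1 . .
32 1985 3 . 1 .
33 1989 3 . . 1
end

* max() of a 0/1 expression within famid acts as an "any row" test.
by famid, sort: egen byte OK84 = max(year == 1984 & (!missing(Head) | !missing(Spouse)))
by famid: egen byte OK89 = max(year == 1989 & (!missing(Head) | !missing(Spouse)))

* Families 1 and 2 have a Head or Spouse row in both rounds; family 3 does not.
keep if OK84 & OK89
tabulate famid
```

Note that toy family 2 is kept even though its Head appears only in 1984 and its Spouse only in 1989: the rule asks only for a Head or Spouse row in each round, not for both people in both rounds.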


            • #7
              Clyde Schechter,

              This worked perfectly for me! Many thanks.

              Best,
              Cora
