  • Grouping Households in Panel Data Based on Identifiers

    Hi All,

    I am trying to follow households over time in a panel dataset. Each individual has a unique ID, and within a given year the members of a household all share the same "intnumber". The family ID (famid) indicates that the family units come from the same family tree, but not necessarily that they live in the same household.

    I would like to create a variable that uniquely identifies each household over time. In the example below, two separate households come from the same famid (#4). One contains members (shown by ID) 4003, 4031, and 4173; the other contains members 4006, 4032, and 4170. I would like a new variable that labels the first as household #1 and the second as household #2.

    Please let me know if this was explained in a clear way. Many thanks in advance!


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float ID int year float(famid intnumber Head Spouse Child)
    4003 1986 4   595 1 . .
    4031 1986 4   595 . . 1
    4173 1986 4   595 . 1 .
    4003 1985 4   645 1 . .
    4031 1985 4   645 . . 1
    4173 1985 4   645 . 1 .
    4003 1990 4   983 1 . .
    4031 1990 4   983 . . 1
    4173 1990 4   983 . 1 .
    4003 1988 4  1081 1 . .
    4031 1988 4  1081 . . 1
    4173 1988 4  1081 . 1 .
    4006 1988 4  1216 . 1 .
    4032 1988 4  1216 . . 1
    4170 1988 4  1216 1 . .
    4003 1989 4  1225 1 . .
    4031 1989 4  1225 . . 1
    4173 1989 4  1225 . 1 .
    4006 1986 4  1310 . 1 .
    4032 1986 4  1310 . . 1
    4170 1986 4  1310 1 . .
    4006 1985 4  1344 . 1 .
    4032 1985 4  1344 . . 1
    4170 1985 4  1344 1 . .
    4006 1990 4  1757 . 1 .
    4032 1990 4  1757 . . 1
    4170 1990 4  1757 1 . .
    4003 1984 4  2332 1 . .
    4006 1987 4  2332 . 1 .
    4031 1984 4  2332 . . 1
    4032 1987 4  2332 . . 1
    4170 1987 4  2332 1 . .
    4173 1984 4  2332 . 1 .
    4003 1987 4  2435 1 . .
    4031 1987 4  2435 . . 1
    4173 1987 4  2435 . 1 .
    4003 1992 4  2482 1 . .
    4031 1992 4  2482 . . 1
    4173 1992 4  2482 . 1 .
    4006 1984 4  3142 1 . .
    4032 1984 4  3142 . . 1
    4006 1989 4  3235 . 1 .
    4032 1989 4  3235 . . 1
    4170 1989 4  3235 1 . .
    4003 1991 4  4379 1 . .
    4031 1991 4  4379 . . 1
    4173 1991 4  4379 . 1 .
    4003 1993 4  4729 1 . .
    4031 1993 4  4729 . . 1
    4173 1993 4  4729 . 1 .
    4003 1994 4 11733 1 . .
    4031 1994 4 11733 . . 1
    4173 1994 4 11733 . 1 .
    end

    Last edited by Cora Touchstone; 20 Jan 2022, 15:21.

  • #2
    Perhaps I'm missing something obvious, but I do not see how you reached the conclusion that 4003, 4031, 4173 form one household and 4006, 4032, and 4170 another one. What is it about their data that leads you to that?

    Putting it another way, if you had to do this by hand, how would you do it? What variables would you look at, and what about them would guide your identification of households?


    • #3
      Hi Clyde,

      This is my thinking: I believe 4003, 4031, and 4173 are in the same household because they have the same famid (family ID) and the same intnumber (interview number) within the same year. In 1986, for example, 4003, 4031, and 4173 all participated in the same household survey, as shown by intnumber 595. The intnumber changes each year because it is assigned in the order in which surveys from all families are received.

      4006, 4032, and 4170 have the same famid as 4003, 4031, and 4173, but in 1986 they share a different intnumber, namely 1310.
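
      A minimal sketch of that rule in Stata, assuming the variable names from the -dataex- example in #1. The name hh_year is hypothetical, and the result identifies households only within a year, not across years:

      ```stata
      * Households within a year: same famid and same intnumber in the same year.
      * year is part of the key because intnumber is reassigned each wave
      * (in the example, 2332 appears in 1984 and again in 1987 for different people).
      egen long hh_year = group(famid year intnumber)

      * List the members of each year-specific household to check the grouping by eye.
      sort famid year intnumber ID
      list year intnumber ID Head Spouse Child, sepby(year intnumber) noobs
      ```

      Because hh_year is numbered separately for every famid-year-intnumber combination, its values cannot be compared from one year to the next.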


      • #4
        Your data is organized very much like the US Panel Study of Income Dynamics (PSID).

        You may benefit from this earlier discussion on Statalist of the futility of trying to create a household identifier that persists across time. The discussion is in the setting of the PSID, but the lessons are not specific to the PSID.

        https://www.statalist.org/forums/for...a-household-id


        • #5
          OK. But I can foresee complications with this. A Child in one household may become independent and move away, perhaps even marrying and having a child of his/her own by the time of the next interview. It could get even more complicated: in that same scenario, the spouse might come from yet another household in the survey. I'm not sure what the best way to handle situations like that is. Households are fluid, and their composition can change over a timespan such as the one you show.
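
          One way to see how much of this fluidity is present is sketched below, assuming the variable names from the -dataex- example in #1. The names hh_anchor, hh_changed, hh_size, and hh_id are hypothetical, and labelling a household by its smallest member ID is an assumption that breaks down as soon as that member moves out. Even in the posted example composition is not fixed: 4170 is absent from the 1984 household of 4006 and 4032 and first appears in 1985.

          ```stata
          * Label each year-specific household by its smallest member ID.
          * Assumption: this label persists only while that member stays put.
          egen double hh_anchor = min(ID), by(famid year intnumber)

          * Flag person-years where the label differs from the person's previous wave.
          bysort ID (year): gen byte hh_changed = hh_anchor != hh_anchor[_n-1] if _n > 1
          tabulate hh_changed, missing

          * Household size by wave, to spot members joining or leaving.
          bysort famid year intnumber: gen int hh_size = _N

          * Only if no changes are flagged is a time-invariant identifier meaningful.
          egen long hh_id = group(famid hh_anchor)
          ```

          If hh_changed is ever 1, the cases described above are present and hh_id should not be read as one household followed over time.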

          Added: Crossed with #4
