Partner's information in household dataset

Anna Cusimano

Join Date: Jul 2024

Posts: 10
#1

20 Dec 2024, 04:02

Hi everyone,
I am working with a survey in which all adult members of the household participated individually. I want to generate a variable that captures the partner’s information (country of birth).
The dataset contains two identifiers: the personal identifier (pid) and the partner’s identifier (parid). The two correspond, so it is possible to link couples. There is also a household identifier (hid). In the example below, we have a couple living in the same household, with a German-born and a foreign-born partner (germborn).

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input long(pid parid hid) byte germborn
101 102 19 1
101 102 19 1
101 102 19 1
101 102 19 1
101 102 19 1
101 102 19 1
102 101 19 0
102 101 19 0
102 101 19 0
102 101 19 0
102 101 19 0
102 101 19 0
end

When parid is missing, it means that the person does not have a partner.
I am not sure how I can generate a variable to capture the partner’s origin.
Thank you!

Nick Cox

Join Date: Mar 2014

Posts: 34925
#2

20 Dec 2024, 04:15

rangestat from SSC can help here. You can search the forum for mentions.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input long(pid parid hid) byte germborn
101    102    19 1
101    102    19 1
101    102    19 1
101    102    19 1
101    102    19 1
101    102    19 1
102    101    19 0
102    101    19 0
102    101    19 0
102    101    19 0
102    101    19 0
102    101    19 0
end 

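* For each observation, summarize germborn over the observations in the same hid
* whose pid lies in the interval [parid, parid], i.e. the partner's own records
rangestat germpartner=germborn, int(pid parid parid) by(hid)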

list, sepby(hid pid)

     +-----------------------------------------+
     | pid   parid   hid   germborn   germpa~r |
     |-----------------------------------------|
  1. | 101     102    19          1          0 |
  2. | 101     102    19          1          0 |
  3. | 101     102    19          1          0 |
  4. | 101     102    19          1          0 |
  5. | 101     102    19          1          0 |
  6. | 101     102    19          1          0 |
     |-----------------------------------------|
  7. | 102     101    19          0          1 |
  8. | 102     101    19          0          1 |
  9. | 102     101    19          0          1 |
 10. | 102     101    19          0          1 |
 11. | 102     101    19          0          1 |
 12. | 102     101    19          0          1 |
     +-----------------------------------------+

The by(hid) option may be redundant, or even a nuisance if partners can be recorded as living in different households.
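If partners can be recorded in different households, the same lookup can be sketched without by(hid). This is only a sketch under an assumption the thread does not state, namely that pid identifies a person uniquely across the whole dataset. The variable name germpartner_any is made up for illustration, and the statistic is written out as (mean); because germborn is constant within a person, mean, min and max all return the same value.

```stata
* Sketch only: assumes pid is unique across households (not stated in the thread)
* and that rangestat is installed (ssc install rangestat)
rangestat (mean) germpartner_any = germborn, interval(pid parid parid)
```

If pid values are reused across households, by(hid) is needed to keep matches inside the household.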


Clyde Schechter

Join Date: Apr 2014

Posts: 29587
#3

20 Dec 2024, 08:51

The advice given by Nick in #2 is excellent, and is also how I would approach this problem. But, if you are in a situation where you can't or won't install user-written commands, there is another way to do this using only native Stata commands:

Code:

frame put hid pid parid germborn, into(partners)
frame partners: duplicates drop
frlink m:1 hid parid, frame(partners hid pid)
frget germborn, from(partners) prefix(partner_)
drop partners
frame drop partners

This approach is worth knowing about in any case because there are cross-referencing situations like this that -rangestat- cannot handle, such as when the id variables are non-numeric or when the identification of the partner depends on multiple variables, not just one.

Nick's remark about the role of hid applies equally here.
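As a rough illustration of the non-numeric case mentioned above, the same frames approach can be sketched with string identifiers. The data below are hypothetical (the ids "p101", "p102", "p103" do not come from the original post), and the sketch assumes Stata 16 or later, where frames are available. Here the frget newvar = varname form is used instead of prefix(); the name partner_germborn is chosen for illustration.

```stata
* Hypothetical data: string ids, one row per person; "p103" has no partner
clear
input str4 pid str4 parid long hid byte germborn
"p101" "p102" 19 1
"p102" "p101" 19 0
"p103" ""     20 1
end

* Copy the lookup columns into a second frame (already one row per hid-pid here)
frame put hid pid germborn, into(partners)

* Match each person's (hid, parid) to the partner's (hid, pid)
frlink m:1 hid parid, frame(partners hid pid)

* Pull the partner's value under a new name
frget partner_germborn = germborn, from(partners)

* Tidy up the link variable and the helper frame
drop partners
frame drop partners

list, sepby(hid)
```

Observations with no match in the linked frame, such as a person with an empty parid, should be left with a missing value in the new variable.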

Anna Cusimano

Join Date: Jul 2024

Posts: 10
#4

20 Dec 2024, 12:17

I tried the first suggestion and it worked. Now I will check the second. Thank you so much for your quick and helpful responses!