Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Partner variable based on a relational column in Stata? (very simple, ashamed to ask)

    Hey there,

    A straightforward question, I can't believe I spend so much time figuring this out.

    This is the data

    clear
    input long(b1_01_personid b1_02_householdid) byte(l1_00_9_personnumber l1_05_3_spousenumber)
    7433071 15862 1 2
    7433072 15862 2 1
    7433073 15862 3 -1
    7434328 15865 3 -1
    7434329 15865 1 2
    7434330 15865 2 1
    7434331 15865 4 -1
    7432840 41911 1 -1
    7432841 41911 2 -1
    7432842 41911 3 -1
    7433365 41912 2 3
    7433366 41912 1 -1
    7433367 41912 3 2
    7434172 41914 3 -1
    7434173 41914 1 2
    7434174 41914 4 -1
    7434175 41914 2 1
    7435341 41920 3 -1
    7435342 41920 6 5
    7435343 41920 1 -1
    7435344 41920 5 6
    7435345 41920 2 -1
    7435346 41920 4 -1


    b1_01_personid - person id
    b1_02_householdid - household id (persons are inside households)
    l1_00_9_personnumber - the person's number within a household (starts with 1 for all households)
    1_05_3_spousenumber - the person with this number is a spouse


    Take, for example, household 15862
    b1_01_personid b1_02_householdid l1_00_9_personnumber l1_05_3_spousenumber
    7433071 15862 1 2
    7433072 15862 2 1
    7433073 15862 3 -1
    it has 3 individuals, And 7433071 is a spouse of 7433072 and verse versa. So I need to construct part_id
    b1_01_personid b1_02_householdid l1_00_9_personnumber l1_05_3_spousenumber part_id
    7433071 15862 1 2 7433072
    7433072 15862 2 1 7433071
    7433073 15862 3 -1 .
    Last edited by Sergey Alexeev; 17 Jan 2022, 00:09.
    Kind regards,
    Sergey Alexeev | ​The University of Sydney
    https://alexeev.pw/

  • #2
    Code:
    isid b1_02_householdid l1_00_9_personnumber
    mvdecode l1_05_3_spousenumber, mv(-1)
    
    frame copy default spouses
    frame spouses: drop if missing(l1_05_3_spousenumber)
    frlink 1:1 b1_02_household l1_00_9_personnumber, ///
        frame(spouses b1_02_household l1_05_3_spousenumber)
    frget part_id = b1_01_personid, from(spouses)
    Notes:

    1. Requires version 16 or later.
    2. The data has l1_05_3_spousenumber coded as -1 for those who have no spouse. This is not a good way to handle this in Stata.The code above replaces those -1's with Stata's system missing value. This change not only simplifies the code shown here, but will almost surely make your life easier in other things you do with this data.

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      Code:
      isid b1_02_householdid l1_00_9_personnumber
      mvdecode l1_05_3_spousenumber, mv(-1)
      
      frame copy default spouses
      frame spouses: drop if missing(l1_05_3_spousenumber)
      frlink 1:1 b1_02_household l1_00_9_personnumber, ///
      frame(spouses b1_02_household l1_05_3_spousenumber)
      frget part_id = b1_01_personid, from(spouses)
      Notes:

      1. Requires version 16 or later.
      2. The data has l1_05_3_spousenumber coded as -1 for those who have no spouse. This is not a good way to handle this in Stata.The code above replaces those -1's with Stata's system missing value. This change not only simplifies the code shown here, but will almost surely make your life easier in other things you do with this data.
      Dear Professor Clyde Schechter,

      Thank you for your prompt reply. Let me try to implement your solution and come back to you.
      Have a great day.

      Regards,
      Sergey
      Kind regards,
      Sergey Alexeev | ​The University of Sydney
      https://alexeev.pw/

      Comment


      • #4
        Originally posted by Clyde Schechter View Post
        Code:
        isid b1_02_householdid l1_00_9_personnumber
        mvdecode l1_05_3_spousenumber, mv(-1)
        
        frame copy default spouses
        frame spouses: drop if missing(l1_05_3_spousenumber)
        frlink 1:1 b1_02_household l1_00_9_personnumber, ///
        frame(spouses b1_02_household l1_05_3_spousenumber)
        frget part_id = b1_01_personid, from(spouses)
        Notes:

        1. Requires version 16 or later.
        2. The data has l1_05_3_spousenumber coded as -1 for those who have no spouse. This is not a good way to handle this in Stata.The code above replaces those -1's with Stata's system missing value. This change not only simplifies the code shown here, but will almost surely make your life easier in other things you do with this data.
        The code is valid. Thank you. I am very glad I posted a question here. I now know "frame." I used to merge datasets as an ancient man.
        Kind regards,
        Sergey Alexeev | ​The University of Sydney
        https://alexeev.pw/

        Comment


        • #5
          Dear Professor Clyde Schechter,


          I now have a different variation of the same problem. I was hoping you could demonstrate another superior coding approach. Apologies in advance if I am abusing your time.

          Consider this data

          b1_01_personid b1_02_householdid l1_01_1_affinitivetype
          11386628 1 1
          11386629 1 2
          11386630 1 3
          11389061 4 1
          11389062 4 2
          11389063 4 3
          11389064 4 3
          11389401 5 1
          11389402 5 3
          11389403 5 3
          11389404 5 4
          11394017 16 1
          11394467 17 1
          11647754 19 1
          11647755 19 9
          11647756 19 3
          11648369 20 1
          11648370 20 2
          11648371 20 3
          11396156 21 1
          11396157 21 3
          11396262 22 1
          11396263 22 3
          11397532 23 1
          11397533 23 3
          11397534 23 3
          11648390 24 1
          11648391 24 2
          11648392 24 10

          Here
          b1_01_personid - person id
          b1_02_householdid - household id (persons are inside households)
          l1_01_1_affinitivetype - is the relationship towards the first interviewed person in the household (towards value "1"). Here value "2" is the partner.

          For example, this is household 1, where 11386628 is the head, 11386629 is the partner.
          b1_01_personid b1_02_householdid l1_01_1_affinitivetype
          11386628 1 1
          11386629 1 2
          11386630 1 3
          I again seek to recover the partner's id. In the example, I seek to construct variable Spo_id
          b1_01_personid b1_02_householdid l1_01_1_affinitivetype Spo_id
          11386628 1 1 11386629
          11386629 1 2 .
          11386630 1 3 .
          Kind regards,
          Sergey Alexeev | ​The University of Sydney
          https://alexeev.pw/

          Comment


          • #6
            Code:
            by b1_02_householdid (b1_01_personid), sort: egen long spo_id = ///
                max(cond(l1_01_1_affinitivetype == 2, b1_01_personid, .))
            replace spo_id = . if l1_01_1_affinitivetype != 1
            Notes: The code assumes, but does not verify, that within any household there is only one observation with l1_01_1_affinitivetype == 2. If there is more than one such, the one with the larger personid is selected. The code also assumes that the b1_01_personid variable is stored as a numeric variable (which should be either long or double given the precision required to hold all its digits). This can neither be verified nor rebutted because the tableau of data does not give metadata. To avoid such uncertainties, in the future please use the -dataex- command to show example data. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
            Last edited by Clyde Schechter; 18 Jan 2022, 11:04.

            Comment


            • #7
              Originally posted by Clyde Schechter View Post
              Code:
              by b1_02_householdid (b1_01_personid), sort: egen long spo_id = ///
              max(cond(l1_01_1_affinitivetype == 2, b1_01_personid, .))
              replace spo_id = . if l1_01_1_affinitivetype != 1
              Notes: The code assumes, but does not verify, that within any household there is only one observation with l1_01_1_affinitivetype == 2. If there is more than one such, the one with the larger personid is selected. The code also assumes that the b1_01_personid variable is stored as a numeric variable (which should be either long or double given the precision required to hold all its digits). This can neither be verified nor rebutted because the tableau of data does not give metadata. To avoid such uncertainties, in the future please use the -dataex- command to show example data. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
              Dear Professor Clyde Schechter,

              Thank you. I confirm that the code performs as expected.

              PS. I thought that I used dataex in my post. I'll try to do better next time. My apologies. Thank you again.
              Kind regards,
              Sergey Alexeev | ​The University of Sydney
              https://alexeev.pw/

              Comment


              • #8
                PS. I thought that I used dataex in my post. I'll try to do better next time.
                You probably did, and just did the copy/paste incorrectly. When you run -dataex-, the first line of output is:
                Code:
                ---------------------- copy starting from the next line -----------------------
                and the penultimate line is:
                Code:
                ------------------ copy up to and including the previous line ------------------
                You need to copy everything between those two (but not including those two) and then just paste it into the Forum editor. It looks to me like you ran -dataex- but then didn't completely copy the necessary part of the output. It wasn't a big deal--I was able to use what you posted anyway. But sometimes if you don't get it all, the results are unusable.

                Comment

                Working...
                X