  • Reshaping data to attach children's information to mother's row

    Hey, I am working on US microcensus data. I have information on households: within each household I have the head of household, his/her spouse and children, and each person's sex and age. Data for every member of the household is given in a separate row. For my work, I need to attach the age and sex of children to their mother's row. I have tried using the reshape command in Stata, but it has not worked for me because I don't have a unique id for every individual.

    Can anyone please guide me on how to add the child's information to the mother's row? I would be really grateful! Thanks in advance.

  • #2
    Please read and act on http://www.statalist.org/forums/help#stata

    You're asking for help without giving an example of your dataset with variable names and without being explicit about data structure. You also mention a command that didn't work without even saying exactly what you typed.

    There's good reason why we ask everyone to read the FAQ Advice before posting.


    • #3
      Essentially, you are asking about transforming the children's observations from a "long" layout (one observation per child) to a "wide" layout (one observation per mother). You could accomplish this by creating two datasets, one with the mothers' observations and a second with the children's observations; then applying the reshape command to the children's dataset to transform it from a "long" layout to a "wide" layout; and finally using the merge 1:1 command to combine the children's and mothers' datasets.
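
      The split, reshape, and merge steps just described might look roughly like the sketch below. It is only a sketch under assumptions: it uses the variable names given later in this thread (serial, pernum, momloc, sex, age), it assumes age is already numeric, and it assumes momloc holds the pernum of the person's mother within the household (0 when no mother is present). The names childnum, negage, and the tempfile children are made up for illustration.

      ```stata
      * Sketch only: assumes numeric age, and momloc = mother's pernum (0 if none)
      preserve
      keep if momloc > 0 & !missing(momloc)
      keep serial pernum momloc sex age
      * Number each mother's children from oldest to youngest (ties broken by pernum)
      generate negage = -age
      bysort serial momloc (negage pernum): generate childnum = _n
      drop negage pernum
      * One row per mother: sex1 age1 sex2 age2 ...
      reshape wide sex age, i(serial momloc) j(childnum)
      * The child's momloc is the mother's pernum, so use it as the merge key
      rename momloc pernum
      tempfile children
      save `children'
      restore
      * Attach the wide child variables to each mother's own row
      merge 1:1 serial pernum using `children', keep(master match) nogenerate
      ```

      Because each child is keyed on serial and momloc, a grandchild in a subfamily would be attached to his or her own mother rather than to the head of household, provided momloc is filled in for that child.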

      I would argue that doing so would be a mistake. The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data.

      Once you attach the children's data to the mother's data in your dataset, creating multiple sets of new variables (gender1 gender2 ... age1 age2 ... etc.), you are then likely to want to summarize these characteristics (nchild, nmale, nfemale, ageoldest, ...).

      It will be easier and more natural to create these summary variables using the children's file in "long" layout, producing a file with one observation of summary variables per mother, and then merging that file with the mothers' data.
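
      A minimal sketch of this summary approach follows, again under assumptions: age is numeric, sex is numeric with the coding 1 = male and 2 = female (adjust if yours differs), and momloc holds the mother's pernum. The new variable names (nkids, nboys, ngirls, ageoldest, ageyoungest) are hypothetical and were chosen so that they do not clash with variables such as nchild that may already exist in the data.

      ```stata
      * Sketch only: assumes numeric age, and sex coded 1 = male, 2 = female
      preserve
      keep if momloc > 0 & !missing(momloc)
      generate byte boy  = (sex == 1)
      generate byte girl = (sex == 2)
      * One row of summary variables per mother
      collapse (count) nkids = age (sum) nboys = boy ngirls = girl ///
          (max) ageoldest = age (min) ageyoungest = age, by(serial momloc)
      * The child's momloc is the mother's pernum, so use it as the merge key
      rename momloc pernum
      tempfile kidsummary
      save `kidsummary'
      restore
      merge 1:1 serial pernum using `kidsummary', keep(master match) nogenerate
      ```

      Note that (count) counts nonmissing values of age, so children with missing age would not be included in nkids.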


      • #4
        Dear Mr. Cox, I apologize for my vague question.
        I am using a 5% sample from the IPUMS (US) dataset. I have the following variables:

        serial: household serial number
        nfams: no. of families in household
        nsubfam: no. of subfamilies in household
        nmothers: no. of mothers in household
        pernum: person number in sample unit
        nchild: no. of own children
        famunit: family unit membership
        eldch: age of eldest child
        yngch: age of youngest child
        momloc: mother's location in the HH
        relate: relationship to HH (general)
        related: relationship to HH (detailed)
        sex
        age
        marst: marital status

        I am evaluating the impact of having children on women's employment. For this purpose I need to assign the sex of each child to the mother. My data look like this:
        serial  pernum  famsize  nchild    eldch  yngch  momloc  relate    related   sex     age  marst
        2       1       5        3         20     14     0       head/hou  head/hou  male    44   married,
        2       2       5        3         20     14     0       spouse    spouse    female  41   married,
        2       3       5        0 childr  n/a    n/a    2       child     child     female  20   never ma
        2       4       5        0 childr  n/a    n/a    2       child     child     male    19   never ma
        2       5       5        0 childr  n/a    n/a    2       child     child     male    14   never ma
        In the sample above the woman has 3 children. I need the sex of each of these attached to the mother's data.

        The commands I used were:
        reshape wide sex, i(serial) j(age)
        reshape wide sex, i(serial) j(relate)
        reshape wide sex, i(serial) j(birthyr)
        The error I get is this one:
        values of variable birthyr not unique within serial
        Your data are currently long. You are performing a reshape wide. You specified i(serial) and
        j(birthyr). There are observations within i(serial) with the same value of j(birthyr). In the
        long data, variables i() and j() together must uniquely identify the observations.

        I even tried separating women and children into two separate datasets, but this only works if the woman has exactly one child.

        I need to assign the sex of the first- and second-born children to their mother. If there is a subfamily in the HH, then the sex of the grandchild needs to be assigned to his or her own mother.

        • #5
          Stata has told you what the problem is with your attempt at reshape. For the children alone, you want something like i(serial) j(pernum).
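
          As a rough illustration of that point, the sketch below first checks which variable combinations identify observations and then reshapes the children alone. It assumes the variable names from #4, that the full extract is in memory, and that momloc is numeric with 0 meaning no mother in the household.

          ```stata
          * Sketch only: variable names as in #4; momloc assumed numeric, 0 = no mother present
          * birthyr repeats within a household whenever two members share a birth year
          duplicates report serial birthyr
          * isid exits with an error unless serial and pernum jointly identify each row
          isid serial pernum
          * Children only: one sex variable per person number (sex3, sex4, ...)
          keep if momloc > 0 & !missing(momloc)
          keep serial pernum sex
          reshape wide sex, i(serial) j(pernum)
          ```

          With i(serial) the result has one row per household, so in a household with more than one mother the children of different mothers would end up on the same row.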

          I had hoped to provide an example that avoids reshaping the data, using your sample data. But your sample data are very far from being ready for analysis. It appears that almost all of your variables are strings. This needs to be corrected before anything useful can be accomplished.
          Code:
          . ds, has(type string)
          gq         nchild     stepmom    chborn     ancestr1d  schltype   uhrswork
          nfams      nchlt5     relate     race       ancestr2   empstat    yrlastwk
          nsubfam    famunit    related    raced      ancestr2d  empstatd   wrklstwk
          ncouples   eldch      sex        hispan     school     labforce   looking
          nmothers   yngch      age        hispand    educ       classwkr   availble
          famsize    nsibs      marst      ancestr1   educd      classwkrd
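
          One possible way to start converting such string variables is sketched below. It is an illustration only: the new variable names (age_num, female) are made up, and the string values "male" and "female" are taken from the sample rows in #4. What the nonnumeric values of age look like in the full data is not known here, which is why the sketch lists them first.

          ```stata
          * Sketch only: age_num and female are hypothetical names
          * List any age values that are not plain numbers before converting
          tabulate age if missing(real(age))
          * force turns every nonnumeric value into missing, so inspect the list above first
          destring age, generate(age_num) force
          * 0/1 indicator built from the string values shown in the sample data
          generate byte female = (sex == "female") if !missing(sex)
          ```
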
          Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post, looking especially at sections 9-12 on how to best pose your question. It would be particularly helpful to post a small hand-made example, perhaps with just a few variables and observations, showing the data before the process and how you expect it to look after the process, using the data editor to manually fill in the values you want Stata to compute for you. In particular, please read FAQ #12 and use dataex and CODE delimiters when posting to Statalist.
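
          For instance, a small extract could be produced along these lines; the choice of variables and the range in 1/5 are only an illustration.

          ```stata
          * dataex is included in recent Stata releases; on older ones, try: ssc install dataex
          dataex serial pernum momloc relate sex age in 1/5
          ```

          The output can then be pasted into a post between CODE delimiters.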


          • #6
            Thanks a lot, Mr. William Lisowski. This worked perfectly and I was able to reshape my data!
