  • Duplicate observations from group to individual level

    Dear all

    I am quite new to Stata and am currently working with a dataset about health. The data come from a survey with two sorts of variables. The first sort is individual variables, which differ for every person. The other sort is household variables, which hold one result per household, filled in by one person for that household (the reference person). I am struggling with the following:

    Some variables are individual variables and differ for each respondent; other variables are household variables, answered by only one person (the reference person in the household). My problem is that the household variables contain fewer observations than the individual ones. The goal is to turn the household variables into individual-level variables. I thought about doing this by copying the household observation for the reference person to all other persons in that household. I was thinking about using a foreach to run through all households (via hhid, the household id), duplicating the value where the person is the reference person (piid==1) as many times as there are members in the household (hhsize, the household size). Unfortunately I am unable to figure out how to do this. An example of a household variable is V400908_2, which indicates how easy it is to reach a GP.

    hhid = household id
    hhsize = household size
    piid = position in household (=1 for reference person)

    Thanks for the help

  • #2
    I cannot tell from your description if you have

    A) two datasets, one with observations for every individual containing individual data, and a second with observations for the reference individual containing household data,

    or

    B) one dataset with observations for every individual containing both individual data (for every individual) and household data (for the reference individuals, with missing values for all other individuals).

    If it is (B), you will want code something like the following.
    Code:
    sort hhid piid
    by hhid (piid): replace V400908_2 = V400908_2[1] if _n>1
    Use one such command for each household variable. Within each household, this copies the value of V400908_2 from the first observation (the reference person, who has piid==1) to all the other observations.
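
    If there are many household variables, the same idea can be wrapped in a loop. This is only a sketch: the local macro hhvars is a hypothetical list (only V400908_2 is named in this thread), and it assumes every household has exactly one reference person with piid==1, who therefore sorts first within hhid.

```stata
* Hypothetical list of household-level variables; extend with your own names
local hhvars V400908_2

sort hhid piid

* Stop with an error if any household lacks a reference person in first position
by hhid (piid): assert piid[1] == 1

foreach v of local hhvars {
    * Copy the reference person's value to every other household member
    by hhid (piid): replace `v' = `v'[1] if _n > 1
}
```

    The assert line is optional, but it guards against households where the first observation is not the reference person, in which case the wrong (possibly missing) value would be copied.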

    If it is (A) you will merge the two datasets.
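
    For case (A), a minimal sketch of the merge might look like this. The file names individuals.dta and households.dta are hypothetical, and the sketch assumes hhid uniquely identifies each observation in the household file.

```stata
* Hypothetical file names: one row per person, and one row per household
use individuals, clear

* Many individuals match one household record on hhid
merge m:1 hhid using households, keepusing(V400908_2)

* _merge == 3 marks individuals matched to a household record
tabulate _merge
```

    Checking the _merge tabulation shows whether any individuals have no household record, or the reverse.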

    • #3
      Thank you very much. It was indeed option B; this approach is much simpler than what I was trying.
