  • Advice on how to merge datasets when one dataset has multiple observations with the same ID

    Hi, I am very new to Stata and am trying to merge the data set below with my master dataset. It has several observations per household, one for each type of shock. I would really appreciate advice on how to reduce all the observations for one household to a new indicator variable that takes the value 1 if any of the shocks below were experienced and 0 if none were. Thank you in advance for any help.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str14 household_id int hh_s8q00 float shock
    "01010101601017" 116 0
    "01010101601017" 117 0
    "01010101601017" 118 0
    "01010101601034" 101 1
    "01010101601034" 102 1
    "01010101601034" 103 0
    "01010101601034" 104 0
    end
    label values hh_s8q00 hh_s8q00
    label def hh_s8q00 101 "Death of Household Member", modify
    label def hh_s8q00 102 "Illness of Household Member", modify
    label def hh_s8q00 103 "Loss of Non-farm Jobs of Household Member", modify
    label def hh_s8q00 104 "Drought", modify
    label def hh_s8q00 116 "Displacement (Due to Gov Dev Project)", modify
    label def hh_s8q00 117 "Local Unrest/Violence", modify
    label def hh_s8q00 118 "Other (Specify)", modify


  • #2
    Well, you don't show any information about your master data set. I'll just guess that it contains a single observation for each household, that the households in it are identified by a variable with the same name, household_id, and that the two household_id variables are both 14-character strings.

    So first I would just reduce the data set you did show to a single observation per household with:
    Code:
    collapse (max) shock, by(household_id)
    The variable shock will now be coded 0 for no shocks experienced, 1 for any shock experienced.
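
    As a quick check of that step, here is a minimal sketch that rebuilds the example data from #1 and collapses it. It assumes shock is always coded 0/1; the -isid- and -list- lines are additions for verification only. Note that -collapse- keeps only household_id and shock, so hh_s8q00 is dropped, and a household whose shock values are all missing would end up with a missing value rather than 0.

    ```stata
    * Rebuild the example data posted in #1
    clear
    input str14 household_id int hh_s8q00 float shock
    "01010101601017" 116 0
    "01010101601017" 117 0
    "01010101601017" 118 0
    "01010101601034" 101 1
    "01010101601034" 102 1
    "01010101601034" 103 0
    "01010101601034" 104 0
    end

    * The maximum of a 0/1 variable is 1 if any row is 1, otherwise 0
    collapse (max) shock, by(household_id)

    * Confirm household_id now uniquely identifies observations
    isid household_id
    list, noobs
    ```

    With the seven example rows, this should leave two observations: shock is 0 for household 01010101601017 and 1 for household 01010101601034.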

    Then you can -merge 1:1 household_id- with the master data set.
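
    Since the master data set was not shown, the following is only a sketch of how the two steps might fit together. The master data here is a made-up stand-in: the variable hh_size and the third household ID are invented for illustration, and the tempfile name shocks is arbitrary.

    ```stata
    * Shock data from #1, reduced to one row per household
    clear
    input str14 household_id int hh_s8q00 float shock
    "01010101601017" 116 0
    "01010101601017" 117 0
    "01010101601017" 118 0
    "01010101601034" 101 1
    "01010101601034" 102 1
    "01010101601034" 103 0
    "01010101601034" 104 0
    end
    collapse (max) shock, by(household_id)
    tempfile shocks
    save `shocks'

    * Hypothetical stand-in for the master data set (one row per household).
    * hh_size and the household ending in 051 are invented for this sketch.
    clear
    input str14 household_id float hh_size
    "01010101601017" 4
    "01010101601034" 6
    "01010101601051" 3
    end

    * One row per household on both sides, so a 1:1 merge is appropriate
    merge 1:1 household_id using `shocks'
    tabulate _merge
    ```

    In this sketch the invented third household has no rows in the shock data, so it comes through with _merge equal to 1 and a missing value for shock. Whether such households should be recoded to 0 depends on what absence from the shock file means in your survey, so check that before recoding.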



    • #3
      Originally posted by Clyde Schechter
      Thank you so much! This worked perfectly.
