
  • 1:m merge does not work on my dataset, r(459) repeatedly

    I use Stata 18 on a Windows system.

    Dear all,
    For my thesis I'd like to merge two datasets, household.dta and kid.dta. Both are in long format and cover the same 5-year time frame.
    kid.dta contains the year, the household ID, the number of kids in the household, and 2 dummy variables specifying age groups. Each kid.dta record has to be matched to two (or more) observations in household.dta (e.g. Mum & Dad), using year and household ID as identifiers. I tried both merge 1:m hid syear using kid.dta and merge m:1 hid syear using kid.dta. Both yield the result "variables hid year do not uniquely identify observations in the master data" (household.dta being the master).

    I already dropped duplicates in kid.dta, but I cannot do the same for household.dta, since those repeated hid/year combinations are the members of my households and therefore a big chunk of my analysis.

    Would appreciate any help!

  • #2
    -merge m:1- should be fine for your data, with "household" as the "master" data and "kid" as the "using" data. The error message you cite would be expected from a 1:m match with household as the master. Are you *sure* that you received this error message from both commands as described? "Master" here means the data set resident in the current data frame. If the problem is not confusion about which data set is the master, I'd suspect there is something about your data set and its structure that you don't know about. One potentially confusing situation is where there are missing values for your key variables, i.e., hid and year in this case.
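
    The checks described above could be sketched roughly as follows. This is only an illustration, not a diagnosis of the actual data: the file names and the key variables hid and syear are taken from the commands quoted in the question, and it assumes both files sit in the current working directory.

```stata
* Sketch only: names (household.dta, kid.dta, hid, syear) are assumed from the question
* Load the intended master, so there is no doubt which data set is in memory
use household.dta, clear

* How many observations have a missing key value?
count if missing(hid, syear)

* Repeated hid/syear combinations are expected here (one row per household member)
duplicates report hid syear

* Check the using data: -isid- stops with an error if hid and syear
* do not uniquely identify observations (or if they contain missing values)
preserve
use kid.dta, clear
isid hid syear
restore

* Many household rows to one kid row
merge m:1 hid syear using kid.dta
tabulate _merge
```

    If -isid- fails on kid.dta, the using data still contain duplicate or missing keys, and -merge m:1- would complain about the using data rather than the master.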


    If the preceding doesn't clear things up, I'd suggest you post a data example using the -dataex- command, as described in the Statalist FAQ (http://www.statalist.org/forums/help) for new members. You'd want to show a data example for both your master and your using data, and the example should be one that reproduces the problem you describe.
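
    A data example of that kind might be produced along these lines. The row range 1/20 is an arbitrary assumption; pick whichever observations actually reproduce the error, and add any other variables that matter.

```stata
* Sketch only: run once per data set and paste both outputs into the post
use household.dta, clear
dataex hid syear in 1/20

use kid.dta, clear
dataex hid syear in 1/20
```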



    • #3
      Thanks Mike for the swift response! I have actually been able to make it work!

