  • merge files; only using data on certain participants

    Hi there
    I am trying to merge two data sets, B into A.

    They both have the same number of original participants (n=8000), but
    A is my main data set, on which I've done a complete case analysis and dropped 2500 participants (n = 5500)
    B is a data set of just one variable, which has measurements on all 8000 participants.

    When I merge, I want the variable from dataset B added 1:1 only for the participants remaining in dataset A (i.e., 5500); I do not want the measurements for all 8000 participants in dataset B added to dataset A.

    The latter situation is what happens when I type the code:
    merge 1:1 participant_id using database_b.dta


    Does anybody know how to merge data only for the participants I want?

    Thanks
    Al

  • #2
    After a merge, Stata automatically creates an indicator variable recording where each observation came from; by default it is called _merge (1 = master only, 2 = using only, 3 = matched).
    It sounds like you just want to drop the observations that appear only in B. If B is the using data set, that is:
    Code:
    drop if _merge == 2
    There are other ways to do this as well; see the keep() option in the help file for merge (help merge).
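
    As a rough sketch of the keep() route, assuming dataset A (the 5500 complete cases) is the data set in memory and that participant_id uniquely identifies rows in both files, something like the following should leave only A's participants after the merge. The file and variable names are the ones from the original question.

    ```stata
    * Sketch only: assumes dataset A (the complete cases) is already in memory
    * keep(master match) retains A's rows, matched or not, and discards B-only rows
    merge 1:1 participant_id using database_b.dta, keep(master match)

    * Check how many of A's participants found a match in B
    tabulate _merge
    ```

    If every participant in A is expected to appear in B, keep(match) would give the same result, and any unexpected non-matches would be silently dropped, so tabulating _merge after keep(master match) is the more cautious choice.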
