  • Update data file

    I have a data file 1 with the columns id, a, b, c, d, e, f, g, h, i, j, k
    where id is the unique / primary key / identifier.

    I have received an update file 2 with the columns id, j, k, m, n,
    where id is again the unique identifier,
    columns j and k have updated values,
    and m and n are new columns.
    The rows (i.e. the set of ids) in file 2 are a subset of those in file 1.

    I understand that to add new columns m and n to file 1, I can use the `merge` command.
    What's throwing me off is the need to simultaneously update the values in columns j and k.
    Thank you for your help!

    Stata SE/17.0, Windows 10 Enterprise

  • #2
    From https://www.stata.com/manuals/dmerge.pdf#dmerge we see:

    Replace missing and conflicting data in mydata1.dta with values from mydata2.dta

    merge 1:1 v1 v2 using mydata2, update replace
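
    Applied to the layout described in #1, that could look roughly like the sketch below. The toy values, the tempfile name file2, and the trimmed set of columns (id, a, j, k) are assumptions made for illustration only; with real files you would `use` file 1 and name file 2 after `using`.

    ```stata
    * Toy stand-in for file 2 (the update file); all values are invented.
    clear
    input id j k m n
    1 10 20 1 2
    3 30 40 3 4
    end
    tempfile file2
    save `file2'

    * Toy stand-in for file 1 (the master), trimmed to id, a, j, k.
    clear
    input id a j k
    1 100 1 2
    2 200 3 4
    3 300 5 6
    end

    * update replace: for matched ids, j and k take the values from the using
    * file (unless missing there); m and n come in as new variables.
    merge 1:1 id using `file2', update replace

    * The ids in file 2 are a subset of those in file 1, so no observation
    * should come from the using file only.
    assert _merge != 2
    tabulate _merge
    list
    ```

    The `assert` line is only a sanity check on the subset claim; drop it if file 2 may legitimately contain new ids.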

    • #3
      Note that the same information is available in the output of `help merge`, which should be the first place you look for questions like this, since you know the `merge` command does at least part of what you want.

      Code:
          update and replace both perform an update merge rather than a standard merge.  In a standard
              merge, the data in the master are the authority and inviolable.  For example, if the master
              and using datasets both contain a variable age, then matched observations will contain values
              from the master dataset, while unmatched observations will contain values from their
              respective datasets.
      
              If update is specified, then matched observations will update missing values from the master
              dataset with values from the using dataset.  Nonmissing values in the master dataset will be
              unchanged.
      
              If replace is specified, then matched observations will contain values from the using dataset,
              unless the value in the using dataset is missing.
      
              Specifying either update or replace affects the meanings of the match codes. See Treatment of
              overlapping variables in [D] merge for details.
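
      To see the difference between update alone and update replace, a small experiment along these lines may help. This is only a sketch: the tempfile names upd and master and all values are made up, and the comments restate the behavior described in the help text above.

      ```stata
      * Toy update file; all values are invented for illustration.
      clear
      input id j
      1 10
      2 20
      3 .
      end
      tempfile upd
      save `upd'

      * Toy master: id 1 has a missing j, id 2 a conflicting j,
      * id 3 a nonmissing j where the update file is missing, id 4 is unmatched.
      clear
      input id j
      1 .
      2 99
      3 30
      4 40
      end
      tempfile master
      save `master'

      * update only: fills the missing j for id 1 and leaves the
      * nonmissing master value for id 2 unchanged.
      merge 1:1 id using `upd', update
      list id j _merge

      * update replace: also overwrites the conflicting j for id 2;
      * id 3 keeps its master value because the using value is missing.
      use `master', clear
      merge 1:1 id using `upd', update replace
      list id j _merge
      ```

      Comparing the two listings, including the _merge column, shows how the match codes change once update or replace is specified.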
