Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging two datasets with a common variable

    Hello all,
    I am using STATA 16
    Please I want to merge two datasets, using the common variable medcode, but i cannot seem to troubleshoot it.
    I have sorted medcode in both files.
    Thanks for the assistance.

    Please find datasets below

    A.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str10 patid str5 pracid int dob str17 medcode long(eventdate gender)
    "075G" "a9906" 19734 "12K7.00"  100 1
    "08me" "d9868"  8042 "12K7.00" 4504 1
    "06zE" "g7683" 18274 "12K7.00" 4545 2
    "05dl" "g6638" 18639 "12K7.00" 1204 1
    "06d6" "c7781" 13886 "12K7.00" 1061 1
    "02k0" "c9877" 17542 "12K7.00" 2530 2
    "05eS" "f9882" 17174 "12K7.00" 2345 1
    "06oH" "h9876" 20090 "12K7.00" 3640 1
    "022e" "f7687" 19725 "12K7.00"  217 2
    "05yA" "e6886" 20096 "12K7.00" 5041 1
    end
    format %tdnn/dd/CCYY dob
    label values eventdate eventdate
    label def eventdate 100 "  1/7/2015", modify
    label def eventdate 217 "  2/6/2017", modify
    label def eventdate 1061 "  9/3/2017", modify
    label def eventdate 1204 " 10/1/2017", modify
    label def eventdate 2345 "14/10/2011", modify
    label def eventdate 2530 "15/10/2018", modify
    label def eventdate 3640 "22/05/2017", modify
    label def eventdate 4504 "27/07/2017", modify
    label def eventdate 4545 "27/10/2016", modify
    label def eventdate 5041 "30/11/2016", modify
    label values gender gender
    label def gender 1 "Female", modify
    label def gender 2 "Male", modify

    B.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str17 medcode str60 description
    "0....00" "Occupations"                  
    "0....11" "Job"                          
    "01...00" "Top managers"                  
    "01...11" "Administrators - top"          
    "011..00" "National gov't. top managers"  
    "011Z.00" "Nat. gov't. general admin. NOS"
    "012..00" "Local gov't.- top managers"    
    "012Z.00" "Local gov't.- top manager NOS"
    "013..00" "Top managers - business"      
    "013..11" "Business - top men"            
    end

  • #2
    You do not show what command you ran and the resulting output. With "dataset A" open, the command should be something like

    Code:
    merge m:1 medcode using "dataset B"

    where you replace "dataset B" with the name of your second dataset.

    Comment


    • #3
      Apologies for leaving that out. I used similar command. Please find the command and result below.

      . merge m:1 medcode using read_code.dta (dataset B)
      (note: variable medcode was str7, now str17 to accommodate using data's
      values)
      variable medcode does not uniquely identify observations in the using data

      Thanks

      Comment


      • #4
        Check why you have duplicates of the variable medcode

        Code:
        use read_code.dta, clear
        bys medcode: list if _N>1
        If these duplicates are real duplicates and do not contain additional information in terms of other variables, then

        Code:
        duplicates drop medcode, force
        before executing the merge.

        Comment


        • #5
          Hi Andrew, worked perfectly !! Thanks a mil

          Comment

          Working...
          X