  • Intersection among datasets

    Hi all,

    I have 55 datasets named name_country_mole.dta. Each dataset is an unbalanced panel of molecules observed over a number of quarters (both the number of quarters and the quarters themselves may vary from one dataset to another). My objective is to find the matching molecules and quarters (i.e. the intersection) across those datasets. When I had only 3 or 4 countries, I simply ran a series of reclink commands of this type:
    Code:
    reclink Molecule quarter using "france_mole.dta", gen(myscore) idm(id_mas) idu(id_us) minscore(1)
    So country1 was reclinked with country2 to obtain country_1_2.dta, then country_1_2.dta was reclinked with country3 to obtain country_1_2_3.dta, and so on. Now that I have 55 datasets, I need a smarter way to perform this task.
    Can anyone please help me? I was thinking about something involving tempfiles, but I am new to these.

    Thank you,

    Federico

  • #2
    Is there a reason why you wouldn't consider -merge-?
    Code:
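    cd <whatever>
    * Collect the names of all country files in the current directory
    local file_list : dir "`c(pwd)'" files "*_mole.dta"
    
    * Flag for the first file, which has nothing to merge with yet
    local first 1
    
    tempfile accumulation
    
    foreach file of local file_list {
    
        use "`file'", clear
    
        * Keep only Molecule-quarter pairs found in both this file and the running intersection
        if !`first' merge 1:1 Molecule quarter using `accumulation', keep(match) nogenerate noreport
        else local first 0
    
        * Store the running intersection for the next iteration
        quietly save `accumulation', replace
    
    }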


    • #3
      Joseph Coveney, thank you for your reply.
      "Is there a reason why you wouldn't consider -merge-?"
      Actually, it is because they are unbalanced panels and the way in which they are unbalanced is unknown to me (data collection issues). Therefore, I have neither the same number of observations nor the same molecules in each dataset. Also, I am not sure that merge keeps only the molecules and quarters in common, as reclink does.


      • #4
        You said, "My objective is to find the matching molecules and quarters (i.e. intersection) across those datasets."
        Code:
        merge <cardinality> Molecule quarter using <datasets>, keep(match)
        is exactly how you do that.
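
        To illustrate with unbalanced data, here is a minimal sketch using two made-up toy datasets. The molecule names, quarter values, and the tempfile name country_a are hypothetical and only for illustration. The two datasets have different numbers of observations and different molecules, and keep(match) should leave only the Molecule-quarter pairs present in both.

        ```stata
        * Toy dataset A (hypothetical values): 3 observations
        clear
        input str10 Molecule quarter
        "aspirin"   1
        "aspirin"   2
        "ibuprofen" 1
        end
        tempfile country_a
        save `country_a'

        * Toy dataset B (hypothetical values): 4 observations, partly different
        clear
        input str10 Molecule quarter
        "aspirin"   2
        "ibuprofen" 1
        "ibuprofen" 2
        "codeine"   1
        end

        * keep(match) drops every pair that is not present in both datasets
        merge 1:1 Molecule quarter using `country_a', keep(match) nogenerate
        list
        ```

        Only aspirin in quarter 2 and ibuprofen in quarter 1 appear in both toy datasets, so those are the two observations that should remain after the merge. Note that 1:1 assumes each Molecule-quarter pair occurs at most once per dataset.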


        • #5
          Joseph Coveney, I did not know that. Thank you.
