  • Joinby with Frames

    Is there a way to use joinby with frames? I can of course manage without, but it would be elegant to be able to write something of the form:

    Code:
    frame create children
    frame change children
    import delimited children.csv
    do child_clean.do
    
    frame create parents
    frame change parents
    import delimited parents.csv
    do parent_clean.do
    
    frame copy parents family
    frame change family
    frjoinby familyid, frame(children)
    frget age, frame(children)
    I can see that frget in this case would potentially be a little more complicated than frlink, but not prohibitively so.

  • #2
    -joinby- was written prior to frames, and therefore doesn't know about them. Your example could be modified easily enough though by using a tempfile. This would be an appropriate way to use frames and take advantage of joinby (or merge).

    (I'm sure you know this, but I mention it for others who may come across this thread and not be aware of the strategy.)

    Code:
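    tempfile children

    * Load and clean the children data in its own frame, then save a copy to disk
    frame create children
    frame change children
    import delimited children.csv
    do child_clean.do
    save `children', replace

    * Load and clean the parents data in a second frame
    frame create parents
    frame change parents
    import delimited parents.csv
    do parent_clean.do

    * Join in a copy of parents so the cleaned parents frame stays untouched
    frame copy parents family
    frame change family
    joinby familyid using `children'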
    That being said, frlink and frget solve a different problem than -joinby-. Together they act as a somewhat limited form of -merge-: they only allow m:1 matching (for m≥1), they do not extend the current dataset with observations that exist only in the using dataset, and they do not allow updating values. In contrast, -joinby- is more flexible: it allows m:n joining by id, creating all m*n pairs, which can be exploited to accomplish the same thing that -merge- does, but faster.
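
    To make the m:1 versus m:n distinction concrete, here is a minimal sketch on made-up data. The variables parent_age and child_age and all of the values are hypothetical and are not taken from the files in this thread; only familyid and the frame names follow the example above.

    Code:
    frames reset

    * Hypothetical toy data: two parents in family 1, one parent in family 2
    frame create parents
    frame change parents
    input familyid parent_age
    1 40
    1 42
    2 35
    end

    * Two children in family 1, one child in family 2
    frame create children
    frame change children
    input familyid child_age
    1 4
    1 7
    2 10
    end
    tempfile children
    save `children'

    frame copy parents family
    frame change family

    * frlink needs familyid to identify a single observation in the linked
    * frame; it does not here, so the link is expected to fail
    capture frlink m:1 familyid, frame(children)
    display "frlink return code: " _rc

    * joinby forms every parent-child pair within each family: 2*2 + 1*1 = 5
    joinby familyid using `children'
    assert _N == 5
    sort familyid parent_age child_age
    list, sepby(familyid)
    The capture is there only so that the do-file keeps running after the failed link. If familyid were unique in children, frlink followed by frget would be the natural choice and no tempfile would be needed.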



    • #3
      Thanks for this. What you suggest is indeed what I am doing at the moment, but I am sure many others will be grateful for the example, as you say. Using frames has become a big part of my workflow, as I find it makes merging data much easier. But perhaps I am not mindful enough of their limitations, particularly those of frlink and frget.



      • #4
        Leonardo Guizzetti, thank you for your example.

        I'm trying to learn how to best use frames to my advantage.
        Can you please explain the advantage of using frames in your example?
        How is what you did different than the code below, where I simply remove all frame commands and add the -clear- option to -import-?
        I understand that those datasets will remain in memory, in their respective frames, but how is that beneficial if -frlink- is not being used?

        Thank you in advance!
        Code:
        tempfile children
        import delimited children.csv
        do child_clean.do
        save `children', replace
        
        import delimited parents.csv, clear
        do parent_clean.do
        
        joinby familyid using `children'



        • #5
          There is no advantage if you only care about the final, merged dataset. Frames allow you to keep this data in memory for further use or inspection if that is important.
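
          As a rough illustration of "further use or inspection", the sketch below builds a small made-up children frame (the values are hypothetical), switches to another frame to keep working, and then queries the children data without reloading anything from disk.

          Code:
          frames reset

          * Hypothetical toy data standing in for the cleaned children file
          frame create children
          frame change children
          set obs 3
          generate familyid = ceil(_n/2)
          generate age = 5 * _n

          * Work continues in another frame ...
          frame create family
          frame change family

          * ... while the children data remain available in memory
          frames dir
          frame children: summarize age
          frame children: count if familyid == 1
          The frame prefix runs a single command in the named frame and then returns to the current one, so the working frame is left as it was.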



          • #6
            Okay, thanks!
