  • Can Stata 16 frames be appended?

    Fellow Statalisters (especially StataCorp)

    Many thanks to StataCorp for Stata 16. The frames are an especially welcome addition, and I have been playing with them over the weekend. I definitely hope to frameify my resultsset-generating programs to make resultsframes.

    One immediate query. Is it possible to append frames (as we append datasets using the append command)? I would find such a possibility very useful for the Stata 16 version of the parmby module of my parmest package. I have looked in the help for frames and for append, but have so far found nothing about appending frames. (Am I just not looking in the right place?)

    Best wishes

    Roger

  • #2
    PS I have checked out frame post, and this seems to be sufficient to solve my problem for most parmby resultssets most of the time, but it seems to append only 1 observation at a time, which might be either inelegant or inefficient or both for serious frame-appending jobs.
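
    For reference, a minimal sketch of the one-observation-at-a-time pattern described here, using frame post. The frame name results and the choice of variables are assumptions made purely for illustration.

    Code:
    * Sketch only: copy make and mpg of the foreign cars into a new frame,
    * one observation per frame post call. "results" is a hypothetical name.
    sysuse auto, clear
    frame create results str18 make double mpg
    
    forvalues i = 1/`=_N' {
        if foreign[`i'] {
            * each call adds exactly one observation to frame results
            frame post results (make[`i']) (mpg[`i'])
        }
    }
    
    frame results: count
    The loop issues one frame post call per qualifying observation, which is the per-observation overhead that the post above is concerned about.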

    • #3
      Roger, there might well be a way to do this with -frame put- but the manual entry seems to suggest this is rather suited to accumulating observations following repetitive tasks, like a simulation. I thought perhaps the frame prefix could be used, but -append- expects saved datasets. Then, the first page of the manual chapter Combining Datasets (chapter 23) suggests that frames can be used to join datasets horizontally using linking, as might be natural for certain types of joins, but again, no mention of vertical appending operations.

      I believe the key to combining them is still to save the frame datasets to disk, which makes -append- (and similar operations) usable. Consider the toy example below, which creates two frames, each containing a distinct subset of auto.dta, and then a third frame to combine them. Note that operations can be done on a frame without switching to it, but the approach does require saving each subset.

      Code:
      clear *
      cls
      
      tempname one two combined
      
      frame create `one'
      frame change `one'
      sysuse auto
      keep if foreign
      save `one'
      
      frame create `two'
      frame change `two'
      sysuse auto
      keep if !foreign
      save `two'
      
      frame create `combined'
      frame `combined': use `one'
      frame `combined': append using `two'
      frame change `combined'
      describe
      tab foreign
      I suppose that this is a workable solution that is more efficient than -frame put-, but I think it would be nice if the operations could work with the datasets in memory; otherwise this just feels to me like a clunkier way of using -append-.

      • #4
        Thanks to Leonardo for suggesting append. As the point of frames is to avoid file i/o, I would prefer a frame append or something similar. (Perhaps this will be the new top item on my personal wish list, now that we finally have the frames that I have longed for for so long.)

        • #5
          PS another possible constructive suggestion. The frame put command would be even better if it could have a varlist AND if/in qualifiers. As in:

          frame put foreign make mpg if mod(_n,2), into(newf2)

          which currently is invalid syntax. Whereas

          frame put foreign make mpg, into(newf2)

          and

          frame put if mod(_n,2), into(newf3)

          are both legal.
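
          Until such a combined syntax exists, the same result can be approximated in two steps. This is only a sketch built from the two legal forms above: it first copies the selected observations with all variables, then trims the variables inside the new frame.

          Code:
          * Sketch only: emulate "frame put varlist if ..." in two steps
          sysuse auto, clear
          * step 1: copy the odd-numbered observations (all variables)
          frame put if mod(_n,2), into(newf2)
          * step 2: keep only the wanted variables in the new frame
          frame newf2: keep foreign make mpg
          frame newf2: describe
          The cost is that step 1 temporarily copies every variable for the selected observations, which may matter for wide datasets.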


          • #6
            Hi, here's a quick & dirty example (kind of ugly and needs work) that shows one way to go about it using Mata. It appends 3 variables for 22 rows to the destination frame.

            Code:
            sysuse auto, clear
            frame create xxxxx
            frame xxxxx: {
                sysuse auto, clear
                keep mpg weight price foreign make
            }
            
            gen byte touse=(foreign==1)
            
            mata:
            cwf=st_framecurrent()
            st_view( src1=.,.,"mpg foreign","touse")
            st_sview(src2=.,.,"make","touse")
            srcnobs=rows(src1)
            st_framecurrent("xxxxx")
            dstnobs=st_nobs()
            st_addobs(srcnobs)
            st_view (dst1=.,(dstnobs+1,dstnobs+srcnobs),"mpg foreign")
            st_sview(dst2=.,(dstnobs+1,dstnobs+srcnobs),"make")
            dst1[.,.]=src1
            dst2[.,.]=src2
            st_framecurrent(cwf)
            end
            
            noi count
            frame xxxxx: {
                noi count
                noi list
            }
            And here is an (also ugly) solution without Mata using frlink and frval():

            Code:
            sysuse auto, clear
            frame create xxxxx
            frame xxxxx: {
                sysuse auto,clear
                keep mpg weight price foreign make
            }
            gen byte touse=(foreign==1)
            bysort touse: gen byte id=_n
            count if touse
            local srcnobs=r(N)
            
            frame xxxxx: {
                local N=_N
                gen byte touse=0
                set obs `=_N+`srcnobs''
                replace touse=1 if mi(touse)
                bysort touse: gen byte id=_n
                frlink 1:1 touse id, frame(default)
                foreach v in mpg foreign make {
                    replace `v'=frval(default,`v') if touse
                }
                drop touse id default
            }
            
            noi count
            frame xxxxx: {
                noi count
                noi list
            }

            • #7
              Here is a quick and dirty command to append data from one frame to the current frame. It makes a few assumptions:

              1) Variables in the two frames are named the same -- at least the ones in the frame from which you wish to append the data

              2) You do not wish to append any variables in the 'from' frame that do not already exist in the current frame.

              3) Variable types are compatible. It does no checking on them. That said, it uses replace, so will promote types in the current frame if the data being appended needs more precision.

              4) You want all observations from the 'from' frame. I didn't add anything for if or in.

              It does minimal error-checking. It just checks that the 'from' frame exists and that the specified variables exist in both the current frame and the 'from' frame.

              Code:
              program myfrappend
                  version 16
              
                  syntax varlist, from(string)
              
                  confirm frame `from'
              
                  foreach var of varlist `varlist' {
                      confirm var `var'
                      frame `from' : confirm var `var'
                  }
              
                  frame `from': local obstoadd = _N
              
                  local startn = _N+1
                  set obs `=_N+`obstoadd''
              
                  foreach var of varlist `varlist' {
                      replace `var' = _frval(`from',`var',_n-`startn'+1) in `startn'/L
                  }
              end
              Save the above into myfrappend.ado.

              Example (silly) usage:

              Code:
              sysuse auto
              frame create test
              frame test: sysuse auto
              frame test: keep in 1/10
              
              myfrappend _all, from(test)
              myfrappend mpg weight length, from(test)

              • #8
                Many thanks to Brian and Alan for these basic solutions. Between them, they probably illustrate enough for me to be able to frameify the parmby module. For the component by-group resultssets concatenated by parmby, Alan's simplifying assumptions should apply.
                All the best
                Roger

                • #9
                  In reply to Roger Newson's suggestion in #5 that frame put should accept a varlist together with if/in qualifiers:

                  The other big problem with frame put in this use case is that you can't put observations into an existing frame. So it works to subset your data (either by variable or observation) but not to combine datasets.

                  • #10
                    Append frames in Stata.

                    I have struggled to find a satisfying answer, so I succumbed to tempfiles:

                    Code:
                    cap program drop frame_merge
                    
                    program define frame_merge
                        version 16.1
                        syntax, using(string)
                        confirm frame `using'
                        //store starting frame
                        local curframe = c(frame)
                        //switch to user input frame
                        cwf `using'
                        tempfile mergethis
                        //return to the year 1995
                        save `mergethis'
                        cwf `curframe'
                        append using `mergethis'
                    end
                    Example:

                    Code:
                    sysuse auto
                    frame create test
                    frame test:sysuse auto
                    frame test:keep in 1/10
                    
                    frame_merge, using(test)
                    You'll end up in the default frame with the two frames appended with the test frame at the end of the dataset.

                    • #11
                      I believe appending data frames is solved in Stata with the -frameappend- module.

                      https://ideas.repec.org/c/boc/bocode/s458685.html

                      https://econpapers.repec.org/softwar...de/s458685.htm
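
                      A minimal usage sketch, reusing the example data from the earlier replies. It assumes the module installs from SSC under the name frameappend and that its basic syntax takes the name of the frame to append to the current frame; check help frameappend after installing, since neither detail is confirmed in this thread.

                      Code:
                      * Sketch only: install step and syntax are assumptions, see above
                      ssc install frameappend
                      
                      sysuse auto, clear
                      frame create test
                      frame test: sysuse auto
                      frame test: keep in 1/10
                      
                      * append the observations in frame test to the current frame
                      frameappend test
                      count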

                      • #12
                        I have found that appending large data frames (>100,000 observations, >1,000 variables) with the existing user-written commands (frameappend, xframeappend) is about 20 times slower than a simple append using ... command. I just wrote the command fframeappend (https://github.com/JuergenWiemers/fframeappend), which generalizes Brian Landy's Mata sketch from above (reply #6). My command is almost as fast as append (using an SSD (~500MB/s) for data storage). So if you append large data frames regularly, you might want to try fframeappend.
                        Last edited by Juergen Wiemers; 27 Mar 2022, 12:02.

                        • #13
                          Thanks Juergen Wiemers for this great command! This is just to let you know that the command apparently doesn't tolerate appending observations to an empty dataset. The appending procedure seems successful, but it issues a 198 error.
                          Code:
                          webuse lifeexp
                          frame create newframe
                          frame change newframe
                          fframeappend, using(default)
                          (If anyone wonders why I need to do so: I want to subsequently use the set maxvar command)
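
                          One possible way to sidestep the empty-frame case is to copy the source frame instead of appending to an empty one. This is only a sketch using the built-in frame copy command; whether it suits the subsequent set maxvar step is not established here. Note that newframe must not already exist, unless the replace option is added.

                          Code:
                          * Sketch only: get the same data into newframe without appending
                          webuse lifeexp
                          * frame copy creates newframe as a copy of default
                          frame copy default newframe
                          frame change newframe
                          count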
