Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Export extract, alternatives to preserve -> keep -> order -> export -> restore ?

    I often need to export extracts of the dataset in memory at the different stages of my workflow, e.g. first without regression results, then with regression results, before a transformation, after a transformation, etc.

    To this aim, I end up including code such as the following, where I rely on the same syntax using preserve -> keep -> order -> export -> restore

    Code:
    preserve
    keep year state iso ///
         var1 var1a var2a var3a ///
         x_a1 x_a2 x_a3 b_a1 b_a4
    keep if state =="NE" & var1!=.
    order year state iso ///
         var1 var1a var2a var3a ///
         x_a1 x_a2 x_a3 b_a1 b_a4
    export excel using "extract1.xlsx", firstrow(variables) replace
    restore
    Is there a better, more efficient and/or more elegant way to go about this?

  • #2
    What you've described seems like a reasonable approach.
    Do you have something particular in mind when you say "more efficient/or more elegant"?

    Comment


    • #3
      The Syntax section in the output of help export excel tells us
      Code:
       Save subset of variables in memory to an Excel file
      
              export excel [varlist] using filename [if] [in] [, export_excel_options]
      If when a varlist is specified the variables are output in the order given, then it seems to me you should be able to reproduce the results of the code in post #1 with
      Code:
      export excel ///
           year state iso ///
           var1 var1a var2a var3a ///
           x_a1 x_a2 x_a3 b_a1 b_a4 ///
           using "extract1.xlsx" ///
           if state =="NE" & var1!=. ///
           , firstrow(variables) replace

      Comment


      • #4
        If what Ms. DelPiero is describing is code that recurs often in her workflow, but in different contexts and different applications, then I don't see any material way to improve upon it.

        However, if what she means is that a single file of code has loops over year state and iso that break up a data set into separate worksheets, then it can be done like this:

        Code:
        capture program drop one_batch
        program define one_batch
            local y = year[1]
            local s = state[1]
            local i = iso[1]
            export excel using extract.xlsx, ///
                sheet(`y'_`s'_`i', replace) firstrow(variables)
            exit
         end
         
         keep year state iso var1 var1a var2a var3a x_a1 x_a2 x_a3 b_a1 b_a4
         keep if !missing(var1)
         runby one_batch, by(year state iso) status
        This will take an entire data set and split it up into segments defined by combinations of year, state, and iso, and export each segment into a separate worksheet within a single excel file. It will be much faster than doing the same thing with loops over year, state, and iso if the data set is large. In fact, it is possible that it will be so fast that the operating system's write buffers will overflow: it might be necessary to slow it down by putting a -sleep- command after the -export excel- command so the OS has time to catch up.

        This code requires the -runby- program, written by Robert Picard and me, available from SSC.

        Comment

        Working...
        X