  • Running multiple do-files simultaneously

    I am using Stata 14 MP (64-bit) and would like to run multiple do-files simultaneously. Each do-file refers to a specific dataset. I am aware of -parallel- by George Vega Yon and tried the following code proposed in this old post.

    Code:
    clear all
    set more off
    set trace off
    parallel setclusters 3
    program def myprogram
      if ($pll_instance == 1) do  "do_file1.do"
      else if ($pll_instance == 2) do "do_file2.do"
      else if ($pll_instance == 3) do "do_file3.do"
    end
    parallel, nodata prog(myprogram): myprogram
    However, I get this error message:

    Code:
    invalid 'business'
                     stata():  3598  Stata returned error
    parallel_export_programs():     -  function returned error
         parallel_write_do():     -  function returned error
                     <istmt>:     -  function returned error
    r(3598);
    
    end of do-file
    
    r(3598);
    Your help would be much appreciated.
    Last edited by Giovanni Palmioli; 24 Jul 2017, 05:11.

  • #2
    You didn't get a quick answer. This probably suggests that most of the active folks don't use parallel - indeed I didn't know it existed (which doesn't mean much - there are many Stata routines I don't know about). They also cannot really solve your problem since it probably lies in one of your do files. You can manually open multiple Stata sessions and run the do files that way if you can't get parallel to work.

    The error suggests that somewhere in running this it hit something with "business" as a value. It is possible the problem is in parallel.ado, but that seems unlikely since business doesn't sound like the kind of variable name or value such a program would use. However, you can easily open parallel.ado in the editor and search for "business".

    But, I'd look at your do files first. First, do your do files include something like business - a macro, a value in a variable, etc.? Do the do files run appropriately if you run them one at a time without parallel? If they run individually without parallel, then try running one at a time under the parallel operator, then two at a time, etc.
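
    A minimal sketch of that first check, assuming the three do-file names from the original post and that they sit in the current working directory. It runs each file serially and reports its return code, so a failure can be pinned on one file before parallel is involved:

    ```stata
    // Run each do-file on its own; -capture noisily- still shows the output
    // but lets the loop continue if one file stops with an error.
    foreach f in do_file1 do_file2 do_file3 {
        capture noisily do "`f'.do"
        // _rc is 0 if the file ran cleanly, nonzero otherwise
        display as text "`f'.do finished with return code " as result _rc
    }
    ```

    If all three report a return code of 0 here, the do-files themselves are probably not the source of the error.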

    You can use set trace on to have Stata echo each command including how it is interpreting each macro. It creates piles of output, but can help diagnose this kind of problem.
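
    For example, a short sketch of tracing a single do-file (the file name is taken from the original post; the depth of 2 is an arbitrary choice to keep the output manageable):

    ```stata
    // Limit tracing to two levels of nested programs so the log stays readable
    set tracedepth 2
    set trace on
    do "do_file1.do"
    // Switch tracing off again once the problem command has been located
    set trace off
    ```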


    • #3
      Hi Giovanni,

      I'm not sure what the problem could be here. Have you tried updating parallel? Brian has pushed a bunch of updates recently, and parallel is now at version 1.19.0. You can update it by following the instructions here: https://github.com/gvegayon/parallel...n-latestmaster

      Also, try typing
      Code:
      parallel viewlog
      after running your code to see what the error looks like within the instance.

      If that doesn't help, you can also try other methods for running multiple do-files. Here are a couple of new examples of how to run multiple do-files in parallel using parallel.

      HIH

      Given a multiple_do_files.do with the following contents:

      Code:
      // The first instance runs dofile0 and dofile1
      if ($pll_instance == 1) {
      
          do ~/Desktop/dofile0.do
          do ~/Desktop/dofile1.do
          
      }
      else if ($pll_instance == 2) {
      
      // The second instance runs dofile2 to dofile4
          do ~/Desktop/dofile2.do
          do ~/Desktop/dofile3.do
          do ~/Desktop/dofile4.do
          
      }
      Code:
      clear all
      set trace off
      set more off
      
      // Alternative 1 ---------------------------------------------------------------
      // Create a program that calls the do-file depending on the Stata instance
      // that is running (this is reflected in $pll_instance, which goes from 1 to
      // $PLL_CLUSTERS; in this case, 2).
      cap program drop multipledo
      program def multipledo
          do ~/Desktop/dofile$pll_instance`'.do
      end
      
      // Setting parallel with 2 clusters
      parallel setclusters 2
      
      // The nodata option asks parallel not to copy the data loaded in the master
      // (current) session
      parallel, nodata prog(multipledo): multipledo
      
      // Alternative 2 ---------------------------------------------------------------
      // Do the same, but with an if/else statement. This is most useful
      // if you wish to, for example, run more than a single do-file per instance
      // (see the commented code below)
      cap program drop multipledo2
      program def multipledo2
          // The first instance runs dofile0 and dofile1
          if ($pll_instance == 1) {
          
              // do ~/Desktop/dofile0.do
              do ~/Desktop/dofile1.do
              
          }
          else if ($pll_instance == 2) {
          
          // The second instance runs dofile2 to dofile4
              do ~/Desktop/dofile2.do
              /*do ~/Desktop/dofile3.do
              do ~/Desktop/dofile4.do*/
              
          }
      end
      
      parallel, nodata prog(multipledo2): multipledo2
      
      // Alternative 3 ---------------------------------------------------------------
      // Similar to the program, you can write a single dofile that does all the work
      // and distributes the dofiles per call.
      parallel do multiple_do_files.do, nodata
      Last edited by George Vega; 31 Jul 2017, 12:11.


      • #4
        Although this is an older thread, I had the same issue as Giovanni in using this command. However, I also figured out why it did not work (and gave a strange error code, in my case the word 'the').

        The problem is that parallel does not accept directory paths with spaces in the folder names. For example, setting the working directory to "My STATA project" gives an error, but after renaming the folder to a single word, like "My_STATA_folder", the program runs just fine.
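
        A minimal sketch of a check to run before calling parallel, assuming only the working directory matters (which is all that was observed here; other paths parallel uses are not checked):

        ```stata
        // c(pwd) holds the current working directory
        if strpos("`c(pwd)'", " ") > 0 {
            display as error "Working directory contains a space: `c(pwd)'"
            display as error "Move or rename the project folder before calling parallel."
        }
        else {
            display as text "No spaces in working directory: `c(pwd)'"
        }
        ```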


        • #5
          Originally posted by Johan Karlsson
          Although this is an older thread, I had the same issue as Giovanni in using this command. However, I also figured out why it did not work (and gave a strange error code, in my case the word 'the').

          The problem is that parallel does not accept directory paths with spaces in the folder names. For example, setting the working directory to "My STATA project" gives an error, but after renaming the folder to a single word, like "My_STATA_folder", the program runs just fine.
          Many thanks! This solved my issue, too, as I had put everything in a Google Drive folder, where all paths start with "My Drive". The error message I got was:

          Code:
          invalid 'Drive'
