  • Multiple Instances of Stata - parallel

    I need to run many do-files concurrently. My PC has 28 cores and 128 GB of RAM. For whatever reason, Stata never uses all the cores, irrespective of the task (I have tried different types of tasks, datasets, etc.). It always uses at most 12 cores. My Stata license is Stata/MP for 64 cores. I do not understand why Stata underuses the CPU.

    Any help? Maybe JanDitzen or Alan Riley (StataCorp)?

  • #2
    Dear Bladimir,

    This is a rather complicated technical issue that revolves around the configuration of your computer, i.e. the design of your CPU, and the Stata license that you have.
    Let me explain this with my own case. I use a fairly new Win11 system and a Stata/MP 12-core license.
    My system has an Intel 13th Gen i9-13900K CPU, which is at the upper end of what money can buy for personal computing.
    So I expected Stata to be spinning like mad on 12 cores, given that my CPU has well over 12 cores.
    But this is not the case. I cannot get 12 cores to run Stata code, only 8.
    This is because the CPU has (only) 8 so-called Performance-cores, and Stata uses none of the 16 so-called Efficient-cores.
    (Tech. support explained that to me.)

    So you have to check the specifications of your 28-core CPU.
    My guess is that it has only 12 of these so-called Performance-cores and that the other 16 are so-called Efficient-cores.
    'Google' explains: Simpler in structure compared to P cores, E cores consume less power and produce less heat. This design ensures that the CPU can function efficiently for many everyday tasks without drawing unnecessary power or generating excessive heat.

    The implication is that you have to consider the number of Performance-cores when investing in a CPU and in the Stata/MP license to run on it.
    http://publicationslist.org/eric.melse



    • #3
      Very interesting. I wonder if this is an Intel thing? I checked a popular online shop and, indeed, the Intel CPU is listed there with "performance" cores. For AMD Ryzen chips, I have not seen this. If money is not an issue, the AMD Ryzen Threadripper 3970X apparently offers 32 "real" cores.
      Best wishes

      (Stata 16.1 MP)



      • #4
        Thanks, ericmelse! That is exactly the problem. My PC has 8 P-cores + 12 E-cores, and 28 threads due to hyper-threading. Is there some way for Stata to use all cores?



        • #5
          Here's how I usually do this on our Linux server with 80 cores, but this approach also works on Windows machines with small adjustments.


          First I prepare a master do-file that calls the individual instances of Stata:
          Code:
          *-----------------------------------------------------------------------------------------*
          *Prepare file to work with
          *-----------------------------------------------------------------------------------------*
          webuse auto, clear
          save auto, replace
          
          *Add what you want to do here
          
          clear
          set obs `1' /* depending on the number of cores you have */
          gen i_order = _n
          
          gen beta = . /* just an example */
          
          
          *----------------------------------------------------------------------------------------*
          *Start `1' instances of Stata
          *----------------------------------------------------------------------------------------*
          
          gen random = runiform()
          egen group = cut(random), group(`1')
          
          
          
          replace group = group +1
          
          sum group
          local min=r(min)
          local max=r(max)
          
          forvalues g=`min'/`max' {
              preserve
              keep if group==`g'
              save "group_`g'", replace
              *Copy the auto.dta file
              copy auto.dta auto_`g'.dta, replace
              winexec stata-mp -q do calc_whatever.do `g'  /* <--- this will start the individual do files */
              restore
          }
          
          clear
          *----------------------------------------------------------------------------------------*
          *Let the master wait until everything is finished
          *----------------------------------------------------------------------------------------*
          
          clear
           forvalues g=`min'/`max' {  
             capture confirm file finished_`g'.dta
             while _rc != 0 {
                sleep 2000
                capture confirm file finished_`g'.dta
             }
           }
          
          forvalues g=`min'/`max' {
              capture erase finished_`g'.dta
          }
          
          *----------------------------------------------------------------------------------------*
          *Combine the results & clean up
          *----------------------------------------------------------------------------------------*
          
          use group_`min', clear
          drop in 1/l
          
          forvalues g=`min'/`max' {
              append using group_`g'
              cap erase group_`g'.dta
              cap erase auto_`g'.dta
          }
          
          drop group random
          
          sort i_order
          
          *Done
          save  whatever, replace
          exit
          This do-file will start as many individual instances of Stata as you like.

          The actual calculations are done in the following do-file, called 'calc_whatever.do'.
          Code:
          local g `1' /* <-- this is the group number */
          
          use group_`g', clear
          
          
          *Just an example
          use auto_`g', clear
          
          set seed `g'
          gen random = runiform()
          
          reg price random
          
          *Collect results
          local beta = e(b)[1,1]
          
          *Store results
          use group_`g', clear
          replace beta = `beta' in 1
          save group_`g', replace
          
          *----------------------------------------------------------------------------------------*
          *When finished: generate finished file and exit
          *----------------------------------------------------------------------------------------*
          
          clear
          set obs 1
          gen v=1
          save finished_`g', replace
          
          clear         
          exit, STATA
          exit
          When an individual instance finishes, it saves a dta file called 'finished_`g'.dta' to disk, where `g' is the corresponding group number.
          Once all such files exist, the master do-file collects the prepared results and cleans up the temporary files.
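
          A minimal sketch of the same waiting step with a timeout, so that the master does not wait forever if one instance crashes before writing its 'finished' file. The locals max_wait_ms and waited_ms, the one-hour limit, and the example values of min and max are assumptions for illustration; in the master do-file, `min' and `max' are already set.

          ```stata
          * Sketch only: wait for the sentinel files, but give up after max_wait_ms
          local min 1                       /* hypothetical; already defined in the master do-file */
          local max 4                       /* hypothetical; already defined in the master do-file */
          local max_wait_ms = 60*60*1000    /* hypothetical limit: one hour */
          local waited_ms   = 0

          forvalues g = `min'/`max' {
              capture confirm file "finished_`g'.dta"
              while _rc != 0 & `waited_ms' < `max_wait_ms' {
                  sleep 2000
                  local waited_ms = `waited_ms' + 2000
                  capture confirm file "finished_`g'.dta"
              }
              * _rc still holds the result of the last capture
              if _rc != 0 {
                  display as error "instance `g' did not finish within the time limit"
                  exit 601
              }
          }
          ```

          The time limit is shared across all groups, so the total wait never exceeds max_wait_ms.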

          So on our machine, I would start the process with

          Code:
          do call_whatever 60
          This would then start 60 instances of Stata in parallel.

          HTH

          Ali

          PS: There's also a package by Vega Yon & Quistorff (parallel) that I have never tried but that, I think, is based on similar logic.



          • #6
            PPS: By adding
            Code:
            set processors `n'
            at the beginning of 'calc_whatever.do', where `n' is a number of your choice, you can control the number of processors or cores Stata/MP uses.
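
            A minimal sketch of one way to choose `n', assuming Stata/MP and assuming you want the parallel instances to share the usable cores evenly rather than compete for them. The local n_instances and its value of 20 are hypothetical; c(processors_max) and c(processors) are Stata's own system values.

            ```stata
            * Sketch only: split the usable cores evenly across the parallel instances
            local n_instances = 20    /* hypothetical: number of Stata instances launched */
            local n = max(1, floor(c(processors_max) / `n_instances'))
            set processors `n'
            display "this instance uses " c(processors) " core(s)"
            ```

            With more instances than usable cores, this falls back to one core per instance.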
            Last edited by Alexander Koplenig; 02 Jul 2024, 09:49.



            • #7
              Hi Alexander Koplenig. I already do something similar. But in practice, even when I run 20 Stata instances as you do, Stata only uses 12 cores and not 20.



              • #8
                I have found the same problem with many programs. The (preliminary) answer is to run them in admin mode; that way they use all cores. I guess there is a better way to force the Windows scheduler, if someone can find it...

                BTW, this happens on Intel CPUs because of the heterogeneous core architecture (similar to ARM's big.LITTLE), which requires the Windows scheduler to decide which cores to use. This can also happen on AMD CPUs that have heterogeneous cores, like those with 3D V-Cache or Zen 4c cores.

                BTW2: Apparently this can be set in Task Manager by setting the affinity of Stata to all cores.

                " Set Processor Affinity for Stata:

                1. Open Task Manager:
                  • Press Ctrl + Shift + Esc to open the Task Manager.
                2. Find Stata:
                  • In the Task Manager, go to the “Processes” tab.
                  • Locate Stata in the list of processes.
                3. Set Affinity:
                  • Right-click on the Stata process.
                  • Select “Go to details” (this will take you to the “Details” tab and highlight the Stata process).
                  • Right-click on the highlighted Stata process in the “Details” tab.
                  • Select “Set affinity.”
                4. Select All Cores:
                  • In the Processor Affinity window, check all the cores (or the ones you want Stata to use).
                  • Click “OK” to apply the changes.
                "
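
                A minimal, untested sketch of how the same affinity could be set when the instances are launched from a master do-file on Windows, instead of clicking through Task Manager for each one. It relies on the /affinity option of the Windows "start" command, which takes a hexadecimal mask. The install path, the mask, and the group number are assumptions for illustration, and whether this makes Windows schedule Stata on the Efficient-cores is not confirmed here.

                ```stata
                * Sketch only (Windows): start one batch-mode instance with an explicit affinity mask
                local stata_exe "C:\Program Files\Stata18\StataMP-64.exe"   /* hypothetical install path */
                local mask      "FFFFFFF"   /* hex mask with 28 bits set = logical processors 0-27 */
                local g         1           /* group number passed to calc_whatever.do */

                winexec cmd /c start "" /affinity `mask' "`stata_exe'" /e do calc_whatever.do `g'
                ```

                The empty "" is the window title that "start" expects before a quoted program path; /e runs Stata in batch mode.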
                Last edited by alejoforero; 02 Jul 2024, 15:31.



                • #9
                  Bladimir Carrillo, what is the output from

                  Code:
                  set processors
                  For example, on my Stata 18 MP, the output is:

                  Code:
                  . set processors
                      The maximum number of processors or cores being used is 4.  It can be set to any number between 1 and 4.
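
                  A minimal sketch for seeing where a limit comes from, assuming Stata/MP: the c() system values below separate what is currently in use, what the license allows, and what Stata detects on the machine.

                  ```stata
                  * Sketch only: inspect the processor counts Stata/MP reports
                  display "cores in use now:       " c(processors)
                  display "cores licensed:         " c(processors_lic)
                  display "cores on this machine:  " c(processors_mach)
                  display "maximum usable cores:   " c(processors_max)
                  ```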



                  • #10
                    Hi alejoforero. Fantastic suggestion! I will try that!

                    Hi Hua Peng (StataCorp), the output of set processors is "The maximum number of processors or cores being used is 1. It can be set to any number between 1 and 28." When I set it to any number >1, Stata simply shuts down, for some reason I do not understand.



                    • #11
                      "When I set it to any number >1, stata simply shut down for some reason I do not understand": it seems Stata/MP is not configured correctly on your machine. Please contact tech support at [email protected] so we can figure out what the issue is.
