
  • How to make Stata run faster?

    Hi,

    I am working with datasets that are not that large (~3 GB), and Stata takes a long time to run various minor operations. Besides adding RAM, getting a faster processor and being careful with Stata syntax, are there any ways/tips to make Stata run faster? Thanks!

  • #2
    Maybe get an SSD. A lot of operations depend on tempsaves + use. Besides that, it depends on the specifics of your operations. Some stuff is indeed slow on Stata's part. For instance, see the graphs at the bottom of:

    https://github.com/matthieugomez/benchmark-stata-r


    • #3
      Consider using sub samples. Especially when testing code.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 18.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam


      • #4
        Being careful with syntax or algorithmic choice is not always as straightforward as it looks in Stata. First, I'd suggest you give us some detail about your problem (e.g., commands or code fragments that are very slow for you) so that we might see if we have any concrete suggestions in that domain. I have seen simple tasks that are quite slow on large data sets, but which are amenable to substantial improvement with a bit of Mata code or other change.

        Second, I would point you to Joe Canner's presentation on "Optimizing Stata for ... Large Data Sets", as it shows substantial improvements from simple syntax choices that most of us would not anticipate.

        Third: Is your task possibly something that could be divided and run in parallel, as would be true of some bootstrap or similar process with repetitions? My experience is that most such jobs can be run in about half the time using two instances of Stata, each running half of the reps, since two instances of Stata running simultaneously will often each take only about 10% more time than one running alone.
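To make the arithmetic behind this third point concrete, here is a minimal Python sketch. It assumes the repetitions are independent and that each simultaneous instance runs about 10% slower than it would alone, which is the figure quoted above. The function names and the 1000-rep, 10-hour job are hypothetical.

```python
def split_reps(total_reps, n_instances):
    """Divide total_reps as evenly as possible across n_instances."""
    base, extra = divmod(total_reps, n_instances)
    # The first `extra` instances take one additional repetition each.
    return [base + 1 if i < extra else base for i in range(n_instances)]


def estimated_wall_time(serial_hours, n_instances, slowdown=0.10):
    """Rough wall-clock estimate: each instance does 1/n of the work,
    but runs `slowdown` (assumed 10%) slower than it would alone."""
    return serial_hours / n_instances * (1 + slowdown)


# Hypothetical job: 1000 bootstrap reps that take 10 hours in one session.
chunks = split_reps(1000, 2)
print(chunks)                         # [500, 500]
print(estimated_wall_time(10.0, 2))   # roughly 5.5 hours
```

If the repetitions involve random draws, each instance would also need its own seed, otherwise both halves produce the same replicates.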

        I realize I'm somewhat slighting your restriction to solutions that don't involve syntax changes. Sorry, but I have the thought that there might be some possibilities.

        Regards, Mike


        • #5
          Which flavor of Stata are you using? Since upgrading to an MP8 license I would have difficulty moving back to a single processor version of the software myself.


          • #6
            I am using Stata/SE 13.1!


            • #7
              I use Stata/SE 13.1 too, and I support Mike's third point if what you're doing is processing-intensive (as opposed to a lot of reading and writing to disk): run Stata in parallel sessions, if possible. For instance, I tried a) letting one session of Stata loop over 10 levels of some variable (and doing something), and compared this to b) opening Stata 10 times and letting each session handle one level.
              In this particular experiment, the single Stata session used about 12% of the computer's processor capacity and took 6.5 hours, while the 10 parallel Stata sessions used 99-100% of the processor capacity (8 cores) and took less than 1.5 hours. With more cores I suppose the parallel sessions would have been even faster.
              During multiple parallel Stata sessions, don't expect to do much else on your computer, though.
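One way to avoid opening the ten sessions by hand is to launch them from a script. The Python sketch below is only an illustration under several assumptions: the Stata executable is assumed to be on the PATH as `stata-se` and to accept the Unix-style batch flag `-b`, and `process_level.do` is a hypothetical do-file that reads the level to handle from its first argument. The sketch prints the commands and does not start Stata unless `run_all` is called.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

STATA_EXE = "stata-se"         # assumption: Unix-style executable name on PATH
DO_FILE = "process_level.do"   # hypothetical do-file; reads the level as its first argument


def build_commands(levels, exe=STATA_EXE, do_file=DO_FILE):
    """Build one batch-mode command line per level of the variable."""
    return [[exe, "-b", "do", do_file, str(level)] for level in levels]


def run_all(commands, max_workers):
    """Run the commands with at most max_workers sessions at a time
    and return their exit codes in order. Requires Stata to be installed."""
    def run_one(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


# One command per level, 1 through 10, as in the experiment described above.
commands = build_commands(range(1, 11))
for cmd in commands:
    print(" ".join(cmd))

# To actually launch the sessions, e.g. eight at a time on an 8-core machine:
# exit_codes = run_all(commands, max_workers=8)
```

Capping `max_workers` at or below the number of cores leaves room to keep using the computer. Batch sessions started from the same directory may write to the same log file, so giving each session its own working directory is probably safer.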
