  • Is Mata at par with C in terms of speed?

    Since there is no discussion on this topic, I am posting this question here. Is Mata at par with C in terms of speed? If not, what might cause the difference? Can somebody share their experience?

  • #2
    I cannot recall the source of this info (perhaps The Mata Book) but I vaguely remember that Mata is about 4 times slower than C++.

    Edit: I skimmed through The Mata Book and could not find that statement. Perhaps others can comment. Obviously, the speed differences will depend somewhat on the specific task.
    Last edited by daniel klein; 04 Dec 2021, 04:12.


    • #3
      It is not clear to me that this question is answerable at such a broad level.

      I suppose if you wrote simple code in Mata and C++ that utilized features common to both languages, and ran a comparison on a single-processor system, you would find that as an interpreted language Mata would be slower.

      But if you want to do something significant, like multiply two matrices, Mata has that built into the language. With C++ you're going to want to find and use a library of matrix routines, so the answer depends on the choice of library. Using Stata/MP, the matrix multiplication will automatically be parallelized across as many cores as your Stata/MP license supports, so the answer depends on your choice of Stata license.
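
      To make the library point concrete, here is a minimal sketch in C of the triple loop one would otherwise write by hand. The function name `matmul_naive` and the row-major layout are assumptions for illustration, not anything taken from Mata or from a particular library; a tuned BLAS routine would typically be much faster than this.

```c
#include <stddef.h>

/* Hypothetical helper: C = A * B for row-major matrices.
   A is m x k, B is k x n, C is m x n. Plain triple loop:
   no blocking, no threads, no library calls. */
void matmul_naive(const double *A, const double *B, double *C,
                  size_t m, size_t k, size_t n)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = 0.0;
            for (size_t p = 0; p < k; p++) {
                acc += A[i * k + p] * B[p * n + j];
            }
            C[i * n + j] = acc;
        }
    }
}
```

      In Mata the same product is written as `A*B`, so a fair speed comparison has to say which C or C++ implementation it is being compared against.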


      • #4
        Thanks for your answers. Dear William, you said Mata is an interpreted language; isn't it compiled?


        • #5
          It is compiled into "object code", yes, but that object code is not machine code like that in an executable produced by C++, specific to a particular hardware architecture. The object code goes through a (more efficient) interpretive process when it is run by Mata.
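
          A toy illustration of what an "interpretive process" means, as a minimal sketch in C. The opcodes and the `run` function are invented here and say nothing about Mata's actual object code format; the point is only that every operation passes through a fetch-and-dispatch step that directly compiled machine code does not need.

```c
#include <stddef.h>

/* Hypothetical instruction set for a tiny stack machine. */
enum { OP_PUSH, OP_ADD, OP_DIV, OP_HALT };

typedef struct {
    int    op;   /* one of the OP_* codes */
    double arg;  /* operand, used only by OP_PUSH */
} Instr;

/* Runs a program and returns the value left on top of the stack.
   Assumes a well-formed program that ends with OP_HALT and never
   needs more than 16 stack slots. */
double run(const Instr *prog)
{
    double stack[16];
    size_t sp = 0;
    for (size_t pc = 0; ; pc++) {
        switch (prog[pc].op) {   /* dispatch cost is paid per operation */
        case OP_PUSH: stack[sp++] = prog[pc].arg; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_DIV:  sp--; stack[sp - 1] /= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}
```

          Averaging five numbers on this toy machine takes twelve instructions (five pushes, four adds, a push of the count, a divide, and a halt), and each one costs a loop iteration and a `switch` on top of the arithmetic itself.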


          • #6
            I find these competitions a bit irrational.
            C might be faster when programmed correctly. However, becoming an experienced C programmer takes quite a while.
            Until one reaches the point where one's C programs are faster, features like garbage collection help programmers write better programs in, e.g., Mata.
            Kind regards

            nhb


            • #7
              I tested some code that finds the mean of a vector of 5 elements, repeating the calculation about 100 million times. The time difference is substantial. Here is the code in both Mata and C.

              Code:
              timer clear
              timer on 1
              mata
              a = 2,6,7,4,9
              n = 5

              for (z = 1; z < 100000000; z++) {
                  sum = 0
                  for (i = 1; i <= n; i++) {
                      sum = sum + a[i]
                  }
                  mean = sum / n
              }
              mean
              end

              timer off 1
              timer on 2
              cap prog drop cmean

              cmean
              timer off 2
              timer list
              Code:
              . timer list
                 1:     84.29 /        1 =      84.2890
                 2:      1.02 /        1 =       1.0160
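              C code:
              Code:
              #include "stplugin.h"
              #include <stdio.h>
              #include <string.h>

              /* Plugin entry point called by Stata; the arguments are unused here. */
              STDLL stata_call(int argc, char *argv[]) {
                  char  line[82];
                  float mean = 0;
                  int   sum, i, z;
                  int   n = 5;
                  int   a[] = {2, 6, 7, 4, 9};

                  /* Repeat the 5-element mean; like the Mata loop, this runs 99,999,999 times. */
                  for (z = 1; z < 100000000; z++) {
                      sum = 0;
                      for (i = 0; i < n; i++) {
                          sum += a[i];
                      }
                      mean = sum / (float)n;   /* integer sum, single-precision division */
                  }

                  /* Send the result to the Stata Results window. */
                  sprintf(line, "Mean = %f\n", mean);
                  SF_display(line);
                  return 0;
              }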
              Regards
              --------------------------------------------------
              Attaullah Shah, PhD.
              Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
              FinTechProfessor.com
              https://asdocx.com
              Check out my asdoc program, which sends outputs to MS Word.
              For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.
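
              For readers who want to time the C loop outside Stata, below is a minimal standalone sketch that uses only the standard library. The names `mean_loop` and `time_mean_loop` are made up, and the `volatile` qualifier is an addition that is not in the posted plugin: without it, an optimizing compiler may remove the repeated work entirely, which would make the timing meaningless. Because it bypasses the plugin interface, it does not include the cost of calling into Stata.

```c
#include <stdio.h>
#include <time.h>

/* Repeats the 5-element mean `reps` times and returns the last result.
   The accumulator is volatile so the compiler cannot discard the loop. */
float mean_loop(long reps)
{
    const int a[] = {2, 6, 7, 4, 9};
    const int n = 5;
    volatile int sum = 0;
    float mean = 0.0f;

    for (long z = 0; z < reps; z++) {
        sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a[i];
        }
        mean = sum / (float)n;
    }
    return mean;
}

/* Times one call with clock(), which measures CPU time, prints the
   result, and returns the elapsed seconds. */
double time_mean_loop(long reps)
{
    clock_t start = clock();
    float mean = mean_loop(reps);
    double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("Mean = %f, %.3f s for %ld repetitions\n", mean, seconds, reps);
    return seconds;
}
```

              Calling `time_mean_loop(100000000L)` from `main` uses roughly the repetition count discussed above; the timing will depend on the compiler, the optimization flags, and the machine.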


              • #8
                Attaullah Shah in post #7 confirms the first point I made in post #3: a comparison of equivalent-looking Mata and C code will show Mata at a speed disadvantage, in his example by a factor of more than 80.

                Below is an example demonstrating that replacing the naive Mata loop used in post #7 with a built-in matrix operation can make the Mata solution about 50 times faster when applied to a larger problem.

                This supports the second point I made in post #3, that any comparison of speed ultimately rests on the problem addressed and the particular code used, and thus on the skill of the programmer making the comparison, as is implicit in what Niels Henrik Bruun wrote in post #6.
                Code:
                clear all
                local reps 100000
                local vec = 499
                
                // naive loop
                timer on 1
                mata
                n = `vec'
                a = 1..n
                for (z = 1; z< `reps'; z++) {
                    sum = 0
                    for(i = 1; i <= n; i++) {
                           sum=sum + a[i]
                    }
                    mean =  sum / n
                }
                mean
                end
                timer off 1
                
                clear mata
                
                // lose the loop
                timer on 2
                mata
                n = `vec'
                a = 1..n
                o = J(n,1,1/n)
                for (z = 1; z< `reps'; z++) {
                    mean = a*o
                }
                mean
                end
                timer off 2
                
                timer list
                Code:
                . timer list
                   1:      5.46 /        1 =       5.4640
                   2:      0.11 /        1 =       0.1090
                Last edited by William Lisowski; 05 Dec 2021, 13:36.


                • #9
                  William Lisowski You were quick to find a quicker solution. Since the files were still open in my do editor, I plugged in your code and here are the results. For me, it is still a big difference.
                  Code:
                  . timer clear
                  
                  . timer on 1
                  
                  . mata
                  ----------------------------------- mata (type end to exit) -----------------------
                  : a = 2,6,7,4,9
                  
                  : n = 5
                  
                  : o = J(n,1,1/n)
                  
                  :    sum = 0;
                  
                  :          for (z = 1; z< 100000000; z++) {
                  >                 mean = a*o
                  >          }
                  
                  :          mean
                    5.6
                  
                  :          end
                  ----------------------------------------------------------------------------------
                  
                  .          
                  .          timer off 1
                  
                  .          timer on 2
                  
                  .          cap prog drop cmean
                  
                  .
                  .          cmean
                  Mean = 5.600000
                  
                  .          timer off 2
                  
                  .          timer list
                     1:     23.68 /        1 =      23.6780
                     2:      1.04 /        1 =       1.0380
                  
                  .          
                  .
                  .
                  end of do-file
                  Regards
                  --------------------------------------------------
                  Attaullah Shah, PhD.


                  • #10
                    Yes: with about 100 million passes, the average pass through the loop in Mata takes about 237 nanoseconds (23.68 seconds in total), while the average pass through the loop in C takes just about 10 nanoseconds (1.04 seconds in total).
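
                    The arithmetic behind the per-pass figures, as a small C sketch. The helper name `ns_per_pass` is made up; the inputs in the note below are the timer values from post #9 and the nominal 100 million repetitions.

```c
/* Converts a total elapsed time in seconds into nanoseconds per loop pass. */
double ns_per_pass(double total_seconds, double passes)
{
    return total_seconds * 1e9 / passes;
}
```

                    `ns_per_pass(23.68, 1e8)` gives about 237 nanoseconds for Mata, and `ns_per_pass(1.04, 1e8)` gives about 10.4 nanoseconds for C.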


                    • #11
                      Another surprising finding is that Mata's official mean() function is much slower still.
                      Code:
                      . timer clear
                      
                      . timer on 1
                      
                      . mata
                      -------------------------------- mata (type end to exit) ---------------------------
                      : a = 2\6\7\4\9
                      
                      : 
                      :          for (z = 1; z< 100000000; z++) {
                      >                 mean = mean(a)
                      >          }
                      
                      :          mean
                        5.6
                      
                      :          end
                      -------------------------------------------------------------------------------
                      
                      .          
                      .          timer off 1
                      
                      .          timer on 2
                      
                      .          cap prog drop cmean
                      
                      . 
                      .          cmean
                      Mean = 5.600000
                      
                      .          timer off 2 
                      
                      .          timer list
                         1:    211.16 /        1 =     211.1590
                         2:      1.01 /        1 =       1.0070
                      
                      .          
                      . 
                      . 
                      end of do-file
                      Regards
                      --------------------------------------------------
                      Attaullah Shah, PhD.


                      • #12
                        Not so surprising; the documentation tells us

                        "mean(X, w) returns the weighted-or-unweighted column means of data matrix X. mean() uses quad precision in forming sums and so is very accurate"

                        so Mata is doing something different than C.

                        And that is also the case in your earlier example. There Mata has stored a as real (double precision) numbers and is doing calculations in double precision, while C has stored a as integers and is presumably doing its calculations in integer arithmetic.
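
                        A minimal C sketch of that type difference, with made-up function names. `mean_as_posted` mirrors the arithmetic of the plugin in post #7 (integer sum, single-precision division), while `mean_in_double` keeps storage and arithmetic in double precision, which is closer to what Mata does with its real type. A like-for-like timing would use the second form.

```c
/* Integer accumulation, then division in single precision (as in post #7). */
float mean_as_posted(const int *a, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum / (float)n;
}

/* Double-precision storage and arithmetic throughout. */
double mean_in_double(const double *a, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum / n;
}
```

                        Both return 5.6 for the vector used in this thread, up to the precision of their return types; whether the double version is measurably slower in this loop depends on the compiler and the hardware.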


                        • #13
                          I do not deny that in the example made by Attaullah Shah in #9, C is faster.
                          However, in that example, things like garbage collection mean little.
                          And if the world were only about finding means, I would undoubtedly recommend C.

                          The same discussion has been made for years about Python and C.
                          I recall that even though Python code was slower, development time for serious code was much higher in C, and the C code was also more error-prone.

                          So before I start thinking about using C, I would consider the following questions:
                          • Do I need the extra speed?
                          • Do I want to learn a programming language with a steep learning curve to get to my solution?
                          • Is the Mata environment, with its built-in matrix functions for example, sufficient for me, so that I don't need to find and learn the necessary C libraries to do the job?
                          William Lisowski made a good point in #8, showing that getting better at Mata can be worth more than learning a second language.
                          On the other hand, if you are good at C already, that might be a good solution.
                          Kind regards

                          nhb


                          • #14
                            Thank you, William Lisowski and Niels Henrik Bruun, for your valuable input. I am a fan of Stata and Mata for their simplicity and power. Yet there may be situations where we need C, Python, or some other language, which is why StataCorp added support for integrating these languages into Stata.
                            Last edited by Attaullah Shah; 06 Dec 2021, 01:02.
                            Regards
                            --------------------------------------------------
                            Attaullah Shah, PhD.


                            • #15
                              Thank you all for your replies.
