  • How to make source code of my mata functions in mlib readable?

    Hi,

    I am writing Mata functions for a package I plan to publish, and I put them into an mlib. I want the functions' source code (not just the object code) to be available to anyone. However, the viewsource command results in an error ("file not found"), even though mata describe shows that the functions exist by name. Is there a solution to this, or can the source code only be shared in a separate ado file, without using an mlib? Any help would be much appreciated. See some code below.

    mata: mata mlib create lmylib, dir("`c(sysdir_plus)'l") replace
    mata:
    mata clear
    // Replace every element of x equal to y with z, column by column
    matrix myfunc(real matrix x, real scalar y, real scalar z)
    {
        real matrix xr
        real colvector missing_index
        real scalar c

        xr=x
        for (c=1; c<=cols(x); c++) {
            // Row numbers where column c equals y, 0 elsewhere
            missing_index=(x[.,c]:==y):*runningsum(J(rows(x),1,1))
            // Keep only the matching row numbers
            missing_index=select(missing_index,missing_index:> 0)
            xr[missing_index,c]=J(rows(missing_index),1,z)
        }
        return(xr)
    } // end of myfunc function
    mata mlib add lmylib myfunc()
    end
    mata: mata mlib index
    mata: mata describe using lmylib
    viewsource myfunc    // this is the line that fails with "file not found"

  • #2
    I have added the .do files that create the Mata functions as ancillary files. For example, if you look at ssc desc qenv, you'll see two ancillary files, qenv_mata9.do and qenv_mata10.do, that create the Mata functions. No one needs to run those files to get qenv to work; they just give the source used to create the Mata libraries.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
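
    A minimal sketch of one way to package such a source file, assuming the function definition is saved in a plain-text file (hypothetically named myfunc.mata here) that starts with mata: and finishes with end, and that is shipped with the package and installed on the adopath. viewsource displays a file it finds along the adopath, so it needs a file name with its extension rather than a function name. The script name build_lmylib.do and the version 13 line are also assumptions made for illustration.

    ```stata
    * build_lmylib.do (hypothetical name): compile the shipped source into lmylib
    version 13
    mata: mata clear
    findfile myfunc.mata                // hypothetical source file defining myfunc()
    do "`r(fn)'"
    mata: mata mlib create lmylib, dir("`c(sysdir_plus)'l") replace
    mata: mata mlib add lmylib myfunc()
    mata: mata mlib index

    * The source can then be read, provided myfunc.mata is on the adopath
    viewsource myfunc.mata
    ```

    With this layout the same file serves both as the readable source and as the input from which the library is built, so the two are less likely to drift apart.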



    • #3
      Depends on the author, but sometimes I am reluctant to use an ado if the Mata source code is not accessible.



      • #4
        Nick: Thanks. Sure, I want to make my source code accessible precisely because I want feedback and checks from other users, so that my code eventually becomes more credible and accepted. The question is how I can make the source code accessible if I use an object-coded mlib. Maarten's suggestion is basically to provide, in parallel, another independent, "inactive" do file with the function definitions. If there is no more direct way, I guess I'll have to do that.

        Maarten: Thanks.



        • #5
          Szabolcs Lorincz, the strategy I took with brewscheme was to provide the source directly to end users, with ado wrappers they could use to manually recompile the source into an mlib file. To abstract that process away from end users a bit, I ended up using a Java plugin that gets the modification date of the .mlib file (if it exists); if that date is earlier than the distribution date in the ado wrapper, the library is recompiled for users automatically. You wouldn't necessarily need to go that far, but an ado wrapper that builds the compiled library for them probably makes things easier for users. Assuming there are no issues with the Stata version end users are running, it is also easier to maintain (e.g., you wouldn't need to maintain separate version 13 and version 14 Mata libraries if the source compiles in either version without issue).



          • #6
            wbuchanan: Thanks, this is very useful. I was thinking about something similar, maybe not the Java plugin but something simpler. Say, my main ado file checks whether the mlib exists and, if not, runs the wrapper. But you may be right that this does not solve the problem of updating the command. So the modification date of the mlib file would indeed be useful. Is getting it only possible with the Java plugin? In any event, thanks, I'll check your submodule.
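
            The simpler check described above could be sketched as follows, assuming a command named mycmd and a build script named build_lmylib.do (both hypothetical) that creates lmylib and finishes with mata mlib index. findfile searches the adopath and returns a nonzero return code when the file is not found.

            ```stata
            *! mycmd.ado (hypothetical command name)
            program define mycmd
                version 13
                // Search the adopath for the compiled library
                capture findfile lmylib.mlib
                if _rc {
                    // Not found: locate the shipped build script (hypothetical name) and run it
                    findfile build_lmylib.do
                    quietly do "`r(fn)'"
                }
                // ... the rest of the command would call the Mata functions here
            end
            ```

            As noted in the post, an existence check like this only covers the first run; it does not detect a library that is older than an updated source file.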



            • #7
              Szabolcs Lorincz, the Java plugin can be found here. You could potentially do it by shelling out, piping the output to a file, and then parsing the file, but I'm not aware of a way that keeps everything in memory. The Java plugin is also nice because it can help you prevent issues with file permissions (e.g., for the case where the user cannot overwrite the file, the plugin also returns the file's writable property). I don't think it would be quite as important if your initial goal is to have the code vetted and get feedback, but once the code goes into production I think it helps users avoid having to update or compile things manually.



              • #8
                wbuchanan: Thanks. I have Stata 13 and cannot update (it's a corporate installation). The dir command displays the modification date but does not save anything in r(). The dirlist command does save it in a macro in r(), but it uses the shell command to invoke the operating system. This makes a small black console window flash up, which I think is rather unfortunate for, say, an estimation command. So I'll take your advice and first implement a simpler solution to share the code. Many thanks for the help.



                • #9
                  Szabolcs Lorincz, no reason to update: filesys.ado. As long as Java 8 is on the system, the .ado file just acts as a wrapper around javacall. You shouldn't notice any shells opening or closing with this approach, but it carries a small performance penalty when the JVM is not already running.



                  • #10
                    I may be stating the obvious here, but you do not have to compile your Mata code into a library in order to use it in a program you intend to distribute (see help m1_ado). You can put the Mata code in the ado file, and it will be compiled on the fly when the ado is loaded. At the clock speeds we are running today, I doubt anyone would produce so much Mata code for a single package that it would cause a perceptible performance lag. And even then, that would occur only once, the first time the ado is loaded.
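
                    A minimal sketch of that layout, assuming a command named mycmd with a helper function mycmd_count() (both hypothetical, as is the version 13 line). The Mata code follows the program definition in the same ado-file, so the source travels with the command and can be read with viewsource mycmd.ado.

                    ```stata
                    *! mycmd.ado (hypothetical name): ado-file with its Mata code included as source
                    program define mycmd
                        version 13
                        syntax varname(numeric), value(real)
                        mata: mycmd_count("`varlist'", `value')
                    end

                    version 13
                    mata:
                    // Report how many observations of the named variable equal y
                    void mycmd_count(string scalar varname, real scalar y)
                    {
                        real colvector x

                        x = st_data(., varname)
                        printf("%g observations equal %g\n", sum(x :== y), y)
                    }
                    end
                    ```

                    A call such as mycmd rep78, value(3) on the auto dataset would then run the Mata function without any mlib being involved.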

                    I would also point out that including the Mata source code is no guarantee to the user that it conforms to what was used to generate the library. The compiled Mata code could include malicious code or some phone home mechanism. I'm not suggesting that this is likely but it is not impossible. There is no Mata decompiler so it would be hard to scan for these.



                    • #11
                      Robert Picard, those are good points. I think providing the source and having it built on the user's system can help provide some of that assurance regarding malicious code. It also lets end users make modifications that may not be needed often enough for an author to include them in a more general release (or that they need in order to access a private member or extend a class that was previously declared final/private).
