  • #76
    Billy, again, I agree with what you are saying, and I myself called it "a very bad option". But it serves a very practical purpose. The github install command doesn't force installation by default. It's an option that users must specify and take responsibility for. It is enabling and dangerous at the same time. But I think that for the time being, given that there are over 200 Stata modules that include only ado-files and help files, and only 50 of them are installable, this option can be enabling. I will start the GitHub Wiki documentation to demonstrate how to make an installable package.
    ——————————————
    E. F. Haghish, IMBI, University of Freiburg
    [email protected]
    http://www.haghish.com/


    • #77
      haghish It may be helpful to end users, but at the expense of user-programmers not having their right to choose how their software is distributed respected. This is literally no different from the R world, where I presume this was inspired. Regardless of whether a package in R is distributed through CRAN or on GitHub, there are standards that are enforced to install a package. Stata has similar standards, which consist of generating a package file and a TOC file. Saying that user-programmers may not be aware of the requirement is not a reasonable basis on which to bypass standards agreed to by the community, and further shows that those individuals have not followed the guidance from Stata, the user community, and the help resources available online, which all suggest to RTFM (e.g.,
      Code:
       help usersite
      ). Having a very bad option is very different from purposefully limiting the ability of the user community to choose how and when to distribute their source code. The only "practical" purpose in this case is subverting the efforts of someone working to develop code to choose when and how that code will be distributed and the "work around" that this provides violates all common practices across programming languages that traditionally require community members to adhere to coding standards and practices.
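As a rough illustration of the standard described in this post, a repository would count as installable only when it ships both a TOC file and a package file. The Python sketch below assumes the conventional file names `stata.toc` and `*.pkg`; the function name `is_installable` and the sample file lists are hypothetical and are not taken from the github package.

```python
def is_installable(filenames):
    """Return True if the file list has a stata.toc and at least one .pkg file.

    `filenames` is a plain list of file names found at the repository root
    (an assumption made for this sketch).
    """
    names = [name.lower() for name in filenames]
    has_toc = "stata.toc" in names
    has_pkg = any(name.endswith(".pkg") for name in names)
    return has_toc and has_pkg


# Hypothetical repositories: one prepared for distribution, one not.
prepared = ["stata.toc", "mycmd.pkg", "mycmd.ado", "mycmd.sthlp"]
unprepared = ["mycmd.ado", "mycmd.sthlp", "README.md"]

print(is_installable(prepared))
print(is_installable(unprepared))
```

Under this reading, omitting the two files is how an author signals that the code is not yet meant for installation.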


      • #78
        Before I argue any further, please allow me to clarify a few points, which may help give a better understanding of the github package:
        1. The github search command shows only Stata packages that are installable. You need an additional option called all to search for repositories that are not installable. Therefore, when you search GitHub, by default you get only installable packages.
        2. The github search command returns a table that lets you click on the install hypertext and install the package with a click. This link does not appear for repositories that are not installable. Not at all. Users must type the install command and specify the force option.
        3. The search command ranks the results first by installability, and then by the hits score.
        I explained this because I felt some might get the impression that I am adding an option to just install anything from the internet. Well, that's not the case.
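The three points above can be sketched in Python as a filter followed by a two-key sort. This is only an illustration of the described behaviour, not the github package's actual code: the field names `installable` and `score`, the `show_all` parameter (standing in for the all option), and the assumption that a higher hits score ranks first are all hypothetical.

```python
def rank_results(repos, show_all=False):
    """Filter and order search results as described in points 1 to 3.

    Each repo is a dict with hypothetical keys "name", "installable", "score".
    """
    if not show_all:
        # Default search: only installable packages are listed.
        repos = [r for r in repos if r["installable"]]
    # Installable repositories first, then by descending hits score.
    return sorted(repos, key=lambda r: (not r["installable"], -r["score"]))


results = [
    {"name": "a/raw-ado", "installable": False, "score": 9.0},
    {"name": "b/pkg-low", "installable": True, "score": 2.5},
    {"name": "c/pkg-high", "installable": True, "score": 7.0},
]

print([r["name"] for r in rank_results(results)])
print([r["name"] for r in rank_results(results, show_all=True)])
```

Note that even with `show_all=True` a non-installable repository with a high score still sorts below every installable one.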

        ***

        Back to your comments, I have to say your argument is convincing, and I agree with most of it. But I am not so sure if I agree with the following quote.

        Originally posted by wbuchanan:
        but at the expense of user-programmers not having their right to choose how their software is distributed
        This is a very sensitive topic that can provoke many people (something I really wish to avoid). But because you mentioned it, we have to take a closer look at it to see whether the force option is really doing that or not. I believe your argument cannot be applied to the force option, and I will explain why. I will try to make myself understood without provoking anyone. Perhaps the safest approach is to reply with a question.
        • If you write an ado-file that runs a computation and put it in a public repository anywhere on the internet (not just github) and provide a "download" button for users to download the code (that includes a license in it, explaining the rights you preserve) from "that particular website" and grants the user the rights to "just use the program" - while presumably reserving all of the other rights - do you consider it a violation of your rights if somebody downloads your software? The force option is just a "download" made via Stata and is bound to the same rights.
        Again, it is a very sensitive topic, and simply saying "this violates programmer's rights" will provoke some people. Honestly, I was not happy that early in the discussion the arguments went out of control and beyond the purpose of the post, where I wanted to discuss the benefits of using GitHub. I hope that doesn't get repeated. Despite my solid belief that the force option does not violate any right whatsoever, if StataCorp says this option is "violating the authors' rights", I will not hesitate for a second to remove it.
        ——————————————
        E. F. Haghish, IMBI, University of Freiburg
        [email protected]
        http://www.haghish.com/


        • #79
          I think the golden thread here is that there is a difference between

          1. I wrote this program and make it public. Feel free to install it for your personal use. (It's still my program!)

          and

          2. This program is in the public domain and in practice nothing stops others redistributing it to third parties, but I as programmer don't approve or sanction that.

          As I see it,

          1 is wonderful.

          2 is where people get unhappy.


          • #80
            haghish I'll try to clarify further as you did and provide what may be - perhaps - a more salient example. I've been working with my staff on using version control as part of their regular work flow for a variety of reasons, but have used GitHub as a way to decentralize and distribute the workload a bit. For example, we started working on some Stata commands to standardize publicly released datasets on school performance (repository for cleaning Kentucky school accountability report cards). When we finalize things we fully intend for the software to be freely available for others to use and hope that it will happen, since it is particularly difficult to wrangle all of the files effectively without a systematic approach. However, we wouldn't want people installing anything from the master branch at the moment because we've still been fixing bugs and adding functionality in a separate branch of the repository which others may not be aware of. Allowing the end user to decide whether or not that source code is ready for them to install - regardless of whether or not they are told it is a bad idea - robs me of the opportunity to use a free service to collaborate with colleagues without worrying that someone will follow that bad idea and install things that are not ready for distribution. I do want things I work on to be used by others and for credit to be given where it is due, but I don't want naïve end-users installing incorrect versions of something that is a work in progress or from a branch where I am potentially working on some refactoring/feature addition that breaks that code base.

            Granted, nothing stops them from doing it now, but it would at least require an end user who understands how Git works and presumably would understand the implications of using code from a branch that is distinctly different from the master branch. So I guess I would add a third thread to what Nick Cox mentioned which is a hybrid of the two already listed (e.g., I want people to feel free to use things, but I want to be the person who makes the determination of whether or not something is ready for distribution, not a "work around" to following documented procedures).


            • #81
              Originally posted by Nick Cox:
              I think the golden thread here is that there is a difference between

              1. I wrote this program and make it public. Feel free to install it for your personal use. (It's still my program!)

              and

              2. This program is in the public domain and in practice nothing stops others redistributing it to third parties, but I as programmer don't approve or sanction that.

              As I see it,

              1 is wonderful.

              2 is where people get unhappy.
              Some thoughts/clarifications about this:

              "Free" can mean many things:
              1. Free to run a program without paying
              2. Free to access the code
              3. Free to redistribute verbatim copies (share the program with someone)
              4. Finally, free to distribute modified versions of the code.
              The typical licenses that allow these four freedoms are the GPL, MIT, and Apache licenses. Here the author still maintains copyright, but has given these rights away.

              Only at the very end of the spectrum, way beyond just "free", you will find public domain (AKA "unlicense" AKA creative commons 0).

              Since the Stata community (SSC, etc.) predates most of these licenses and most of the open source movement, most user-submitted packages do not state their license. For me that's a bit concerning, as by default a piece of code without a license means "all rights reserved", even the right to read or use the code.

              If you want to disallow either #4 or #3-4, you can either modify one of the standard licenses (risky), or pick one like this (see para 3).

              I personally do not have a take on what's the best license, and think that the authors can pick whatever license they want, or even keep the code to themselves if they desire. Still, even though most of my code is licensed with MIT, I haven't had any problem with users forking and redistributing the code (and yes, I would also be annoyed if someone finds an improvement and instead of submitting it to be included in the original code, they extend it and create confusion by having two competing versions).

              Of course, I've only written a couple of packages so YMMV.


              Cheers,
              S


              • #82
                Sergio: You make very good points. I guess that I won't start worrying about what licence/license I am implying for my packages until someone shows that they really need to know.

                But being on SSC de facto implies

                Free to run a program without paying
                Free to access the code

                There is a further need in my view to spell out what exactly

                Free to redistribute verbatim copies (share the program with someone)
                Free to distribute modified versions of the code.

                do and do not mean. One dimension is not just morality but practicality, e.g. I have no way to monitor or control what others install or send to each other, but I can sometimes identify unauthorised, publicly available copies of, or modifications of, my code under the same program names, which I consider impolite and unhelpful.

                A while back a suggestion was made publicly (see http://www.stata.com/support/faqs/re...-faq/#relation), namely that

                User-written programs accessible from the StataCorp website have essentially the same status as those available from the SSC Archive. Their posting there is basically a matter of convenience to users and is not an official endorsement by StataCorp.

                However, StataCorp has nonexclusive rights to any program published in the STB or SJ, while anything placed in the SSC Archive is tacitly put in the public domain. In practice, you can probably take anything published in either medium and modify it as you will—especially if you do that privately—but publicly we recommend that, unless you are the original author, you change the name of the program, take all blame for any limitations your changes produce, and imply that a suitably large portion of the credit for the program belongs to the original authors.
                That's a long, long way short of the quasi-legal tightness of any license that I have ever glanced at, but it served as a guideline without any dissent that I can remember for about 15 years.




                • #83
                  #78 Haghish wrote [edited slightly, but not deliberately changing meaning]

                  I was not happy that early in the discussion the arguments went out of control and beyond the purpose of the [thread], where I wanted to discuss the benefits of using GitHub.
                  Sorry, Haghish, on behalf of the forum (or at least whoever wants to associate themselves with this view), if you feel badly treated, but I see nothing here that passes the boundaries of civilised debate.

                  Conversely, anyone who does should raise it with the list administrators.

                  This is a technical forum, and any personal abuse would be utterly out of order, but I see none. In any forum all the other people have rights too, including the rights to

                  * disagree, and indicate the extent to which they think anyone is wrong, misguided, etc.

                  * change the subject a little by bringing other perspectives and issues, etc.

                  We should be able to have a lively discussion -- ideally, even a fruitful discussion -- without feeling personally offended. Your words "out of control" as a reference to other people's posts are about as candid as any other comment here, and that's fine by me too.

                  Above all, no one owns the threads they happen to start. It's characteristic of most debates that people might want to disagree radically as well as minutely.




                  • #84
                    No, I didn't feel mistreated at all. But someone who reads the thread may feel that at some point the discussion gets a bit heated, which is perfectly OK.

                    Sergio Correia good points about the licensing. I would just add that GitHub allows you to select a license right when you create a repository, which is convenient.
                    ——————————————
                    E. F. Haghish, IMBI, University of Freiburg
                    [email protected]
                    http://www.haghish.com/


                    • #85
                      Billy, here is another thought. It is not a reply to your comments, but it is related. I know you're an avid GitHub user, so you're familiar with marking a release as a "pre-release", as shown in the image below.

                      This is something I can also take into consideration in the github search command, which can communicate a message to the user that the current version is still under development. Just a thought.

                      And now I share another thought. At the moment the main procedure of the github command is to install the package from the master repository. I remember we had a similar discussion via email several months ago. I am thinking that this procedure should change to install from the "latest release" instead of the master repository.

                      In the current version of the github command, the package installs the "pre-release" version of MarkDoc, for example. However, with the new idea, the command will install the latest stable release. Probably it would be best to add this functionality with a new subcommand, say "github stable" or "github stablerelease" (suggestions?). This would give programmers the choice to spread the word about the proper way of installing their packages, where:

                      Code:
                      github install username/repository
                      installs the master repository, and

                      Code:
                      github stable username/repository
                      installs the latest stable release of the package.

                      The programmer can make a suggestion for which way to go. If he uses the master branch for development, then the latter would be the proper way to go.

                      [Image: image_6376.png]
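One way the proposed "latest stable release" rule could work is sketched below in Python. It is only an illustration of the idea, not how the github command is or will be implemented. The release records are hypothetical dicts with the keys `tag_name`, `prerelease`, and `published_at`, and the sketch assumes all timestamps are UTC strings in the same ISO 8601 format so that they compare correctly as text.

```python
def latest_stable(releases):
    """Return the newest release not marked as a pre-release, or None."""
    stable = [r for r in releases if not r["prerelease"]]
    if not stable:
        # Nothing has been released as stable yet.
        return None
    # Same-format ISO 8601 UTC timestamps sort chronologically as strings.
    return max(stable, key=lambda r: r["published_at"])


# Hypothetical release history for a repository.
releases = [
    {"tag_name": "1.0.0", "prerelease": False, "published_at": "2016-06-01T10:00:00Z"},
    {"tag_name": "1.1.0", "prerelease": False, "published_at": "2016-09-15T10:00:00Z"},
    {"tag_name": "2.0.0-beta", "prerelease": True, "published_at": "2016-11-20T10:00:00Z"},
]

chosen = latest_stable(releases)
print(chosen["tag_name"] if chosen else "no stable release")
```

Returning `None` when every release is a pre-release leaves the caller free to refuse the install rather than fall back to development code.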
                      Last edited by haghish; 25 Nov 2016, 08:17.
                      ——————————————
                      E. F. Haghish, IMBI, University of Freiburg
                      [email protected]
                      http://www.haghish.com/


                      • #86
                        Traditionally the master branch has been used to represent the production ready/release version of the code base. From the master branch, other branches are created to develop/test adding features and/or fixing bugs. That said, while I'm all for git and/or GitHub integration into the Stata workflow of other community members, I don't think it is a good idea to remove the requirement of a TOC and pkg file. If those files are present it gives the author control over the broader release of their programs. The other problem with taking this choice away from the developer is the backlash against the software that can occur if someone is using a still under development version of the software that has lots of bugs and broken functionality. Now other users would be potentially less likely to use or recommend the solution because they used/experienced a buggy version of the code that was not ready to be distributed to end users.


                        • #87
                          Thanks for your comments. I will give it some time to see how the package is used in practice. I would just add that I assume people take the time to read the repository's description, especially if the search command doesn't let them install the software with a mouse click. But what is needed right now is putting the package into practice and testing it.

                          ——————————————
                          E. F. Haghish, IMBI, University of Freiburg
                          [email protected]
                          http://www.haghish.com/


                          • #88
                            haghish that still removes the author of the content from the decision of how to announce and distribute the software among the community, to work around a process that is clearly documented in the Stata manuals. What value do you see being added by breaking an existing documented practice vs allowing only programs that include the requisite files to be installed by your package? From earlier posts it seems that you are making an assumption about the nature of user programmers and not being aware of the requirement to generate a TOC file and a pkg file in order to distribute their code vs considering the lack of inclusion as an intentional and purposeful decision.
