  • How Stata fails

    Examples of how Stata fails when trying to catch up

    Tools in statistics improve constantly. In some of these areas, Stata is good. In others, Stata is simply not catching up, but still pretending to do.

    I recently maintained that Stata's functions for literate programming are of little use once we compare them with what is available for free in R with the knitr package. (And one can use R with RStudio and knitr as an excellent platform for running Stata functions. Please note that some programmers are improving Stata's functions in this field, though the gap to R's knitr is still wide.)

    But take something relevant for more people: SEM. Stata is extremely slow and is frankly speaking unusable for serious SEM. Again, the free R is better. I see some users suggest Stata should buy Mplus (the best in the field), which I believe is unrealistic, but it does indicate that Stata fails in SEM.

    Moreover, StataCorp as a publisher allows a book to present incorrect claims about model fit in SEM. A recently published book presents a SEM model that clearly fails, but the authors maintain the model fits the data. (If I remember correctly, the authors maintained that a RMSEA = .10 indicates good fit. It certainly doesn't and it is easy to see what is missing in the model.) I'm not criticising the authors; they are obviously unfamiliar with SEM and SEM occupies only a chapter in the book. But StataCorp is the publisher: what happened during review and quality control?

    Finally, StataCorp recently gave a video-based lecture on Bayes. Again, the free R is much better, and again, Mplus may for many users be the best. But my concern is a different one. Originally, Stata allowed users only one Bayesian chain. Never run Bayes with one chain only. Use at least two (I use four). But in the video, Stata is back to lecturing on Bayes with only one chain. Again, where is the quality control in StataCorp? I don't think we can claim that since this was an introduction only, we can skip one of the most essential parts of Bayesian modelling.
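
The point about multiple chains can be illustrated with the Gelman-Rubin diagnostic (R-hat), which compares between-chain and within-chain variance and so cannot be computed from a single chain. What follows is a minimal sketch in Python rather than Stata; the toy random-walk Metropolis sampler, the standard normal target, the function names, the starting values, and the seeds are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_chain(start, n_draws, seed, step=2.0):
    """Toy random-walk Metropolis sampler for a standard normal target (illustrative only)."""
    rng = random.Random(seed)
    x = start
    draws = []
    for _ in range(n_draws):
        proposal = x + rng.gauss(0.0, step)
        # Log acceptance ratio for a N(0, 1) target density.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        draws.append(x)
    return draws


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    if m < 2:
        raise ValueError("R-hat needs at least two chains")
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and average within-chain variance W.
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    w = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)


if __name__ == "__main__":
    # Four chains from deliberately dispersed starting values; first half discarded as warm-up.
    starts = [-10.0, -3.0, 3.0, 10.0]
    chains = [metropolis_chain(s, 4000, seed=i)[2000:] for i, s in enumerate(starts)]
    print(f"R-hat across {len(chains)} chains: {gelman_rubin(chains):.3f}")
```

A value close to 1 suggests the chains agree with each other; a common rule of thumb treats values clearly above 1 (for example above 1.1) as a warning sign. With one chain there is nothing to compare against, which is the concern raised above.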

    I love Stata for what it does best. None of the above, and not graphics (the free R is again much better). I love Stata for its language and its data management. And whenever I code in R I miss Stata's simplicity and the local macros.

    I think Stata is in trouble. I think Stata has the wrong approach when Bayes is introduced by programmers who do not first learn modern Bayes. I think Stata's inability to run SEM speaks volumes. And I think there is too little quality control in Stata for books and maybe for videos on modern techniques.

    It's one of those unhappy love affairs. It's not that I haven't tried. I fell in love with Stata; Stata is easy to learn, with a beautiful, human-like language. But that love is not mutual. Stata doesn't care about what I need. Anyone wanting to use SEM, Bayes, or literate programming should look elsewhere.

    Or, could that change?

    http://staffblogs.le.ac.uk/bayeswith...13/stata-vs-r/
    Last edited by sladmin; 11 Dec 2017, 08:30. Reason: anonymize poster

  • #2
    But take something relevant for more people: SEM. Stata is extremely slow and is frankly speaking unusable for serious SEM. Again, the free R is better. I see some users suggest Stata should buy Mplus (the best in the field), which I believe is unrealistic, but it does indicate that Stata fails in SEM.
    I agree that Stata sem is slow compared to the competition, and seems to have more convergence problems. There have been significant speed improvements starting with Stata 14.2.

    If you are talking about R's lavaan, I find it is quicker. But, some of the fit stats appear wrong (they differ from both Mplus and Stata) and the code required for some things seems excruciatingly tedious.

    My xtdpdml program (available from SSC) will generate code (for the specific models it estimates) for Stata, Mplus, and Lavaan, which makes it easy to compare the three.

    Even if those other programs are faster, I find post-estimation commands and ereturned results a very nice feature of Stata.

    I am semi-serious about Stata buying out Mplus! The authors are getting old, so it wouldn't surprise me if they someday sold out to somebody. I don't know how Mplus does it, but it seems to run circles around all the competition, not just Stata.

    I won't disagree with most of the things you say. But, for the most part, I find Stata quick and easy to use. Other programs may run faster once you get them running, but there can be a huge amount of labor involved before you are ready to get something running, and Stata keeps that labor (for me at least) down to a minimum. I love the program, and hope that it continues to improve along the lines that you and others suggest.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      This has little to do with your dissatisfaction with some of Stata's features or performance, but I kind of wonder about the "quality control" aspects in your post. Concerning your SEM example:

      StataCorp as a publisher allows a book to present incorrect claims about model fit in SEM. A recently published book presents a SEM model that clearly fails, but the authors maintain the model fits the data. (If I remember correctly, the authors maintained that a RMSEA = .10 indicates good fit. It certainly doesn't and it is easy to see what is missing in the model.)
      Now, I am anything but an expert on SEM, but it seems to me that the acceptable values for various model fit indices vary from source to source. I have come across recommendations for RMSEA in the range from lower than 0.05 up to 0.1, depending on the author and probably the field of study. So whoever wrote the book/chapter that you are referring to is probably in good company. I also have the strong impression that these cut-off values are just rules of thumb, anyway. So I wonder about your criteria for "incorrectness". Incorrect in which sense? Clearly not in any mathematical/statistical way, or am I missing something?

      Best
      Daniel
      Last edited by daniel klein; 02 Dec 2017, 06:52.



      • #4
        Also, maybe I'm wrong, but it seems like a lot of these pro-R/anti-Stata complaints come from people in the natural sciences or people who have some monster data set to analyze or who have some other need for highly advanced methods. Me, and I suspect the majority of other social scientists, are quite happy with what Stata can do for us. Give me a turbocharged sem program and maybe speed up other things and I won't feel any need to wander elsewhere.

        For me personally, I would rather see Stata speed up and improve what it currently has rather than add a bunch of new statistical techniques that I will never use.

        Also, some posts say that the death of Stata is imminent -- my guess is that Stata is doing pretty well sales-wise. And last I saw, the use of Stata in published articles was increasing. There may be niches where Stata is losing out, but I'm doubtful that it is losing out overall.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam



        • #5
          Daniel. The "best" cutoff for RMSEA is .05, alternatively .06 (this depends on the source you read, and you should consult recent publications and authorities in the field). I sometimes "accept" a RMSEA up to .08 if the model has few degrees of freedom (e.g. a factor model with only four indicators and maybe two of them with correlated residual variables). A RMSEA of .10 clearly fails, and that should normally be evident by other indices of fit as well. And be sure to consider the Chi-square. If it's significant, why? A friend of mine told me Jöreskog had pointed out that the 90% CI for the RMSEA should not exceed .08. I think this might sometimes be difficult with small sample sizes, where power is low (CI reflects power). But then, SEM needs large sample sizes.
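
For readers following the cutoff discussion, the RMSEA point estimate is usually computed from the model chi-square, its degrees of freedom, and the sample size. What follows is a minimal sketch in Python; the function name and the example numbers are made up for illustration, and some programs divide by N rather than N - 1, so values can differ slightly between packages.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA point estimate from a model chi-square, its degrees of freedom, and sample size n."""
    if df <= 0 or n <= 1:
        raise ValueError("need df > 0 and n > 1")
    # Excess of chi-square over df, floored at zero, per degree of freedom and observation.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


if __name__ == "__main__":
    # Hypothetical fit results, not taken from any model in this thread.
    for chi2, df, n in [(30.0, 20, 500), (85.0, 20, 500), (119.8, 20, 500)]:
        print(f"chi2={chi2}, df={df}, N={n}: RMSEA={rmsea(chi2, df, n):.3f}")
```

Because the estimate depends only on chi-square, df, and N, it tends to move together with the chi-square test, which is one reason to look at both, as suggested above.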

          @Richard. I'm glad for your support. (I have problems with Stata's SEM even with sample sizes of less than two thousand.) Bengt Muthén and his wife Linda may be getting relatively old, but then you have the younger Tihomir Asparouhov who is now equally important as Bengt for developing Bayes and other advanced features in Mplus. But... I am not sure Tihomir would want to run the company, so maybe your idea is more realistic than I thought. SPSS dropped its connection to Lisrel and bought Amos (and Amos is stuck somewhere in the past). So... could Stata buy Mplus?

          If Stata bought Mplus, I think they would still be separate programs, with different programming philosophies (Linda and Bengt have developed a sophisticated language that would be very difficult to change and, I think, no one should try to change it).

          So where exactly is the benefit? I doubt Stata and Mplus would be well integrated. And I don't see why it would be cheaper for you as a user. I'm intrigued by your idea, but on second thought, I'm unsure what improvements this would bring. Stata could allow for better integration without buying other companies, and that includes integration with R.



          • #6
            Let's keep personal comments out of this. I don't know the authors of Mplus but that's immaterial. I don't see that Statalist is well served by ageist remarks and gossip about other individuals. People on the list can and do banter with and about each other on the list, and even argue about technical matters, but I suggest that people outside are off-limits.

            The "best" cutoff for RMSEA is .05, alternatively .06
            I must bear that in mind if I get involved in SEM.



            • #7
              @Nick
              I am a researcher on ageism, so I have to ask you: Was my answer to Richard's point about the future of Mplus really ageist?

              I tried to answer an argument brought up by Richard. It was well intended. It's a pretty harsh remark I now received -- "ageist remarks and gossip about other individuals". I am sorry for getting into that point about Mplus' future.
              Last edited by sladmin; 11 Dec 2017, 09:53. Reason: anonymize poster



              • #8
                Guest: I am making a personal suggestion about what to avoid in discussion as unhelpful. If you think that the people concerned would not be upset by what you said or find it inappropriate, then that indicates what you regard as acceptable.
                Last edited by sladmin; 11 Dec 2017, 08:31. Reason: anonymize poster



                • #9
                  I'm sorry if my comment seemed ageist -- I probably should have just said that the authors (who aren't that much older than I am) have been doing this a long time and might be interested in selling out. But, if they want to go another 30 years, then more power to them.

                  However, if I were the head of StataCorp or some other major stats company, I suspect I would send out some informal queries to see if there is interest on their side. If somebody else buys/clones Mplus, they are going to have a huge advantage with SEM. Companies buy technology all the time when what others have is superior to what they've got.

                  Guest, I'm assuming that Mplus's speed advantages have something to do with the algorithms or coding used. I would be quite happy to keep Stata's syntax if you could combine it with Mplus's speed. Or, you could make Mplus's syntax more consistent with Stata's.
                  Last edited by sladmin; 11 Dec 2017, 09:53. Reason: anonymize poster
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology
                  StataNow Version: 19.5 MP (2 processor)

                  EMAIL: [email protected]
                  WWW: https://www3.nd.edu/~rwilliam



                  • #10
                    It's one of those unhappy love affairs. It's not that I haven't tried. I fell in love with Stata; Stata is easy to learn, with a beautiful, human-like language. But that love is not mutual. Stata doesn't care about what I need. Anyone wanting to use SEM, Bayes, or literate programming should look elsewhere.
                    More generally, there is a robust market in software for statistical analysis. I'm pleased that Stata has a view of its intended audience and a focus on what it understands their needs to be. It meets my needs imperfectly, but it does better at meeting them than do the alternatives, including my previous favorites, suggesting to me that I am in their target audience. And the alternative - for Stata to attempt to be all things to all users - is an unsustainable business model.

                    There's nothing wrong with choosing a different software package that better meets your needs. If you value SEM, Bayes, and literate programming over Stata's strong points, that is your call, and I'd encourage you to follow your needs, leaving Stata to focus on its target audience.

                    What I disagree with, however, is the implication of the title to this thread - "How Stata fails". A more accurate, and self-aware, title would be "How Stata fails to meet my expectations".



                    • #11
                      Originally posted by Richard Williams
                      Me, and I suspect the majority of other social scientists, are quite happy with what Stata can do for us.
                      As a biological scientist (a group that, as a generality, tends not to be mathematically inclined), I'd agree if there were better support for nonparametric post hoc analysis, dose-response, unbalanced datasets, and perhaps a little more thought given to consistency and workflow.

                      But, in my opinion, R shows definite signs of many hands and little co-ordination. Sometimes it appears almost wilfully arcane.
                      Stata 14.2MP
                      OS X



                      • #12
                        But, in my opinion, R shows definite signs of many hands and little co-ordination. Sometimes it appears almost wilfully arcane.
                        I don't know R that well so I can't really say, but I've heard similar comments from others. You get a lot of great things for free, but consistent user interface is not one of them.

                        Your comment reminds me of my first program, SPSS. I always thought that SPSS looked like a collection of 50 different routines written by 50 different people who had 50 different ideas what the syntax should be. With Stata, I have a fighting chance of getting the syntax right for an estimation command even if I am not that familiar with it. With SPSS, no way can I do that. I need the menus or some sample code I can copy and paste. I hardly ever need the menus in Stata (except maybe for graphics) and the help files generally meet my needs.

                        I suspect R appeals to a lot of people who might have become programmers in another life, i.e. they actually like coding. But for people who just want the results, Stata is often more appealing, at least in those areas where Stata can easily get them what they want.
                        -------------------------------------------
                        Richard Williams, Notre Dame Dept of Sociology
                        StataNow Version: 19.5 MP (2 processor)

                        EMAIL: [email protected]
                        WWW: https://www3.nd.edu/~rwilliam



                        • #13
                          Incidentally, getting really good with R is one of the things I would like to do in retirement. Along with learning French, Spanish, and how to play the piano. Before then, though, I don't seem to have sufficient motivation or time.
                          -------------------------------------------
                          Richard Williams, Notre Dame Dept of Sociology
                          StataNow Version: 19.5 MP (2 processor)

                          EMAIL: [email protected]
                          WWW: https://www3.nd.edu/~rwilliam



                          • #14
                            William Lisowski: If you read my post, you will see that I did not claim that Stata generally fails.

                            The post is summarised at the start:

                            Tools in statistics improve constantly. In some of these areas, Stata is good. In others, Stata is simply not catching up, but still pretending to do.
                            "... but still pretending to do" is the crucial part of that statement.

                            If Stata did not do SEM, I would have no objections. How many hours have I spent/wasted trying to get Stata to run SEM models? (I have upgraded Stata twice with my own money based on promised features, but realised after hours of work that they did not work as expected and that I should have stuck with my original Stata version.)

                            Also, I would have no objections if Stata did not introduce Bayes. But it did, and with one MCMC chain only. That is a failure, I'm sorry to say, and StataCorp admitted that once I made them aware of the problem (see also the new manual for Bayes in Stata). So StataCorp does listen to critical input.

                            Stata's "view of its intended audience" actually includes me, and yes, for Stata I am in "their target audience", as confirmed by a Stata programmer I talked with. It is criticism that helps us progress, as Nick's criticism of me earlier in this thread reminded me what to avoid writing about.

                            I am glad Stata meets your needs, but I also think we should see the merits in constructive criticism for StataCorp, which is one of the sources StataCorp can draw upon when improving their product.

                            For most basic functions, Stata is second to none. It also does several types of analyses very well, but not the heavily advertised SEM and its first introduction to Bayes (I haven't tried Bayes in Version 15).






                            • #15
                              Originally posted by Richard Williams
                              I always thought that SPSS looked like a collection of 50 different routines written by 50 different people who had 50 different ideas what the syntax should be.
                              On the other hand, Stata isn't without its idiosyncrasies. The by and over options seem to be used synonymously, but not interchangeably; then sometimes neither will work, and bysort is required. It would make things much easier if only one syntax was required to get things working.
                              Stata 14.2MP
                              OS X
