  • I would like to see a supported Stata kernel for Jupyter. There are a few out there, notably:
    https://github.com/kylebarron/stata_kernel
    I think this is something StataCorp should integrate, (re)distribute, and support going forward (it's GPL v3), and build upon, as part of continuing to open Stata to the outside world (Python integration, plugins, etc.). Data scientists could then easily integrate it into their workflows, and it would facilitate peer review, git integration, pull requests, and publishing on corporate wikis, in meetings, etc.


    • #119 #120 Making generate and replace r-class would break many commands and do-files. The point is not that the results could be ignored; they would appear regardless and zap other r-class results. Even keeping classic behaviour under version control would be widely misunderstood or forgotten. Unless you’re volunteering to answer all the puzzled threads arising from such a change, this isn’t an attractive idea.

      What would seem defensible to me would be new commands that were r-class. Then users who wanted them would have different behaviour.
      Last edited by Nick Cox; 23 Feb 2020, 08:55.


      • #122 Good point. Instead of making generate and replace r-class, a local() option could be added to each to generate local macros for the reported results. For replace one might want to be able to specify two local names: one for the total number of replacements, and a second for the number of missing values. The principle remains the same: if it's accessible to copy-and-paste, it should be accessible to the program. (And not by creating and parsing a log file, to rule out the obvious hack.)


        • I noticed that -twoway- has an upper limit on the number of lines when overlaying, perhaps due to color palette limitations. I had to plot 20+ lines, and I think it displayed only the first 15 and dropped all the others. Maybe that is something that can be improved in the next version.


          • re: #124 - the limit (see help limits) is 100 variables and 20 styles. Maybe you need to open a new topic and ask a question; be sure to show the code you used.


            • re: #125
              I saw from other posts that this can be circumvented by combining colors with solid/dotted line patterns. It worked in my case, but it was time-consuming to specify the color and line combinations individually in the code. I thought it would be a good suggestion to add here.
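
One way to avoid typing the combinations by hand is to generate the option text. Below is a minimal sketch in Python, assuming the overlaid plot is a twoway line command that accepts lists in lcolor() and lpattern(). The function name line_style_options and the particular selection of colors and patterns are illustrative assumptions, and the result is untested against the limits mentioned in #125.

```python
from itertools import product

# Assumed style names: these are standard Stata color and line-pattern names,
# but the selection and ordering here are arbitrary.
COLORS = ["navy", "maroon", "forest_green", "dkorange", "teal"]
PATTERNS = ["solid", "dash", "dot", "longdash", "shortdash"]


def line_style_options(n_lines):
    """Return lcolor() and lpattern() option text for n_lines overlaid lines.

    Patterns vary slowest, so the first len(COLORS) lines are all solid.
    """
    combos = list(product(PATTERNS, COLORS))
    if n_lines > len(combos):
        raise ValueError(f"only {len(combos)} distinct combinations available")
    chosen = combos[:n_lines]
    lcolor = " ".join(color for _, color in chosen)
    lpattern = " ".join(pattern for pattern, _ in chosen)
    return f"lcolor({lcolor}) lpattern({lpattern})"


if __name__ == "__main__":
    # Paste the printed text into the options of the twoway line command.
    print(line_style_options(22))
```

With five colors and five patterns this covers up to 25 lines; extending either list raises the ceiling.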


              • Extending Stata's capabilities to support function application methods such as Python's map or R's lapply would add value for aggregating results, much as the addition of data frames in V16 did for working with multiple datasets simultaneously.

                In particular, it would be useful to capture any [e]returned result from a command "on the fly" into an in-memory, list-like object that would not need to be restored from or saved to a .ster file.

                As a basic example consider the following:

                (note: uses Stata V 16)

                Code:
                . sysuse nlsw88
                (NLSW, 1988 extract)
                
                . unab vlist: _all
                
                . python
                ----------------------------------------------- python (type end to exit) --------------------------------------------------------------------------------
                >>> import sfi
                >>> def summarize(x):
                ...  sfi.SFIToolkit.stata("summarize " + x)
                ...  return( [sfi.Scalar.getValue("r(N)"), sfi.Scalar.getValue("r(mean)"), sfi.Scalar.getValue("r(sd)"), sfi.Scalar.getValue("r(min)"), sfi.Scalar.getValue("r(max)")])
                ...
                >>> nlsw88_sum = dict(zip("`vlist'".split(" "), list(map(summarize, "`vlist'".split(" ")))))
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                      idcode |      2,246    2612.654    1480.864          1       5159
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                         age |      2,246    39.15316    3.060002         34         46
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                        race |      2,246    1.282725    .4754413          1          3
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                     married |      2,246    .6420303    .4795099          0          1
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                never_marr~d |      2,246    .1041852    .3055687          0          1
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                       grade |      2,244    13.09893    2.521246          0         18
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                    collgrad |      2,246    .2368655    .4252538          0          1
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                       south |      2,246    .4194123    .4935728          0          1
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                        smsa |      2,246    .7039181    .4566292          0          1
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                      c_city |      2,246    .2916296    .4546139          0          1
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                    industry |      2,232    8.189516    3.010875          1         12
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                  occupation |      2,237    4.642825    3.408897          1         13
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                       union |      1,878    .2454739    .4304825          0          1
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                        wage |      2,246    7.766949    5.755523   1.004952   40.74659
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                       hours |      2,242    37.21811    10.50914          1         80
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                     ttl_exp |      2,246    12.53498    4.610208   .1153846   28.88461
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                      tenure |      2,231     5.97785    5.510331          0   25.91667
                >>> nlsw88_sum
                {'idcode': [2246.0, 2612.654496883348, 1480.8637634568668, 1.0, 5159.0], 'age': [2246.0, 39.15316117542297, 3.0600022239430684, 34.0, 46.0], 'race': [2246
                > .0, 1.2827248441674086, 0.47544129024449705, 1.0, 3.0], 'married': [2246.0, 0.6420302760463046, 0.4795099307555556, 0.0, 1.0], 'never_married': [2246.0,
                >  0.10418521816562779, 0.30556870120775137, 0.0, 1.0], 'grade': [2244.0, 13.098930481283423, 2.5212460945811133, 0.0, 18.0], 'collgrad': [2246.0, 0.23686
                > 553873552982, 0.4252537737781529, 0.0, 1.0], 'south': [2246.0, 0.41941228851291185, 0.4935727773212602, 0.0, 1.0], 'smsa': [2246.0, 0.7039180765805877,
                > 0.45662923067852623, 0.0, 1.0], 'c_city': [2246.0, 0.29162956366874443, 0.45461387997400726, 0.0, 1.0], 'industry': [2232.0, 8.189516129032258, 3.010874
                > 8568471775, 1.0, 12.0], 'occupation': [2237.0, 4.642825212337953, 3.4088972128545767, 1.0, 13.0], 'union': [1878.0, 0.24547390841320554, 0.4304824567422
                > 844, 0.0, 1.0], 'wage': [2246.0, 7.76694903741006, 5.755522859382768, 1.00495183467865, 40.74658966064453], 'hours': [2242.0, 37.218108831400535, 10.509
                > 135117595422, 1.0, 80.0], 'ttl_exp': [2246.0, 12.534976707079771, 4.6102075341192625, 0.11538461595773697, 28.884614944458008], 'tenure': [2231.0, 5.977
                > 849999269874, 5.510331212404582, 0.0, 25.91666603088379]}
                >>> end
                ----------------------------------------------------------------------------------------------------------------------------------------------------------
                
                .
                It's a basic example, but captures, by variable name, all the summarized data. Additionally, one can refer to elements later by variable name like:

                Code:
                . python: nlsw88_sum['age']
                [2246.0, 39.15316117542297, 3.0600022239430684, 34.0, 46.0]
                This capability of lapply is a very nice feature of R (in my view) and can clearly be accommodated in the new Python integration using map, but it would be nice to have it natively in Stata and/or Mata.
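
The collect-by-name pattern itself does not depend on Stata. Below is a minimal sketch in plain Python, assuming a hypothetical in-memory dict named data in place of the dataset and the standard statistics module in place of summarize. It shows that the dict(zip(..., map(...))) construction used above is equivalent to a dict comprehension.

```python
import statistics

# Hypothetical stand-in data; in Stata these columns would come from the
# dataset in memory.
data = {
    "age": [34, 36, 40, 46],
    "wage": [5.5, 7.25, 10.0, 12.5],
}


def summarize(name):
    """Return [N, mean, sd, min, max] for one column of `data`."""
    values = data[name]
    return [
        float(len(values)),
        statistics.mean(values),
        statistics.stdev(values),  # sample standard deviation (n - 1 denominator)
        float(min(values)),
        float(max(values)),
    ]


# Two equivalent ways to collect the results by variable name.
via_map = dict(zip(data, map(summarize, data)))
via_comprehension = {name: summarize(name) for name in data}

if __name__ == "__main__":
    print(via_comprehension["age"])
```

The comprehension is usually the easier form to read; map is kept alongside it because it is the direct analogue of lapply.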

                - joe
                Last edited by Joseph Luchman; 24 Feb 2020, 15:36.
                Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
                ----
                Research Fellow
                Fors Marsh

                ----
                Version 18.0 MP


                • Is there any way to select all occurrences of the same word in the Do-file Editor? In Notepad++, selecting a word highlights every occurrence of it in the editor. This would make it easy to trace the occurrences of a word throughout a do-file.

                  Best regards,
                  Rasool Bux


                  • Originally posted by Christopher Bratt
                    My main concern would be an improved approach to reproducible research, with a flexibility similar to Rmarkdown and the knitr package in R/RStudio.

                    I also second the hope for much improved speed in SEM analyses.
                    Check out -putdocx- and -putpdf- from Stata and the -texdoc-, -webdoc-, and -markstat- packages from SSC. Of these, -texdoc- is my preferred tool.
                    Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX


                    • My wish is for a button in the Data Browser/Editor window to show or hide value labels, or else a permanent option (check box) for this in Edit --> General Preferences --> Data Editor. Currently we can do it with the command browse, nol. Is there any way to show the variable label as a tooltip when moving the mouse cursor over a column?

                      Best regards,
                      Rasool Bux


                      • Originally posted by William Lisowski
                        In reaction to #119, with which I agree, I note that it is a particular example reinforcing the general principle expressed at #93 and discussed in the subsequent posts.
                        Ah, I missed that post - apologies to Rene Macon, I very much agree!

                        Originally posted by Nick Cox
                        #119 #120 Making generate and replace r-class would break many commands and do-files. The point is not that the results could be ignored; they would appear regardless and zap other r-class results. Even keeping classic behaviour under version control would be widely misunderstood or forgotten. Unless you’re volunteering to answer all the puzzled threads arising from such a change, this isn’t an attractive idea.

                        What would seem defensible to me would be new commands that were r-class. Then users who wanted them would have different behaviour.
                        This would also be much appreciated and a perfectly fine option from my perspective.


                        • #123 #131 Optionally pushing named locals to the caller space would be fine by me too.


                          • A simple UI improvement: I would love to see tab completion of filenames after do, use, using, and ls (and maybe other relevant commands/contexts).

                            I know that in other contexts, tab completion is for variable names. But you'll never have a varlist after ls or do, and rarely after use (unless you're re-loading a subset of variables for the dataset already in memory).

                            Tab completion of commands would also be pretty cool.

                            This may be harder to reconcile with existing syntax, but my dream would be Unix-like command-line shortcuts, e.g. !$ to repeat the last argument from the previous command.


                            • Word cloud figures!


                              • Originally posted by Oscar Ozfidan
                                One of my wishes is the ability to append non-dta files such as xlsx or csv without having to save them as dta files first. This functionality could initially be restricted to files that have the same variable list and later be expanded. I really don't understand why an xlsx file needs to be imported and saved as dta before it can be appended. Perhaps the import command could be modified to bypass that step.
                                Oscar Ozfidan: I still need to do a bit more testing, but I've actually developed something to address the specific need that you mentioned.
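
For illustration only, and not the tool mentioned in the reply: a minimal Python sketch of the restricted case described in the quoted post, where delimited files are appended directly provided they share the same variable list. The function name append_csv is hypothetical, and the in-memory files stand in for files on disk.

```python
import csv
import io


def append_csv(sources):
    """Stack rows from several CSV sources that share the same header.

    `sources` is an iterable of open text file objects. Raises ValueError
    if a later source has a different list of column names.
    """
    header = None
    rows = []
    for source in sources:
        reader = csv.DictReader(source)
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError(f"header mismatch: {reader.fieldnames} != {header}")
        rows.extend(reader)
    return header, rows


if __name__ == "__main__":
    # Hypothetical in-memory files standing in for files on disk.
    first = io.StringIO("id,wage\n1,7.5\n2,9.0\n")
    second = io.StringIO("id,wage\n3,11.25\n")
    header, rows = append_csv([first, second])
    print(header, len(rows))
```

Values are kept as strings here; type conversion and files with differing variable lists are left out of the sketch.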
