Wishlist for Stata 15

Nick Cox

Join Date: Mar 2014

Posts: 34720
#91

17 May 2016, 17:02

Aaditya, William: I see. I wonder how that could be done without complicating the language unduly. There could then be endless discussions about which statistics to support.

I'd like to see user-defined functions. That could serve the same ends without necessarily over-complicating the language.
1 like
Comment
Aaditya Dar

Join Date: Sep 2014

Posts: 112
#92

29 May 2016, 20:50

Another suggestion: if possible, please relax or increase the limit on the number of stored estimation results (it's currently 300).
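Until the limit changes, one possible workaround is to write results to disk with -estimates save- and read them back with -estimates use-, rather than holding them in memory with -estimates store-. This is only a minimal sketch: it assumes the auto dataset shipped with Stata and a hypothetical file name, model_mpg, written to the current working directory.

```stata
* Hypothetical workaround: keep results on disk instead of in the in-memory store
sysuse auto, clear
regress price mpg weight

* Write the active estimation results to model_mpg.ster (hypothetical name)
estimates save model_mpg, replace

* Later: make the saved results the active estimates again and redisplay them
estimates use model_mpg
estimates replay
```

The cost is one file per model, so this suits long batch jobs better than interactive comparison of a handful of models.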
1 like
Comment
Ariel Karlinsky

Join Date: Jun 2015

Posts: 490
#93

30 May 2016, 01:02

Hear, hear! The stored estimates limit is small and seems arbitrary.
Comment
Cyrus Levy

Join Date: Nov 2014

Posts: 99
#94

02 Jun 2016, 11:28

I am not sure whether this has been requested before; I searched and couldn't find anything about it. I generally run multiple instances of Stata at the same time, each with its own Do-File Editor open, and sometimes I open more than one Do-File Editor from a single Stata instance instead of using tabs. With many instances of Stata and Do-File Editors around, it can get really hard to track which Do-File Editor runs in which Stata window. Each Stata instance is numbered, as in "2 - Stata" and "3 - Stata." I wish the Do-File Editors were numbered too, according to the Stata instance they are attached to. I use Stata 13.1; maybe Stata 14 already has this feature, I don't know.

Opening multiple Data Editors causes the same complication, and the Windows taskbar doesn't order them according to what is associated with what. If you have 3 Stata instances running and open the Data Editor from the first instance, the Data Editor window gets sorted to the right of the third Stata instance. Having 10-20 Stata-related windows open at the same time complicates things a lot... Maybe it's only me who encounters this, but I think it would be a nice and simple addition to number the Do-File and Data Editors according to their parent Stata windows.
4 likes
Comment
Michael Anbar

Join Date: Aug 2014

Posts: 116
#95

02 Jun 2016, 17:20

This relates to the performance issues I've raised with Stata before (in this thread), but to summarize, I would love for more functions like -reshape-, etc. to be faster, even if that means implementing them in C and bundling them with the Stata binary instead of as separate .ado files.

Furthermore, are functions like -tab-, -reshape-, -merge-, -sort-, -summarize-, etc. parallelized? If not, this would be a MAJOR improvement. An option to force -preserve- to write to RAM instead of writing to disk would be useful and MUCH faster than saving an entire dataset to disk.

This came up in another thread, but I would also like a way to read data types and other information about a dataset that's on disk (that is, an easily machine-readable version of -describe- that wouldn't force me to load the dataset). As of now, "describe using ..." outputs this information in text format, and some of the information, like the sort list, is available as a returned result, but not all of it. Having to clobber or preserve the data in memory, load a dataset from disk, read the types, and reload the first dataset is a pain and a massive performance hit for the massive datasets I work with. This is the thread: http://www.statalist.org/forums/foru...ry#post1343780
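For context, here is a rough sketch of what can be done today. The file is a stand-in built from the shipped auto dataset, and the tempfile name bigfile is hypothetical. The first step reads variable names and the sort order without loading the data. The second step loads only one observation to recover the storage types, which is cheaper than loading the whole file but still replaces whatever is in memory.

```stata
* Build a small stand-in for a large dataset on disk (hypothetical example)
sysuse auto, clear
sort make
tempfile bigfile
save "`bigfile'"

* Step 1: names and sort order, without loading the data
describe using "`bigfile'", varlist
display "`r(varlist)'"
display "`r(sortlist)'"

* Step 2: storage types, by loading a single observation and
* converting the description into a dataset of variable metadata
use in 1 using "`bigfile'", clear
describe, replace clear
list name type format, noobs
```

In real use the second step would still need -preserve- and -restore- around it to protect the data in memory, so it reduces the loading cost but does not remove the problem described above.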
3 likes
Comment
Michael Anbar

Join Date: Aug 2014

Posts: 116
#96

02 Jun 2016, 17:21

Originally posted by Cyrus Levy:
Maybe it's only me who encounters this, but I think it would be a nice and simple addition to number the Do-File and Data Editors according to their parent Stata windows.

This isn't just you, and I'll second the need for this. This would be an extremely useful feature to have.
2 likes
Comment
Cosimo Magazzino

Join Date: Jun 2015

Posts: 6
#97

28 Jun 2016, 12:44

My wish:

For Time Series Analysis: Toda and Yamamoto causality test
For Panel Data Analysis: Lee and Strazicich unit root test, Pedroni cointegration test, Kao cointegration test, Fully Modified OLS regressions, Dynamic OLS regressions, error correction based cointegration test by Gengenbach, Urbain and Westerlund (2009), Pesaran, Smith and Yamagata (2009) panel unit root test.
1 like
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1361
#98

28 Jun 2016, 14:41

Nick Cox: I think user-defined functions would be an awesome addition. It might require some changes in the underlying parser, but it wouldn't require significant modifications of the API for end users and could add a ton of flexibility.

Michael Anbar: storing the data in RAM could also create some nasty performance penalties when working with larger datasets or when users need to do more computationally demanding operations on the data. In those cases, you run the risk of swapping memory, which would be a massive time sink. I haven't had too many issues with -preserve- myself, but I do agree that some of the other existing commands could potentially be sped up a bit if implemented in a lower-level language instead of .ado.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29438
#99

29 Jun 2016, 16:54

I wish that -simulate- and -bootstrap- offered more control over the use of dots. As it stands, you either get one after each rep or you don't get any. When you're running 10,000 or 50,000 replications of something time-consuming, you really don't want that many dots in your log or in your Results window, but with -nodots- you have no way of seeing whether you're making progress or hung. It would be nice to be able to specify something like -dots(#)- to get a dot after each # reps.
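In the meantime, the effect can be approximated by hand-rolling the replication loop with -postfile- and printing a dot only every so many reps. This is a minimal sketch with a toy simulation; the local macros reps and every and the posted variable xbar are hypothetical names chosen for illustration.

```stata
* Hand-rolled replication loop: one dot per `every' replications
clear all
set seed 12345

tempname sim
tempfile results
postfile `sim' double xbar using "`results'"

local reps  1000
local every 100

forvalues i = 1/`reps' {
    quietly {
        * Toy replication: mean of 50 standard normal draws
        drop _all
        set obs 50
        generate double x = rnormal()
        summarize x, meanonly
    }
    post `sim' (r(mean))

    * Progress indicator: print a dot only every `every' reps
    if mod(`i', `every') == 0 {
        display as text "." _continue
    }
}
postclose `sim'
display ""

* Inspect the simulated sampling distribution
use "`results'", clear
summarize xbar
```

This gives up the convenience of -simulate-, so it is a stopgap rather than a substitute for a built-in option.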
7 likes
Comment
Richard Stanley

Join Date: Jul 2014

Posts: 6
#100

01 Aug 2016, 18:25

I wish that Stata had integration with Apache Spark, the big data processing engine that has become quite popular in data science. Spark makes it possible to analyze data that doesn't fit on one personal computer. There could be a Stata API integration with the open source Spark API that would let users write do-files but execute them on a Spark cluster. It would also be great if the limited set of Spark modules could be enhanced by contributions from StataCorp developers.

Everyone would win in this scenario. Stata would become an interface to Spark while retaining its leadership in statistical analysis and Spark would enjoy greater power. Just my $0.02.
1 like
Comment
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#101

10 Aug 2016, 08:47

It would be great if the following two features were added to the Do-file Editor:

1. "Save a copy"
This would be useful if you want to try something new in a complicated do-file, where reverting to the old version might be non-trivial. Of course, you can use Save As, but this is cumbersome because you'd have to save as, close, and then open the old file again. This functionality is present in, e.g., MATLAB.
2. Ctrl+W or middle mouse button to close a do-file. Much of the Stata hotkey functionality is similar to the internet browser experience, but Ctrl+W/MMB unfortunately doesn't seem to work at the moment.

Also, one typo that has been around for ages:
In find and replace, it says "find what" and "replace what". AFAIK this makes no sense in English and should be "find what" and "replace with".
1 like
Comment
Ariel Karlinsky

Join Date: Jun 2015

Posts: 490
#102

10 Aug 2016, 10:35

Since Jesse mentioned the Do-file Editor, I'll reiterate the wish for vertical selection in it.
1 like
Comment
Jack Gibson

Join Date: Aug 2016

Posts: 10
#103

12 Aug 2016, 10:33

Following on from previous comments about -merge- and related commands, something that would make life a lot easier would be the ability to specify different names for the key variables in the master and using files.

e.g. being able to enter something like...

Code:

merge 1:1 masteridvar=usingidvar masterkey2=usingkey2 using usingfile, keepnamesfrom(master)

...instead of having to keep renaming the variables.

Another welcome change would be modifying the m:m merge to do something less perplexing. Everyone seems to wrongly assume that it does a crossed join (joinby) unless/until they read the help file, so perhaps it should? (Or at least output a warning making it clear that it doesn't).
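To make the difference concrete, here is a small sketch with made-up data: one key value that appears twice in the master and three times in the using file. -merge m:m- pairs observations sequentially within the key group, while -joinby- forms every pairwise combination. The variable names id, a, and b and the tempfile names are hypothetical.

```stata
* Master: id 1 appears twice
clear
input id a
1 10
1 20
end
tempfile master
save "`master'"

* Using: id 1 appears three times
clear
input id b
1 100
1 200
1 300
end
tempfile other
save "`other'"

* merge m:m matches row by row within id: 3 observations
use "`master'", clear
merge m:m id using "`other'"
list, noobs
count

* joinby forms all pairwise combinations within id: 2 x 3 = 6 observations
use "`master'", clear
joinby id using "`other'"
list, noobs
count
```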
5 likes
Comment
Ariel Karlinsky

Join Date: Jun 2015

Posts: 490
#104

12 Aug 2016, 11:55

I second that! Sometimes different datasets have the exact same variable named a bit differently. For example, in one dataset it can be "state" and in the other "state_code", while both contain the 2-digit FIPS state code. Specifying that the contents of the variables are to be matched, but not necessarily the names, sounds like a very reasonable option to add to the command. The default behavior can still be retained, of course.
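For reference, the renaming workaround that this option would replace looks roughly like the sketch below. The data are made up, and only the key names state and state_code come from the example above; the other variable and tempfile names are hypothetical.

```stata
* Hypothetical using file whose key is named state_code
clear
input state_code pop
1 100
2 200
end
tempfile usingfile
save "`usingfile'"

* Hypothetical master file whose key is named state
clear
input state income
1 50
2 60
end

* Rename the key in a temporary copy of the using file, then merge
preserve
use "`usingfile'", clear
rename state_code state
tempfile renamed
save "`renamed'"
restore

merge 1:1 state using "`renamed'"
list, noobs
```

The original using file is left untouched, but the extra -preserve-, -save-, and -restore- steps are exactly the overhead the requested syntax would avoid.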
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29438
#105

12 Aug 2016, 12:10

I endorse Jack Gibson's request to let the -merge- command (optionally) specify different names for the merge key variables in the two data sets.

But I don't agree about m:m. I think -merge m:m- should be eliminated, plain and simple. In 22 years of using Stata, I have only once seen an example where what -merge m:m- does is actually what is wanted, and in that instance, the same result could have been easily accomplished with -merge 1:1 _n- after a little bit of data massaging. Maybe others have actually found -merge m:m- useful, and, if so, they should speak up and enlighten those of us who think it's just a dangerous trap for unsuspecting users.

I don't think that -merge m:m- should be turned into a synonym for -joinby-: we already have a separate command for that, and although there are conceptual similarities between their functions, I personally find it helpful to think of them as rather distinct operations.
2 likes
Comment
