Wishlist for Stata 19

Gregory Grant

Join Date: May 2016

Posts: 20
#196

18 Jan 2024, 09:46

I updated to version 18 a month ago and I like it, but there are a few basic things I'm kind of shocked aren't already implemented.
All of these requests are absolutely basic.
If any of these are implemented already, please let me know how to do them. Thank you!
Here's my short list:

1) When importing a spreadsheet:
- cannot ask for it to be transposed
- cannot specify rows to be variable labels or variable notes as opposed to being data
- in other words multiple columns of variable annotations
- I'm trying to import gene expression data with 30K genes and what am I supposed to do, enter 30K labels by hand?

2) When coloring by a variable, it's always treated as numeric. If the variable is categorical, why can't the legend use its value labels?

3) When editing using the data editor, there's no undo. In fact there's no undo generally in Stata. That seems like it would have been an oversight in Stata 1; how did we get to 18 without it?
2 likes
Comment
Daniel Schaefer

Join Date: Mar 2020

Posts: 810
#197

18 Jan 2024, 09:58

what am I supposed to do, enter 30K labels by hand?

Gregory Grant You can still do this (and the transpose) programmatically, though you need more steps than the import command. If you make a new thread and ask this as a question with a -dataex- example, I bet you'd get code quickly. Or you can search the forums since this has been asked about before.
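
A minimal sketch of the labelling step, not a definitive recipe. It assumes a layout in which the first observation holds the descriptive text for each column. The tiny dataset and the names geneA and geneB are made up for illustration. With a real workbook the starting point would be something like -import excel using "myfile.xlsx", firstrow allstring clear-, where the filename is hypothetical.

```stata
* Toy stand-in for an imported sheet: observation 1 holds the label text,
* later observations hold the data (all strings, as with -allstring-)
clear
input str12 geneA str12 geneB
"Kinase A" "Receptor B"
"1.5"      "2.25"
"0.75"     "3.5"
end

* Copy the text in observation 1 into each variable's label
foreach v of varlist _all {
    label variable `v' `"`=`v'[1]'"'
}

* Drop the annotation row and convert the remaining values to numeric
drop in 1
destring _all, replace

describe
```

For the transpose itself, official -xpose, clear- works on numeric data only (string values become missing), so annotation rows usually need to be set aside before transposing.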
2 likes
Comment
Gregory Grant

Join Date: May 2016

Posts: 20
#198

18 Jan 2024, 10:25

Originally posted by Daniel Schaefer

Gregory Grant You can still do this (and the transpose) programmatically, though you need more steps than the import command. If you make a new thread and ask this as a question with a -dataex- example, I bet you'd get code quickly. Or you can search the forums since this has been asked about before.

Thank you for this quick reply, I will post the questions directly. But why on earth would they make something this basic that hard to do? It doesn't make sense to me.
1 like
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1119
#199

18 Jan 2024, 10:29

My request concerns the examples in the windows that pop up when one types help command. Take help regress as an example. Currently, the first part of the examples section looks like this:

Setup
. sysuse auto

Fit a linear regression
. regress mpg weight foreign

Fit a better linear regression, from a physics standpoint
. gen gp100m = 100/mpg
. regress gp100m weight foreign

Obtain beta coefficients without refitting model
. regress, beta

Suppress intercept term
. regress weight length, noconstant

Model already has constant
. regress weight length bn.foreign, hascons

I wish it looked like this so that users could copy, paste, and execute the code with no need to edit anything first:

Code:

// Setup
sysuse auto, clear

// Fit a linear regression
regress mpg weight foreign

// Fit a better linear regression, from a physics standpoint
generate gp100m = 100/mpg
regress gp100m weight foreign

// Obtain beta coefficients without refitting model
regress, beta

// Suppress intercept term
regress weight length, noconstant

// Model already has constant
regress weight length bn.foreign, hascons

Notice too that I added a clear option to the sysuse command and replaced gen with generate. Although gen is probably clear enough, I think it is good practice to spell out commands fully for the benefit of Stata newbies at least. E.g., I remember one occasion where I had a heck of a time figuring out that di meant display. YMMV, of course.

Thanks for considering.

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
3 likes
Comment
Henry Strawforrd

Join Date: Sep 2021

Posts: 228
#200

19 Jan 2024, 02:42

It would be great to have a Polars integration in Stata, as there is for Python and R. Stata is a pain with large datasets. Gtools helps but covers only a few commands.

Last edited by Henry Strawforrd; 19 Jan 2024, 02:56.
Comment
Sonnen Blume

Join Date: Aug 2018

Posts: 342
#201

20 Jan 2024, 07:37

I am a big fan of the bulk selection methods in Stata e.g.

Code:

tab *

Code:

keep s*

But this doesn't work when I want to tab variables whose names start/end/contain a certain character. My dream is a wild one that doesn't work:

Code:

tab ^a*

This applies to other types of analyses, and to wrangling and cleaning names/labels as well. I have to switch to R to do the regex processing (the stringr and stringi packages are simply euphoric). And it'd be euphoric to have such features in Stata 19.

Last edited by Sonnen Blume; 20 Jan 2024, 07:44.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29954
#202

20 Jan 2024, 09:09

I don't understand what is being requested in #201. It seems that what is a fact about the -tab- command is being confused with the workings of wildcards in Stata varlists.

-tab- takes either one or two variables in its varlist, that is all. -tab a*- will, in fact, provide a cross tab of the two variables in the data set whose names start with a, if there are exactly two of them. If there is only one, you get a tabulation of that. If there are more than 2, then the command will fail because -tab- only allows two variables.

If what is wanted is to be able to get a series of tabulations of the single variables all of which begin with a, that can easily be done, but with -tab1-, not -tab-.

Code:

tab1 a*

will do that. To get a series of tabulations of the single variables all of which contain a somewhere, that's -tab1 *a*-. And for variables ending in a, -tab1 *a- does it.

No need for regular expressions or resort to Python or other software to do this kind of thing.
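
For name patterns that wildcards cannot express (alternation, say), one possible approach is to loop over the variable names and filter them with the ustrregexm() function. This is only a sketch: the pattern and the macro name matched are illustrative choices, and the auto dataset stands in for real data.

```stata
sysuse auto, clear

* Collect the variables whose names match a regular expression
local matched
foreach v of varlist _all {
    if ustrregexm("`v'", "^(rep|for)") {
        local matched `matched' `v'
    }
}

* In the auto data this picks up rep78 and foreign
display "`matched'"
tab1 `matched'
```

The resulting macro can be passed to any command that accepts a varlist, not just -tab1-.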
5 likes
Comment
Sergiy Radyakin

Join Date: Apr 2014

Posts: 1867
#203

24 Jan 2024, 15:57

A simple request for a convenience feature - to be able to see line numbers in the viewer. Perhaps, as a toggle somewhere in the menu. Sometimes need a quick tool to see a file online:

Code:

view "http://somesite.com/data.csv"

Currently have to download a temp copy, then open in the do-files editor (which doesn't support web files - error 632).
1 like
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2389
#204

24 Jan 2024, 16:31

Also a small request:

Have an option added to -export delimited- which allows for explicit printing of missing values to the output. The only behaviour right now (for system missing) is to print nothing, leaving two consecutive delimiters. Precedent for this exists in the older -outfile-, which will print a period for system missing values. A nice extension of this behaviour would be to allow a user-specified value for those missing values (e.g., -999).

This is useful for the minority of users that need to export datasets for use in programs like MPlus, which require explicit values/missing characters for every value.
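
Until such an option exists, a workaround along these lines may serve: recode the numeric missing values to a sentinel on a temporary copy of the data, export, then restore. This is a sketch under assumptions: -999 must not already occur as a real value (otherwise -mvencode- stops with an error), and the output filename is hypothetical.

```stata
sysuse auto, clear

preserve
    * Recode numeric missings to -999 in the copy that gets exported
    ds, has(type numeric)
    mvencode `r(varlist)', mv(-999)
    export delimited using "auto_missing999.csv", replace
restore
* The data in memory still hold the original missing values
```

String variables are left alone here, so empty strings are still written as empty fields.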
3 likes
Comment
Johan Karlsson

Join Date: Jan 2020

Posts: 25
#205

25 Jan 2024, 06:24

I would really need a command that gives me the coordinates of the edges of an object within a graph. For example, if I have a linear function with an upward slope, I would want to get the coordinates that the line occupies. This would radically increase the options to customize graphs.
Comment
Anthony Killeen

Join Date: Jun 2017

Posts: 13
#206

25 Jan 2024, 19:27

Originally posted by Ariel Linden

I would very much like to see:

(1) a suite of official machine learning tools (there are several user-written commands but only lasso and npregress are official commands, and they certainly don't represent the current standard).

(2) a way to speed up mixed models (all of them). I have projects in which mixed (which is certainly the "fastest" of the bunch) has taken 12 WEEKS (yes, weeks!) to complete. That's ridiculous. And I am using an extremely expensive version of Stata for 12-cores! The slowness of multilevel models leads to bad practices. If it takes me 12 weeks to run a single model using -mixed-, I may use -mixed- for a binary outcome because I know that using -melogit- may take twice as long. I am not sure why this cannot be performed as a parallel process to speed it up, or some other mechanism...

Fingers crossed!

Ariel

You get very little or no benefit from running -mixed- on multicore processors because the command hasn't been "parallelized". See https://www.stata.com/statamp/perfor...ort/report.pdf. But your point also brings up a related topic, which is the requirement to pay additional money for the privilege of running Stata MP on 4- or 8-core machines, which are just consumer-level, basic computers today. If someone was paying extra for a software version that would maximize the power, of say, a 64-core processor (which seems fairly exotic to me), I'd understand it. But 4 or 8 cores are in very basic computers today. There shouldn't be a premium for using all of the computing capacity of a consumer level device.
1 like
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35431
#207

26 Jan 2024, 07:29

#206 raises questions on several levels. We all know that StataCorp is a business and can have views on what is and what is not reasonably priced -- and that boils down to what we want and whether we or our employers are willing to pay.

But Anthony's post is seriously and factually wrong in important details. Stata/MP, for example, is about very much more than supporting more processors explicitly. You are getting support for much bigger datasets and much else. Look through

Code:

help limits

to see many important differences. Coding to support those bigger datasets is immensely more challenging than just changing some constant within internal code. You're paying for what you get.
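
As a rough complement to reading the help file, the c() system values report what the running copy of Stata is and what it allows. The selection of values below is only illustrative, and the numbers shown depend on the edition and settings in use.

```stata
* Which edition is running, and what limits apply in this session
display "Stata/MP: " c(MP) "   Stata/SE: " c(SE)
display "Processors in use: " c(processors)
display "Current maxvar: " c(maxvar)
display "Maximum number of variables (theoretical): " c(max_k_theory)
```

Typing -creturn list- shows the full set of values.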
2 likes
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2389
#208

26 Jan 2024, 07:38

Also in line with #207, we can all think of some alternative system that doesn't license by core (only physical CPU) but the costs are prohibitively greater for most. I would not wish to see the change suggested if it meant that Stata becomes more expensive.
2 likes
Comment
Mario Ferri

Join Date: Jul 2019

Posts: 190
#209

27 Jan 2024, 19:22

The X-13ARIMA-SEATS seasonal adjustment program, and a built-in command for seasonal adjustment. I can't believe there is none!
Comment
Daniel Schaefer

Join Date: Mar 2020

Posts: 810
#210

30 Jan 2024, 15:43

I know I'm just dreaming here but:

A do file editor that underlines things like syntax mistakes or unrecognized commands in red. Better if it could make syntax suggestions in a tooltip on mouseover. Trace is good, but I'd love breakpoints with more tooltips showing me the current content of a macro or scalar on mouseover. Also, it would be great if I could manipulate and inspect state with the console while stopped on a breakpoint. Oh, and line numbers.

I've been spoiled by fully featured IDEs (and MATLAB) but I honestly love these features and would be ecstatic if they were (even partially) incorporated in the do-file editor. It doesn't have to be visual studio, but what about visual studio code? (Please don't make me write my own VSC plugin).
4 likes
Comment
