Wishlist for Stata 18

wbuchanan

Join Date: Mar 2014

Posts: 1361
#406

17 Jun 2022, 06:28

Fahad Mirza, speaking only as one of several folks who've done some work in that vein: I've not been actively maintaining some of the older stuff (e.g., D3mata), purely because it would be too time-consuming to keep up with changes to the JavaScript library. That said, I have thought about throwing together a wrapper around Bokeh and/or Altair, but haven't made the time for it. If anything, I would say that is probably the best route at this point, versus trying to build something Stata-native.
1 like
Michael Stepner

Join Date: Jul 2014

Posts: 56
#407

17 Jun 2022, 09:07

Request:
I'd like to see Stata 18 add a new 64-bit integer variable type.

It would also be nice to have a binary display format for string variables, akin to %16H / %16L for doubles and %8H / %8L for floats.
Perhaps %64S for base64? And/or %16S for base16/hexadecimal?

Issue:
Stata's data types (help data_types) support 32-bit integers (long). But Stata does not support variables containing 64-bit integers (often called bigint in other software).
This seems like a vestige from when Stata was designed for 32 bit systems.
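To make the gap concrete, here is a minimal sketch (not part of the original request) of what happens today when an integer needs more than 53 bits. It relies only on two documented limits: long tops out at 2,147,483,620, and double represents integers exactly only up to 2^53. The variable names id and id_str are made up for illustration.

```stata
* double holds integers exactly only up to 2^53 = 9,007,199,254,740,992
display %21.0f (2^53)
display %21.0f (2^53 + 1)

* a 16-digit identifier just above 2^53 is silently rounded when stored as double
clear
set obs 1
generate double id = 9007199254740993
format id %21.0f
list id

* the usual workaround today: keep such identifiers as strings (id_str is illustrative)
generate str16 id_str = "9007199254740993"
list id id_str
```

The string workaround preserves every digit but gives up arithmetic and numeric sorting, which is where a native 64-bit integer type would help.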

Context for why this would be useful:
Statalist — Missing data type: 64 bit integer variable
7 likes
Truong Quoc Phan

Join Date: Mar 2019

Posts: 30
#408

20 Jun 2022, 02:52

I would like

Code:

suest

to work with

Code:

xtreg

It is rather annoying to have to use the workaround.
Seyi Soremekun

Join Date: Dec 2019

Posts: 5
#409

24 Jun 2022, 10:44

Reposting from a separate thread:

Stata 17 suggests autocompletion options when you type a few letters in a do file.
I don't want to turn this nice feature off; however, I would like it not to select the first option in the autocomplete list when I press 'enter'. My primary use of 'enter' is to move to the next line, but when autocomplete is on, the result is random unwanted words autotyped into my do-files.

I have read a few previous threads, but none seems to have a solution to my problem (apologies in advance if I've missed a thread). It seems the only option is to disable autocomplete entirely, but perhaps a better compromise could be implemented in Stata 18.
1 like
daniel klein

Join Date: Mar 2014

Posts: 3818
#410

04 Jul 2022, 03:46

This is (probably) a small one: Could we have a Mata function that directly accesses the dataset label? Something along the lines of st_varlabel(), st_varvaluelabel(), etc. I imagine it is rarely needed but if it is, the workaround through Stata's extended macro functions is slightly inconvenient.
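For readers who have not run into this, one form the workaround can take is sketched below. It is only an illustration: get_datalabel() and the local macro name __dtalabel are hypothetical, not built-in, and the sketch assumes that a local set through stata() is readable with st_local() in the same calling context.

```stata
sysuse auto, clear

mata:
// get_datalabel() is a hypothetical helper name, not a built-in Mata function
string scalar get_datalabel()
{
    // ask Stata for the label through the extended macro function ...
    stata("local __dtalabel : data label")
    // ... then read the local macro back into Mata
    return(st_local("__dtalabel"))
}

get_datalabel()
end
```

The detour through a local macro leaves a stray macro behind in the caller, which is part of why a direct function would be tidier.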
1 like
FernandoRios

Join Date: Apr 2014

Posts: 2429
#411

07 Jul 2022, 10:34

Something I have been working on, but which would be really nice to have as a native command, is the ability to draw areas where only the higher, but not the lower, line is visible.
For example, I can do this with the "area" and "rarea" commands:

But what I would like to get with some options in "area" or "rarea" is this:

Yes, I can construct this using area and line, but I would like to have a single command that creates it, if possible.

(I do hope I'm not simply ignoring a currently available option)
3 likes
daniel klein

Join Date: Mar 2014

Posts: 3818
#412

07 Jul 2022, 15:28

net query should store results in r().

As a programmer, I want to be able to write code like

Code:

quietly net query
local other "`r(other)'"
net set other PLUS
net get ...
net set other "`other'"

Edit:

Alternatively, net settings could be stored in c().

Last edited by daniel klein; 07 Jul 2022, 15:30.
3 likes
Clyde Schechter

Join Date: Apr 2014

Posts: 29936
#413

11 Jul 2022, 15:23

I suggest that the error message "too few quotes" be changed to "unmatched quotes" or something like that. Stata issues this message whenever it encounters unbalanced quotes. But it really has no way of knowing if the unmatched quote is missing a mate, or if it is a surplus quote that shouldn't be there in the first place. In my own work, I am more likely to introduce a stray quote when aiming my 5th finger for the "enter" key than I am to forget to provide, or inadvertently delete, a mate to a quote. "Unmatched quotes" would be accurate in either case. Evidently this isn't a big deal, but still.
9 likes
patricio cuaron

Join Date: Jul 2022

Posts: 6
#414

16 Jul 2022, 15:03

Originally posted by Nils Enevoldsen:

I'd like a native compressed DTA file format.

For my DTAs larger than a couple dozen megabytes, zstd typically compresses by about 90% in a fraction of a second at the standard level. For smaller DTAs, the typical compression I see is about 50%. (This is after gathering the low-hanging fruit of casting variables to appropriate types.) Fancier variations are possible, but it seems pretty straightforward to support a “.zdta” format that is literally just a compressed .dta.

+1 to this, and I cannot believe no one else brought it up. Even with fast NVMe SSDs it is a pain to work with multiple multi-GB tables, particularly when collaborating in an academic setting (i.e., using Dropbox or something like it).

Other requests:
Enable triple-click selection of whole words in the Results window.

Enable merges and frame links on differently named variables, as many have requested before.

Adopt gtools (https://gtools.readthedocs.io/en/). The syntax for gcollapse, greshape and gquantiles is so much better (and the latter is so much faster than the egen alternative...!)

Append frames.

Enable multiple browser windows for different frames.

Add a frames list to the right of the Results window, below the variables list. Ideally it would allow the user to open a frame in a browser window with just a double click.

Adopt/sponsor VS Code with the Stata Enhanced, stataRun and, most importantly, StataLanguageServer extensions. This would enable a leap in capabilities for code editing at an absolutely minimal investment for StataCorp.

Keep improving performance. Very important. A lot of important commands are still slowwwww (e.g., egen xtile).
2 likes
John Riveros

Join Date: Sep 2018

Posts: 43
#415

17 Jul 2022, 08:46

Regarding reg3, or three-stage least squares, it would be great if robust standard errors (sandwich-type, Driscoll-Kraay, and HAC) were implemented, along with one-way and two-way fixed effects as an option. This would make reg3 much more useful, in particular for panel data analysis.
1 like
Nihat Mugurtay

Join Date: Apr 2018

Posts: 103
#416

18 Jul 2022, 06:58

As a newcomer,

It would be great if Stata had a more flexible syntax, like SQL. Especially when it comes to logical operators, syntax flexibility provides a more user-friendly environment. For instance, "and", "AND" or "&" might be used interchangeably. Also, there could be an option for merging more than two datasets, like this (it is trivial, but also just brainstorming):

Code:

merge 1:1 ccode year using "data1", gen(merge123) &
merge 1:1 ccode year using "data2", gen(merge1234) &
merge 1:1 ccode year using "data3", gen(merge12345)

Each generated "_merge" element would still provide information about what is going wrong.

The third one is related to working with different tabs. Each time, I open a new Stata 17 window and load data. There are always at least three different Stata windows open, which sometimes leaves me confused about which data I am saving or working with.

Best,
Nick Cox

Join Date: Mar 2014

Posts: 35400
#417

19 Jul 2022, 05:15

and and AND as logical operators? Please no. One of several considerations is that they are legal variable and scalar names.

Two or more merges at once? That really isn't trivial. Better to write your own loop.
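For anyone wondering what such a loop might look like, here is a minimal sketch. The key variables ccode and year come from #416; the temporary files, the x variables and the _merge names are made up so that the example runs on its own.

```stata
* build three small example files keyed on ccode and year (illustrative data only)
tempfile f1 f2 f3
forvalues j = 1/3 {
    clear
    set obs 4
    generate ccode = _n
    generate year = 2020
    generate x`j' = `j' * ccode
    save "`f`j''"
}

* start from the first file, then merge the others one at a time
use "`f1'", clear
forvalues j = 2/3 {
    merge 1:1 ccode year using "`f`j''", generate(_merge`j')
    tabulate _merge`j'
}
```

Each pass leaves its own _merge variable behind, so the match results of every step can still be inspected afterwards.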
1 like
wbuchanan

Join Date: Mar 2014

Posts: 1361
#418

19 Jul 2022, 06:25

Nick Cox
While I agree with you about the semantics/syntax recommendation from Nihat Mugurtay, I think the merge suggestion may have been missing some additional context. I'm assuming that Nihat Mugurtay's reference to things being trivial means that this is a solved problem in the SQL world. Personally, I think having a more SQL-esque data management/manipulation language could be useful, but I imagine it would not be terribly easy to implement using the Stata internals, and it would not be nearly as efficient, because some of the structures that enable SQL's efficiencies (e.g., indices, pages, etc.) don't exist in Stata.
1 like
Bruce Weaver

Join Date: May 2014

Posts: 1116
#419

19 Jul 2022, 08:30

After reading (and responding to) this discussion from 2017, I suggest that commands like -summarize- and -tabstat- include an option for reporting "descriptive" versions of the variance and SD (i.e., with n rather than n-1 in the denominator). But the default should remain the "inferential" versions that use n-1 in the denominator.
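In the meantime, the "descriptive" SD can be recovered by rescaling what -summarize- already returns, since SD_n = SD_(n-1) * sqrt((n-1)/n). A minimal sketch using the auto dataset (the scalar names sd_n1 and sd_n are just illustrative):

```stata
sysuse auto, clear
quietly summarize price

* r(sd) uses n-1 in the denominator; rescale to get the n-denominator version
scalar sd_n1 = r(sd)
scalar sd_n  = r(sd) * sqrt((r(N) - 1) / r(N))

display "SD (n-1 denominator): " sd_n1
display "SD (n denominator):   " sd_n
```

The same factor, squared, converts r(Var) to the n-denominator variance.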

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
1 like
Nihat Mugurtay

Join Date: Apr 2018

Posts: 103
#420

19 Jul 2022, 10:59

Nick Cox, @wbuchanan,

I understood #417 on logical operators. Thank you for your comments. #418 has useful remarks, and that is why I sometimes migrate to MySQL for data management/manipulation (multiple merging, more flexible syntax, user interface).