Wishlist for Stata 17

Nick Cox

Join Date: Mar 2014

Posts: 34925
#481

28 Mar 2021, 06:26

#480 On statsby

Code:

label data "`e(cmd)'"

puts the command string into the metadata; where would you like it to go?

Last edited by sladmin; 29 Mar 2021, 09:05. Reason: close CODE tag
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1361
#482

28 Mar 2021, 07:22

I’d like to be able to put it into a variable (as well as other information). If it were possible to pass a string as an expression (which is legal according to the help file), it wouldn’t be difficult to add a variable:

Code:

statsby model="my first model" _b _se...

It is possible to do that with post, but post imposes a limit on the size of string variables, which makes it more difficult to store other information useful for simulations (e.g., the pseudorandom number generator state).
Comment
William Lisowski

Join Date: Dec 2014

Posts: 10150
#483

28 Mar 2021, 08:00

... post imposes a limit on the size of string variables ...

In Stata 16.1 from the output of help post we see

Note that newvarlist does not allow strL as the variable storage type. A similar utility that allows strL as a variable storage type is [P] frame post.
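
As a rough illustration of the frame post route that the help file points to, here is a minimal sketch for Stata 16 or later. The frame name results, the variable names, the seed, and the three-replication bootstrap loop are all assumptions made up for this example; they do not come from the posts above. It stores a model label and the random-number generator state, which is several thousand characters long, in strL variables.

```stata
* Sketch only: frame name, variable names, and loop are illustrative
version 16
sysuse auto, clear

* strL is allowed as a storage type in the newvarlist of frame create
frame create results strL model strL rngstate double b double se

set seed 12345
forvalues r = 1/3 {
    preserve
    bsample                      // bootstrap resample of the data in memory
    quietly regress mpg weight
    * c(rngstate) is a long string; it fits because rngstate is strL
    frame post results ("my first model") (c(rngstate)) (_b[weight]) (_se[weight])
    restore
}

frame results: describe
frame results: list model b se
```

Because everything stays in memory in one frame, nothing has to be written to disk between replications, though this does not help collaborators who are still on Stata 15 or earlier.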
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1361
#484

29 Mar 2021, 07:03

William Lisowski
The challenge with that is that you can’t yet append frames without writing them to disk as separate files (and I work with others who may not have Stata 16 yet). It isn’t clear why post isn’t able to allow strL types; in the worst case scenario, it could write all strings as strLs under the hood and then on post close it could try to optimize the storage either by collating the strLs or recasting them as a string with X characters.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29587
#485

29 Mar 2021, 12:12

Re #484. There is no official Stata command for appending frames, but Daniel Fernandes and Roger Newsom have written -frameappend- to do that; it is available from SSC.

The problem of working with people who have not yet upgraded to version 16 is a substantial one; I find myself in the same position at times. While it would be convenient for the old -post- to support strLs, I suspect that StataCorp will not want to invest time into implementing that. It is clear that frames are poised to largely replace -postfile-s, and in due time, all active Stata users will catch up to version 16 or beyond. In fact, even if they were to implement this as of version 17 (which is what this wishlist is about) anyone with access to that would have access to frames.
2 likes
Comment
Mead Over

Join Date: Sep 2014

Posts: 110
#486

29 Mar 2021, 16:35

FernandoRios , in response to your suggestion in post #470 here,

[C]ould [it] also be possible to add the option "legend" for graph combine? (so it is easy to add a single legend to figures)

I wonder what the syntax might look like for a legend option on gr combine. For complete flexibility, the syntax should be able to pull any key from any of the component panels for use in the combined legend. Perhaps the order sub-command:

Code:

order(1 "Miles per gallon" 2 "Length in inches" 3 "Price in dollars")

could be generalized to refer to specific keys in specific sub-graphs like this:

Code:

order(1.1 "Miles per gallon" 2.1 "Length in inches" 3.1 "Price in dollars")

where 1.1, 2.1 and 3.1 refer to the first keys respectively in sub-graphs 1, 2 and 3.

I’m not sure this would be a good idea. In general, I think a multi-panel graph should only be used when it makes a point that could not be made in a single graph, and then the differences across the panels should be as small as possible. A virtue of using graph…, by(varname) to construct a multi-panel graph is the guarantee that the x- and y-axes are identical across all sub-graphs. When a user needs a multi-panel layout not possible with graph…, by(varname), the objective of maximizing readability suggests combining sub-graphs which all contain the same legend keys.

A premise of the utilities grc1leg and grc1leg2 is that the legend from one of the component sub-graphs contains all the keys necessary for a legend to the combined graph. However, having spent far too much time reverse engineering grc1leg by Vince Wiggins (StataCorp), I was curious whether my generalization of his program, grc1leg2, could be tweaked to combine keys from different sub-graphs. Now, in version 1.42 of grc1leg2, I think I’ve found a way to do that with a newly added option, hidelegendfrom. The "trick" I use for making a combined legend for a multi-panel combined graph having K panels is to make a (K + 1)st graph containing all the keys and then use its legend for the combined graph, while hiding the (K + 1)st graph from view. (Has someone already posted this "trick"?)

Code:

sysuse auto, clear
set graph off

twoway ///
    (scatter mpg weight, mcolor(blue)), ///
    name(panel1, replace)

twoway ///
    (scatter length weight, mcolor(red)), ///
    name(panel2, replace)

twoway ///
    (scatter price weight, mcolor(green)) ///
    (lfit price weight, lcolor(green)), ///
    name(panel3, replace)

twoway /// This is the dummy graph from which we take the legend
    (scatter mpg weight, mcolor(blue)) ///
    (scatter length weight, mcolor(red)) ///
    (scatter price weight, mcolor(green)) ///
    (lfit price weight, lcolor(green)), ///
    name(panel4, replace)

set graph on

grc1leg2 panel1 panel2 panel3 panel4, ///
    title("Assemble the legend keys from different panels" "to construct the combined legend") ///
    xtob1title legendfrom(panel4) hidelegendfrom ///
    pos(4) ring(0) lxoffset(-10) lyoffset(15) ///
    name(grc1hide, replace)

Of course, one could achieve the same effect with Stata's gr combine by appropriately modifying the legend commands on the component sub-graphs and then using Stata's graph editor to hide panel4. But perhaps some users will appreciate using grc1leg2 to save some time and effort. Since these combined graphs always need to be tweaked, using grc1leg2's dialog to tweak the legend organization, position and offset options (and to apply common titles as appropriate) can be a particular time-saver. An official StataCorp legend option for gr combine, with an accompanying dialog, would make this discussion moot. But the syntax diagram might be unwieldy.

Last edited by Mead Over; 29 Mar 2021, 16:45.
2 likes
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2372
#487

29 Mar 2021, 20:17

Hi Mead
Thank you for that trick!
My thoughts about legend were a bit more simplistic.
1. I would keep legend using the same syntax as it has right now. The biggest difference would be to count all the subelements of a figure the same way that they are currently counted when using "twoway"
For example, say I have 2 figures:
scatter y1 y2 x, name(m1)
scatter z1 z2 x, name(m2)
I could combine them as:
graph combine m1 m2, legend(order(1 "y1" 2 "y2" 3 "z1" 4 "z2"))

Right now, for example, this is how it works with twoway:
two scatter y1 y2 x || scatter z1 z2 x, legend(order(1 "y1" 2 "y2" 3 "z1" 4 "z2"))

In the second case, of course, all dots would be on the same figure.

2. I actually managed to do something like this using a 3rd plot, that also only contains the legend, but that could be set nicely on the bottom of the figures.

webuse iris

twoway scatter seplen sepwid if iris==1 || scatter seplen sepwid if iris==2 || scatter seplen sepwid if iris==3 , legend(off) name(m1, replace)
twoway scatter petlen petwid if iris==1 || scatter petlen petwid if iris==2 || scatter petlen petwid if iris==3, legend(off) name(m2, replace)

twoway scatter petlen petwid if iris==0 || scatter petlen petwid if iris==0 || scatter petlen petwid if iris==0 , ///
xlabel(,noticks nolabel) ylabel(,noticks nolabel) plotregion(lstyle(none)) ///
legend(order(1 "setosa" 2 "versicolor" 3 " virginica") symysize(50)) name(m3, replace) ///
graphregion(margin(none)) plotregion(margin(zero)) fysize(12) xtitle("") ytitle("")
graph combine m1 m2 m3, col(1) iscale(.9) scale(.8) ysize(14) xsize(12)
Comment
Marc Kaulisch

Join Date: Jan 2016

Posts: 177
#488

30 Mar 2021, 02:34

Another graph-related feature wish: if a graph scheme sets an outline width (e.g. linewidth pbar), the width should be calculated proportionally to the bar height. At the moment it appears that the outline width is applied globally. In my case the outline was taller than the bar (see https://www.statalist.org/forums/for...29#post1599529). Thus I have to set the linewidth to none manually (or keep a second version of the scheme with linewidth pbar none), but someone who is not aware of this problem will end up with a graph in which the bar heights look disproportionate.
Comment
wbuchanan

Join Date: Mar 2014

Posts: 1361
#489

30 Mar 2021, 07:29

Clyde Schechter
I’ll definitely take a look at it, but it seems like it would still create a noticeable bottleneck to combine the results over a larger and larger number of replications and/or a large number of replications with a large number of variables/estimates to combine. If nothing else, I think the performance gains of supporting some SQL based operations for tasks like this could be useful/helpful (particularly since SQL engines have already solved this problem effectively).
Comment
John Mullahy

Join Date: Dec 2016

Posts: 722
#490

01 Apr 2021, 17:20

This seems so obvious that I've quite possibly missed something equivalent that already exists, but an immediate version of –twoway histogram– would often be great to have. For example:

Code:

twoway histi 0 .4 1 .3 2 .2 3 .1, [standard twoway hist options]

would produce bars of heights .4, .3, .2, .1 at x-values 0, 1, 2, 3.
Comment
Mark Davis

Join Date: May 2014

Posts: 46
#491

01 Apr 2021, 21:39

I thought of a few things recently I would love to see. Sorry if these have already been requested.
I am hoping for Stata to be able to use Apple silicon's "neural network", AKA matrix math coprocessor, to speed up certain commands. I think there could be massive gains in Mata and in commands that use Mata. I've taken a look at the current implementation of the neural net, and the Apple libraries allow you to send multiple linear regressions only. It doesn't look too open, but there is a lot of potential.

I also think it would be wise to create some sort of macOS version of the server software. Stata runs so much faster on Apple silicon than on x86 architecture, and the Apple chips have much better single-core performance than typical server chips, which lower clock speeds for thermal reasons on high-core-count chips. Apple has some 8-32 performance-core versions of their chips in development. I could see institutions with servers wanting to transition to Apple silicon to get 33-100% better server performance per core. The rumor is that Apple is developing some sort of Mac Pro mini workstation.

Lastly, being able to load large CSVs in parallel. I feel like there must be an effective way of having multiple cores load a CSV file in parallel. It kills me when my eight-core license is waiting for an hour while one core loads in the data. I'm sure the development team has thought of this before and there is a good reason, but parallel CSV reading would be a $100 feature for me. I think a lot of researchers using large healthcare claims data would appreciate this as well.

Please, StataCorp, don't make Stata 17 great. I just bought a Stata 16 MP8 license to be able to use Apple silicon, and I can't afford for you to come out with any must-have features.

Last edited by Mark Davis; 01 Apr 2021, 22:21.

Owner of StataTutor.com
2 likes
Comment
John Mullahy

Join Date: Dec 2016

Posts: 722
#492

02 Apr 2021, 07:06

Re my posting #490: Using

Code:

twoway scatteri .4 0 .3 1 .2 2 .1 3, recast(bar) ysc(r(0 .4))

delivers pretty much what's desired. That said, a dedicated –twoway histi– command would still be nice to have.
2 likes
Comment
Nick Cox

Join Date: Mar 2014

Posts: 34925
#493

02 Apr 2021, 08:17

#490, #492 John Mullahy Please post that also as a separate thread. (The duplication is, I think, justified here, as a suggestion for Stata 17 is also of interest to many people regardless of whether Stata 17 turns out to include it or they upgrade to Stata 17.)

I have other suggestions that don't depend on what StataCorp does.
3 likes
Comment
Erik Ruzek

Join Date: Oct 2017

Posts: 395
#494

02 Apr 2021, 14:57

Originally posted by Mark Davis

Please, StataCorp, don't make Stata 17 great. I just bought a Stata 16 MP8 license to be able to use Apple silicon, and I can't afford for you to come out with any must-have features.

Back in 2017 when they released Stata 15, I had purchased Stata 14 a couple of months prior. After they announced Stata 15 I got an email from the sales team saying that because I had recently purchased 14, I was eligible for a free update to 15. I was blown away and happy to accept that upgrade. This was when you paid more money up front to get the full version, which is my preferred option. I don't know if that is still their practice as they seem to want everyone to go to a subscription model, but there's at least hope!
Comment
Nicholas Winter

Join Date: Mar 2014

Posts: 122
#495

02 Apr 2021, 16:30

Add the export excel option firstrow(variables|varlabels) to export delimited, so the first row of the exported file has the variable labels rather than the variable names. As it stands I'm using this ridiculous work-around: https://www.statalist.org/forums/for...01#post1537701

Seems like this should be relatively easy to implement....
1 like
Comment
