Wishlist for Stata 17

Clyde Schechter

Join Date: Apr 2014

Posts: 29963
#436

31 Jan 2021, 15:23

You can scroll through commands in the Stata command window using the PageUp and PageDown keys (at least in Windows--I would bet there's something like that available for Mac and Unix, too.)
1 like
Comment
Justin Niakamal

Join Date: Aug 2017

Posts: 757
#437

31 Jan 2021, 20:06

I would like to see some form of three-dimensional plots introduced into official Stata. Personally, I find there are limitations to the existing community-contributed three-dimensional plots. I know this can be accomplished using the Python integration (as shown on the Stata Blog), but I would love to be able to use marginsplot as illustrated below.

Code:

webuse "nhanes2.dta", clear
logit highbp weight c.age##c.age
margins, at(age=(20(5)80) weight=(40(5)180))
marginsplot, xdim(weight) ydim(age) zdim(highbp)

The example above makes use of surface (SSC), which is great but is limited in color capabilities. Ideally, I would like the coloring to mimic the behavior of Ben Jann's heatplot.

Last edited by Justin Niakamal; 31 Jan 2021, 20:09.
2 likes
Comment
Christopher Bratt

Join Date: May 2019

Posts: 144
#438

06 Feb 2021, 09:31

Justin Blasongame, #437: If you rely on advanced graphics, I would recommend calling R or Python from within Stata. Stata now integrates with Python, and it is easy to run R from within Stata (by using the rsource package for Stata). I don't think you can expect Stata to develop plotting functions as advanced as those found in more specialised programs/packages, and R specifically should give you more advanced plotting than what you will get in Stata.
Comment
Justin Niakamal

Join Date: Aug 2017

Posts: 757
#439

06 Feb 2021, 10:52

Re #438, I respectfully disagree. Right on Stata’s front page, you’re greeted with both “Your data tell a story” and “Visualize”. In my view, asking for enhancements to Stata’s graphics capabilities is a perfectly reasonable request, given that Stata markets itself as statistical software for data science and visualization is generally an important step in data analysis. Others on this thread have expressed similar sentiments (see #402). Yes, I have used both rsource and the Python integration, but if the argument is that X functionality exists in R and/or Python, then why bother to use or learn Stata if you'll just direct someone to another software when improvements or enhancements are requested?
1 like
Comment
Christopher Bratt

Join Date: May 2019

Posts: 144
#440

07 Feb 2021, 08:13

I was just trying to be of help.

if the argument is X functionality exists in R and/or Python, then why bother to use or learn Stata if you'll just direct someone to another software when improvements or enhancements are requested

Stata has its strengths, other programs have their strengths. Data scientists will probably benefit from using more than one software/language. Stata now integrates with Python. RStudio makes a huge effort to integrate R and Python.

Of course, not everyone wants to invest the time to learn several languages, and that decision can in some cases be wise. But it normally benefits the researcher to use more than one application/language. I am sorry if this view came across as offensive; it was certainly not meant to be.
1 like
Comment
Justin Niakamal

Join Date: Aug 2017

Posts: 757
#441

07 Feb 2021, 13:28

Hi Christopher Bratt

No offense was taken; I apologize if my response came off as snooty. I’ll clarify my position. I would like Stata to put some focus on two issues: a) speed and b) graphics. I won’t go into detail on speed, as others have already expressed their interest in improvements to some of Stata’s commands.

I do think Stata needs to put a little more focus on graphics, especially since all Stata 16 added in terms of graphics was point sizes. Current and former colleagues of mine have balked at Stata’s graphics and typically describe them as "ugly". You can also find critiques on social media regarding Stata’s default color scheme. Most of the experienced posters here don’t even rely on Stata’s default scheme and instead use s1color (at the very least, I think StataCorp should set this as the default if they choose not to update their look). I used Stata for a while before I even realized schemes were a thing, and I’m sure a lot of people who use Stata casually aren’t aware of the flexibility of Stata’s graphics system.

There are a lot of obscure statistical routines that I would personally like added, but I think it makes sense to use rsource, rcall, or the Python integration where those are available, since it might be asking too much for Stata to add something that is too niche to a particular field. I think graphics are a little bit more universal. Put another way, someone oriented in economics might not use the same suite of commands as someone in, say, biostatistics, but they’ll both likely make use of Stata’s graphics. I’m not a fan of the new(er)-age data visualizations like donut plots or radar charts (I still don’t know how to interpret those), but I think three-dimensional plots would be a welcome addition, given that the community-contributed three-dimensional charts are, in my view, limited.

Justin
2 likes
Comment
Christopher Bratt

Join Date: May 2019

Posts: 144
#442

08 Feb 2021, 02:19

Justin Blasongame, my position is that Stata excels at using human-readable and easy-to-remember code (in simple operations, at least). I would love to use Stata, but things that are important to me are missing in Stata (speed in SEM/GSEM, reproducible research that matches that of R Markdown, results from analyses that are easy to manipulate for graphs or tables). I mentioned that in an earlier post here in this thread. So personally, I hardly use Stata these days, but I still admire the elegance of the Stata language.

I would still encourage anyone to try Python or R for graphics, and I encourage anyone not allergic to programming to learn more than one language. Even the R people recommend being open to Python. Specifically now that StataCorp has made the decision to build bridges to Python, acknowledging that StataCorp cannot offer everything, I would start combining Stata with Python.

Best wishes.
Christopher
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#443

08 Feb 2021, 14:40

This builds on my post #401, where I proposed adding a HAC option to regress so that the very common Stata command used for OLS is still used for time series regression. Based on questions I've seen on the Forum, it would be very handy to have HAC options for many more commands. Fortunately, this is allowed for glm, and this covers several cases of interest, but it doesn't cover all cases. For example, it is perfectly fine to use a Tobit model or a two-part model for count outcomes with time series data. But one needs to adjust the standard errors for serial correlation, in general. It is well-known how to do this and I assume glm uses HAC applied to the score of the objective function.

As an example, the following Stata commands would be useful:

Code:

tsset t
tobit y x1 ... xk, ll(0) vce(hac nw 4)

and more generally

Code:

tsset t
tobit y x1 ... xk, ll(0) vce(hac kernel #)

I think the absence of such an option makes researchers think that such models and MLE cannot be applied to time series data whereas all estimation methods can be under the appropriate weak dependence assumptions. Of course, it is not "full" MLE if we ignore the dependence, but it is MLE conditional on the x(t) we have chosen.

As I said, applying the HAC formulas to the score of the log-likelihood or quasi-log likelihood is fairly easy. My 1994 Handbook of Econometrics chapter, "Estimation and Inference for Dependent Processes," covers the general case. There are likely other sources.
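For reference, here is a minimal sketch of what official Stata already allows: newey for the linear case and glm with its vce(hac ...) option. The data are simulated and every variable name (t, u, x, y, cnt) is made up for illustration. The lag length of 4 simply mirrors the proposed tobit call above and is not a recommendation.

```stata
* Sketch on simulated data; all variable names are illustrative assumptions
clear
set seed 12345
set obs 200
generate t = _n
tsset t

* AR(1) errors, built recursively so each value depends on the previous one
generate u = rnormal()
replace u = 0.5*u[_n-1] + rnormal() in 2/l
generate x = rnormal()
generate y = 1 + 2*x + u

* OLS point estimates with Newey-West standard errors, 4 lags
newey y x, lag(4)

* Same linear model through glm with a Newey-West HAC variance, 4 lags
glm y x, family(gaussian) link(identity) vce(hac nwest 4)

* Count outcome: Poisson quasi-likelihood with HAC standard errors
generate cnt = rpoisson(exp(0.5 + 0.3*x))
glm cnt x, family(poisson) link(log) vce(hac nwest 4)
```

Note that glm spells the Newey-West kernel nwest, whereas the tobit syntax above is a proposal and does not currently run.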
4 likes
Comment
Brian Poi

Join Date: Feb 2021

Posts: 22
#444

09 Feb 2021, 08:57

If I could have one thing in the next version of Stata, it would be faster graphics. Starting from scheme s1color, I can make a graph look how I want it easily enough through trial and error. What can be frustrating, though, is how long it takes anything but the simplest graph to render. Moreover, once I have one graph, chances are I'll want to see it for, say, all 50 states in separate tabs or as a by-graph. That's when the rendering speed really becomes a drag. In contrast, when I am forced to go to the dark side and use R, I use the base graphics package; graphs appear instantly there. And I really don't like the idea of having to use Python or R inside of Stata; I'd rather use the Stata language I know and love.
5 likes
Comment
Matthew Lala

Join Date: Aug 2018

Posts: 34
#445

09 Feb 2021, 14:09

I would like better debugging/error messages for the syntax command when writing user-written programs. Consider

Code:

cap program drop debugme1
program define debugme1
    syntax [, opt1(string) opt2(integer)]
    di "The program worked."
end

cap program drop debugme2
program define debugme2
    syntax [, opt1(string "test") opt2(integer 1)]
    di "The program worked."
end

Those familiar with the syntax command will know that neither debugme1 nor debugme2 will work. This is because one is required to specify a default value for optional integer arguments, but is not allowed to do the same for optional string arguments. Perhaps because I do not understand the logic of this different treatment of integer and string inputs, I often forget this rule when I have been away from Stata for a while. I then tear my hair out when in both cases Stata simply returns the uninformative "invalid syntax", when you would think it would be quite easy to note that the error is with opt1 or opt2 (as it does with compulsory arguments), and perhaps even indicate the nature of the error!

These are but two examples of uninformative invalid syntax errors that come to mind at the moment; I think the error messages for syntax could in general be much improved.
2 likes
Comment
JanDitzen

Join Date: Jan 2015

Posts: 348
#446

11 Feb 2021, 00:03

I support the idea of having a default value for string. Very often I have additional lines in my code which set a local with a default value for a string.

One could argue that having a default value for a string option is bad programming or can be prevented because the following lines will have "if "`opt1'" == "test"" instead of "if "`opt1'" == "" ". However, I can name several examples in which a default option makes subsequent coding easier. This is in particular the case if Mata code is called.
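The extra lines mentioned above might look like the following sketch. The program name debugme3 and the default values are hypothetical and chosen only to parallel the debugme1/debugme2 examples in #445. The integer default goes inside the option's parentheses, while the string default is filled in afterwards.

```stata
* Hypothetical program name; illustrates the workaround only
capture program drop debugme3
program define debugme3
    * Optional integer: a default (here 1) must be given inside the parentheses
    * Optional string: no default is allowed, so the local is empty if omitted
    syntax [, opt1(string) opt2(integer 1)]

    * Emulate a string default by filling in the local afterwards
    if "`opt1'" == "" local opt1 "test"

    display "opt1 = `opt1', opt2 = `opt2'"
end

* Call once with the defaults and once with both options specified
debugme3
debugme3, opt1(hello) opt2(5)
```

The first call falls back on both defaults; the second overrides them. The cost is one extra line per string option, which is exactly the boilerplate a built-in string default would remove.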
Comment
John Mullahy

Join Date: Dec 2016

Posts: 743
#447

23 Feb 2021, 11:14

I tweeted a thread earlier today about problems that can arise with dummy RHS variables in Poisson regression and the fact that Stata produces an apparently converged estimate when it should not be able to produce an estimate. (Note: The problem is more general than Poisson regression.)
https://twitter.com/JohnMullahy/stat...72211692036096

I'm wondering whether v17 might consider building in errors (or at least warnings) akin to those that exist for probit and logit

Code:

outcome = x2 <= 0 predicts data perfectly
r(2000);
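A minimal way to reproduce the kind of situation described is sketched below on simulated data. The variable names (d, y, pos) and the design are assumptions made for illustration and are not taken from the linked thread. The outcome is zero whenever the dummy equals one, so the Poisson likelihood keeps increasing as the coefficient on d goes to minus infinity and no finite maximum likelihood estimate exists.

```stata
* Simulated data; variable names are illustrative assumptions
clear
set seed 2021
set obs 500
generate byte d = runiform() < 0.2

* y is always 0 when d == 1, so the Poisson MLE for the coefficient on d
* does not exist (the likelihood increases as the coefficient goes to -infinity)
generate y = cond(d, 0, rpoisson(2))

* Check the separation directly: y should be 0 in every row with d == 1
tabulate d, summarize(y)

* Poisson: inspect whether convergence is reported and what is shown for d
poisson y d

* Logit on a binary version of the outcome, to compare how the
* perfectly predicted group is handled there
generate byte pos = y > 0
logit pos d
```

Comparing the two estimation outputs side by side shows how differently the two commands treat the same separation problem.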
6 likes
Comment
Dirk Enzmann

Join Date: Apr 2014

Posts: 524
#448

23 Feb 2021, 14:21

John Mullahy : I think it is not nice to refer to twitter or any other social media posts as there may be readers of the Stata Forum who do not (want to) use some of these social media channels.
1 like
Comment
John Mullahy

Join Date: Dec 2016

Posts: 743
#449

23 Feb 2021, 14:46

Re: #448: Here is a link to the twitter thread referred to in #447 that does not require that you enter the twitter environment.

https://threadreaderapp.com/thread/1...692036096.html
2 likes
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 3001
#450

24 Feb 2021, 06:10

Originally posted by John Mullahy

I tweeted a thread earlier today about problems that can arise with dummy RHS variables in Poisson regression and the fact that Stata produces an apparently converged estimate when it should not be able to produce an estimate. (Note: The problem is more general than Poisson regression.)
https://twitter.com/JohnMullahy/stat...72211692036096

I'm wondering whether v17 might consider building in errors (or at least warnings) akin to those that exist for probit and logit

Code:

outcome = x2 <= 0 predicts data perfectly
r(2000);

I could not agree more; I was hoping this article would lead to that, but it has been 10 years... The excellent user-contributed ppmlhdfe command fixes all those problems and it could essentially replace the very unreliable poisson command.
4 likes
Comment
