Wishlist for Stata 19

JanDitzen

Join Date: Jan 2015
Posts: 348

#256

15 Apr 2024, 05:34

The program below does the trick. By default it returns the variable list without the base-level marker, i.e. without the indicator that the variable is to be omitted ("0b." becomes "0."). It cannot handle base categories other than "0".

Code:

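program define fvexpand2 , rclass
    syntax [anything] , [VARLISTOmitted]
    fvexpand `anything'
    if "`varlistomitted'" == "" {
        // default: keep the original list in r(varlisto) and return
        // r(varlist) with the base marker removed ("0b." becomes "0.")
        if "`r(varlist)'" != "" return local varlisto "`r(varlist)'"
        local tmp = subinstr("`r(varlist)'","0b.","0.",.)
        if "`tmp'" != "" return local varlist "`tmp'"
    }
    else {
        // option varlistomitted: return the list exactly as fvexpand produced it
        if "`r(varlist)'" != "" return local varlist "`r(varlist)'"
    }

    // pass through the remaining fvexpand results
    if "`r(fvops)'" != "" return local fvops "`r(fvops)'"
    if "`r(tsops)'" != "" return local tsops "`r(tsops)'"

end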

As I said, a stable Stata solution would be nice too.


daniel klein

Join Date: Mar 2014
Posts: 3823

#257

15 Apr 2024, 07:03

Here is a simpler way to do what I believe you want:

Code:

program fvexpandnobase // , rclass
    
    version 18
    
    local fvbase = c(fvbase)
    
    nobreak {
        
        set fvbase off
        
        capture noisily fvexpand `0'
        
        set fvbase `fvbase'
        
    }
    
end

Example

Code:

. sysuse auto
(1978 automobile data)

. fvexpandnobase i.foreign

. return list

macros:
              r(fvops) : "true"
            r(varlist) : "0.foreign 1.foreign"

. 
end of do-file

Dirk Enzmann

Join Date: Apr 2014
Posts: 523

#258

15 Apr 2024, 07:17

To Laura Vossen Engblom :

- Simulation-based residual diagnostic plots for hierarchical (multi-level/mixed) regression models. The DHARMa R package.
- More customizable table1
- Add the do-file as a subwindow to the main stata window with the console. Mainly to skip having to click back the do-file window every time you change programs.
- Add simple examples to each command description, with code AND output.
- Give encode a replace option

I can't say anything about your first point, but:

Tables are extremely customizable, even one-way tables; see -help table_oneway- (and the respective reference manual entry). Admittedly, the use of -collect- is arcane (to me). Note that there is also the SSC command -fre-; perhaps it already does what you want to achieve.

Simply use the keyboard combination <ctrl><2> to jump to the Results window and <ctrl><8> to jump back to the Do-file Editor (see the keystroke alternatives next to the window names in the Window menu). This should work as long as your operating system is not Linux. In Linux this does not work, which would make a good wishlist item for Stata 19.

What counts as "simple" may differ from user to user, but note that the reference manuals provide many command descriptions together with output. Where output is lacking, you can copy the example syntax into your Command window (or do-file); it will run even with the preceding dots.

Why not use the option -gen()- and subsequently drop the original variable with -capture drop originalvarname-?
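
The generate-then-drop route can be sketched as follows. This is only a minimal illustration using the auto data: origin_str and tmp_num are hypothetical names chosen for the example, and the string variable is created with -decode- merely to have something to encode.

```stata
* Sketch: emulate an "encode, replace" by generating, dropping, and renaming.
sysuse auto, clear

* Create a string variable to work with (hypothetical name origin_str)
decode foreign, generate(origin_str)

* encode requires generate(); tmp_num is a hypothetical scratch name
encode origin_str, generate(tmp_num)

* Drop the string original and give the numeric copy its name
drop origin_str
rename tmp_num origin_str

describe origin_str
```

Note that the value label created by -encode- keeps the scratch name (here tmp_num), and that the codes are assigned in alphabetical order of the strings, so they need not match any earlier numeric coding.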

Last edited by Dirk Enzmann; 15 Apr 2024, 07:27.

daniel klein

Join Date: Mar 2014
Posts: 3823

#259

15 Apr 2024, 09:59

Here is an improved version of the program in #257 that optionally respects fvsettings (if the user has set fvbase on):

Code:

*! version 1.0.0  15apr2024
program fvexpandbn , rclass
    
    version 16.1
    
    local c_fvbase = c(fvbase)
    
    nobreak {
        
        set fvbase off
        
        syntax [ varlist(default=none fv ts) ] [ if ] [ in ] [ , FVSET ]
        
        if ( ("`fvset'"=="fvset") & ("`c_fvbase'"=="on") ) {
            
            set fvbase on
            syntax [ varlist(default=none fv ts) ] [ if ] [ in ] [ , FVSET ]
            set fvbase off
            
        }
        
        capture noisily fvexpand `varlist' `if' `in'
        local rc = _rc
        
        set fvbase `c_fvbase'
        
        if `rc' exit `rc'
        
    }
    
    return add
    
end

Here are examples (you may add some assert commands to create a certification script):

Code:

. sysuse auto
(1978 Automobile Data)

.
. // fvset-tings
. fvset base 1 foreign

. fvset base freq rep78

.
. // default behavior; ignore fvset
. fvexpandbn i.foreign i.rep78

. return list

macros:
              r(fvops) : "true"
            r(varlist) : "0.foreign 1.foreign 1.rep78 2.rep78 3.rep78 4.rep78 5.rep78"

.
. // optionally, respect fvsetting
. fvexpandbn i.foreign i.rep78 , fvset

. return list

macros:
              r(fvops) : "true"
            r(varlist) : "0.foreign 1b.foreign 1.rep78 2.rep78 3b.rep78 4.rep78 5.rep78"

.
. // always respect explicit base-settings
. fvexpandbn ib0.foreign ib5.rep78

. return list

macros:
              r(fvops) : "true"
            r(varlist) : "0b.foreign 1.foreign 1.rep78 2.rep78 3.rep78 4.rep78 5b.rep78"

.
. // always respect explicit base-settings
. fvexpandbn ib0.foreign ib5.rep78 , fvset

. return list

macros:
              r(fvops) : "true"
            r(varlist) : "0b.foreign 1.foreign 1.rep78 2.rep78 3.rep78 4.rep78 5b.rep78"

.
end of do-file

.
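
As a rough sketch of the certification script hinted at above, the displayed results can be turned into -assert- statements. This assumes that the program from this post is saved as fvexpandbn.ado somewhere on the adopath and that fvbase is at its default setting (on); the local macro names are just illustrative.

```stata
* Sketch of a certification script; assumes fvexpandbn.ado is on the adopath
sysuse auto, clear
fvset base 1 foreign
fvset base freq rep78

* default behavior: ignore fvset
fvexpandbn i.foreign i.rep78
local vl "`r(varlist)'"
local fv "`r(fvops)'"
assert "`vl'" == "0.foreign 1.foreign 1.rep78 2.rep78 3.rep78 4.rep78 5.rep78"
assert "`fv'" == "true"

* option fvset: respect the fvset base levels
fvexpandbn i.foreign i.rep78 , fvset
local vl "`r(varlist)'"
assert "`vl'" == "0.foreign 1b.foreign 1.rep78 2.rep78 3b.rep78 4.rep78 5.rep78"

* explicit base levels are respected without the option ...
fvexpandbn ib0.foreign ib5.rep78
local explicit "`r(varlist)'"
assert "`explicit'" == "0b.foreign 1.foreign 1.rep78 2.rep78 3.rep78 4.rep78 5b.rep78"

* ... and with it
fvexpandbn ib0.foreign ib5.rep78 , fvset
local vl "`r(varlist)'"
assert "`vl'" == "`explicit'"
```

The results are copied into local macros right after each call so that the checks do not depend on r() surviving intermediate commands.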

Last edited by daniel klein; 15 Apr 2024, 10:02. Reason: hint to -assert- for certification

patricio cuaron

Join Date: Jul 2022
Posts: 6

#260

18 Apr 2024, 09:05

When invoked from the command box of the main UI, `qui do script.do` runs much faster than `do script.do`. There seems to be a small delay (~50-100ms?) between each line printed to the console. This is way too slow in 2024 and hinders productivity when writing/debugging long scripts, especially given the lack of step-by-step execution of scripts for debugging. Eliminating this delay would speed up actual workflows greatly.
1 like

Leonardo Guizzetti

Join Date: Jul 2016
Posts: 2389

#261

18 Apr 2024, 10:29

Originally posted by patricio cuaron

When invoked from the command box of the main UI, `qui do script.do` runs much faster than `do script.do`. There seems to be a small delay (~50-100ms?) between each line printed to the console. This is way too slow in 2024 and hinders productivity when writing/debugging long scripts, especially given the lack of step-by-step execution of scripts for debugging. Eliminating this delay would speed up actual workflows greatly.

There will always be additional delay when having to print output to the screen compared to simply suppressing it. In any of my real work, the amount of time printing to the screen is not noticeable and immaterial compared to more time intensive routines such as fitting models, reshaping data, performing imputations or bootstraps, etc. I would strongly advise that if you have long and time-consuming scripts, perhaps it is worth considering whether they can be refactored into smaller "steps" first or otherwise made more efficient so that you don't need to rerun everything from the top.
1 like

patricio cuaron

Join Date: Jul 2022
Posts: 6

#262

18 Apr 2024, 11:03

Originally posted by Leonardo Guizzetti

There will always be additional delay when having to print output to the screen compared to simply suppressing it. In any of my real work, the amount of time printing to the screen is not noticeable and immaterial compared to more time intensive routines such as fitting models, reshaping data, performing imputations or bootstraps, etc. I would strongly advise that if you have long and time-consuming scripts, perhaps it is worth considering whether they can be refactored into smaller "steps" first or otherwise made more efficient so that you don't need to rerun everything from the top.

Leonardo, the performance of the console is bad. Other consoles print much faster. It is a concern: if you have a script with several replace commands or something similar, it takes forever. It is a usability issue. Of course we can work around it, but at the cost of productivity.

Leonardo Guizzetti

Join Date: Jul 2016
Posts: 2389

#263

18 Apr 2024, 11:08

Originally posted by patricio cuaron

Leonardo, the performance of the console is bad. Other consoles print much faster. It is a concern: if you have a script with several replace commands or something similar, it takes forever. It is a usability issue. Of course we can work around it, but at the cost of productivity.

We can disagree about what is reasonable or a hindrance to productivity.

In your specific example about replace commands, it seems to suggest you have large data, and the bulk of the time is the act of making those changes, not printing to the screen. Can you provide a reproducible example of this?
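
For what it is worth, a reproducible comparison might look something like the sketch below. The data size and the number of iterations are arbitrary assumptions; the data are deliberately tiny so that any difference between the two timers should mostly reflect printing rather than the work done by -replace-.

```stata
* Sketch of a reproducible timing comparison (sizes are arbitrary assumptions)
clear
set obs 1000
generate long x = 0

timer clear

* 1: every replace reports its change in the Results window
timer on 1
forvalues i = 1/2000 {
    replace x = `i' in 1
}
timer off 1

* 2: identical work with the output suppressed
timer on 2
forvalues i = 1/2000 {
    quietly replace x = `i' in 1
}
timer off 2

* timer list shows elapsed seconds and stores them in r(t1) and r(t2)
timer list
display "noisy: " r(t1) "s   quiet: " r(t2) "s"
```

Running this once from the Do-file Editor and once in batch mode would also show whether the difference is tied to the Results window itself.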

Mohammed Essa

Join Date: Sep 2023
Posts: 4

#264

29 Apr 2024, 17:30

One thing that would be huge is having the survey prefix svy directly support predictive models, e.g., lasso or elastic net regression. Currently, this is not supported.

Ariel Linden

Join Date: Apr 2014
Posts: 153

#265

02 May 2024, 13:41

If it hasn't been mentioned already: please, please, please, speed up the mixed models!!! Even with modest sized samples these models take forever to run. One recent analysis took over 3 months to run on a sample size of < 100,000! And, I am running a 12-core MP version with a tremendous amount of memory and hard-drive space available.... So it's not a computer issue.
1 like

Stephen Jenkins

Join Date: Apr 2014
Posts: 1425

#266

03 May 2024, 02:09

Ariel Linden: out of curiosity, are you referring to non-linear mixed models? Or to all, including linear mixed models (i.e., which "me" model)? Are there any specific error structures relevant to know as well? [I've found non-linear mixed models slow to fit in the past (pre-"me") but linear ones not so bad.]
1 like

Ariel Linden

Join Date: Apr 2014
Posts: 153

#267

03 May 2024, 11:27

Hi Stephen,

All models, even -mixed-, run extremely slowly.

As a concrete example, I am currently running a -mixed- model on 22,000 observations and it has been running for 10 days so far. I have optimized the model by specifying the options "difficult", "nostderr", "nolrtest".

I have tried various covariance structures in the past, and none of them appear to be superior in terms of speeding up the model estimation process. My current model is using cov(independent).

Cheers,

Ariel

David ODriscoll

Join Date: Sep 2022
Posts: 15

#268

04 May 2024, 13:56

Latent transition analysis, please.
1 like

Joseph Nwadiuko

Join Date: Nov 2017
Posts: 30

#269

05 May 2024, 00:44

Originally posted by Zvonimir Kulis

Global Moran I test with any weight matrix.
Bayesian comparison approach to choose between SDM and SDEM (The Bayesian comparison approach has been applied successfully in: (1) Firmino Costa da Silva D. , Elhorst J.P., Neto Silveira R.d.M. (2017), Urban and Rural Population Growth in a Spatial Panel of Municipalities, Regional Studies 51(6): 894-908.)

I agree. While I am glad that Stata has integrated spatial econometrics, including for panel data, the lack of econometric tests for frequentist or Bayesian approaches limits its utility. In particular, it would be useful to have spatial Hausman, Lagrange multiplier, and robust Lagrange multiplier tests, etc., to help diagnose the right model.

Another useful feature would be to bring some of the features of reghdfe into spatial regression models, at least the absorb() option, e.g., for when we want to use higher-level or interactive fixed effects.
1 like

Bert Lloyd

Join Date: Apr 2014
Posts: 105

#270

05 May 2024, 11:49

Originally posted by patricio cuaron

When invoked from the command box of the main UI, `qui do script.do` runs much faster than `do script.do`. There seems to be a small delay (~50-100ms?) between each line printed to the console. This is way too slow in 2024 and hinders productivity when writing/debugging long scripts, especially given the lack of step-by-step execution of scripts for debugging. Eliminating this delay would speed up actual workflows greatly.

Are you perhaps using a laptop with an external monitor? I have found this lag to be greater when the Stata window is displayed on the external monitor, especially a 4K monitor. In my experience, the lag is much smaller if the Stata window is displayed on the laptop screen itself.

Another thing you could do is run shorter test versions as usual (do whatever.do) but then run in batch mode once you've worked out bugs in the test version. Presumably there will not be lags from printing to console when running in batch mode.