Wishlist for Stata 19

John Mullahy

Join Date: Dec 2016

Posts: 732
#286

04 Jun 2024, 14:11

If it's technically feasible, it would be valuable for v19 to offer an expanded set of math symbols supported by SMCL in graph text.

Code:

help graph text
Gary Coriott

Join Date: Jun 2024

Posts: 1
#287

07 Jun 2024, 07:10

Kerberos authentication. And a better JDBC command. The current one requires an openly declared username and password.

Dirk Enzmann

Join Date: Apr 2014
Posts: 516

#288

09 Jun 2024, 06:41

Add an option to cmdlog to log the commands of included do-files with the commands of do-files indented using tabs or spaces and/or separating commands of do-files using horizontal lines. Commands of nested do files should be logged, as well. For example:

Code:

sysuse auto, clear
describe foreign
tab1 foreign
* do "/datadisk/Stata/prepare.do"
* ------------------------------------------------------------------------------
    recode foreign (1=0) (0=1), gen(domestic)
    label define domestic 0 "foreign" 1 "domestic"
    label values domestic domestic
    * do /datadisk/Stata/check_recode.do
    * ------------------------------------------------------------------------------
        label list origin
        label list domestic
        tab2 foreign domestic
    * ------------------------------------------------------------------------------
* ------------------------------------------------------------------------------
describe domestic
tab1 domestic

It would also document the commands included in temporary do-files. This would show the commands actually run and would make analyses considerably easier to replicate.


Lili Bulfone

Join Date: Apr 2021

Posts: 11
#289

11 Jun 2024, 20:55

Almost every comparative clinical trial reported in the literature includes a forest plot showing treatment effects in subgroups of patients (such as the one shown in the figure below). While such forest plots can be produced using David Fisher's ipdover ado, it would be extremely valuable to have a straightforward way of generating these graphs from the drop-down menus in Stata. It would be perfect if the graphs could optionally show the p-value for interaction, median survival, the number of events, and perhaps also a test of whether the covariate is prognostic of outcome.
Fernando Furquim

Join Date: May 2014

Posts: 20
#290

13 Jun 2024, 10:12

Perhaps there's a good reason for this limitation, but it would be convenient if the over() option worked with line graphs.
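
For readers wanting one line per group today, a minimal sketch of two common workarounds follows. It assumes the goal is a separate line for each level of a grouping variable; the auto dataset and the variables mpg, weight, and foreign are stand-ins chosen only for illustration.

```stata
* Stand-in data: the auto dataset shipped with Stata (an assumption for illustration).
sysuse auto, clear
sort foreign weight

* Workaround 1: overlay one line per level of the grouping variable
twoway (line mpg weight if foreign == 0) ///
       (line mpg weight if foreign == 1), ///
       legend(order(1 "Domestic" 2 "Foreign")) ytitle("Mileage (mpg)")

* Workaround 2: one panel per level using by()
twoway line mpg weight, by(foreign)
```

The overlay form needs one plot per level, which is exactly the repetition an over() option for line graphs would remove.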
Nikita Zakharov FreiburgUNI

Join Date: Jun 2024

Posts: 2
#291

14 Jun 2024, 08:36

Dear Statalist users,

There is a great paper on Granular Instrumental Variables (GIV) by Gabaix, X., & Koijen, R. S. (2020, forthcoming in JPE), offering a novel way of identifying causal effects. I am looking for Stata code or a command to implement it for my research, but so far I have found only Python code, whose correctness I cannot judge. Have any of you already encountered the GIV procedure, and can you perhaps point me to Stata code that I could reuse? Much appreciated in advance!
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2365
#292

14 Jun 2024, 08:40

Originally posted by Nikita Zakharov FreiburgUNI (#291)

This would have been better as its own post, as this thread is intended for users to discuss new features for Stata.

At the risk of stating the obvious, if the linked paper is only now appearing in print, it's unlikely that anyone but the authors has implemented the method in Stata. If you need this method, you might implement it yourself or use Python from Stata.
Nikita Zakharov FreiburgUNI

Join Date: Jun 2024

Posts: 2
#293

14 Jun 2024, 08:46

Dear Leonardo Guizzetti,

Thank you for your quick response! The original paper indicates that many new studies are adopting this approach, so I think people have already figured out how to code it correctly. I would only save time and avoid potential errors if I could reuse some standard code. Or even better: if there is a user-written command in Stata that I am unaware of. But I see your point! Perhaps I should start a separate thread.
Bruce Weaver

Join Date: May 2014

Posts: 1109
#294

14 Jun 2024, 15:58

Hello Lili Bulfone. I want to check that I understand correctly the figure you showed in #289. I think the effects you plotted come from 4 distinct models, as follows:
Overall effect: Treatment (Cetuximab vs Best Supportive Care) is the only variable in the model.

ECOG (0/1 v 2): Treatment, ECOG, and Treatment*ECOG are the only terms in the model.

Age group (<65 vs 65+): Treatment, Age Group, and Treatment * Age Group are the only terms in the model.

Sex (F vs M): Treatment, Sex, and Treatment * Sex are the only terms in the model.

Have I understood correctly what your plot shows? If not, please clarify.

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)
Lili Bulfone

Join Date: Apr 2021

Posts: 11
#295

14 Jun 2024, 21:30

Originally posted by Bruce Weaver (#294)

Hi Bruce,
Thanks for showing interest in my suggestion. The diagram I showed is from the publication of a trial (Jonker et al. Cetuximab for the Treatment of Colorectal Cancer. NEJM. 2007;357:2040–2048; https://www.nejm.org/doi/full/10.1056/NEJMoa071834), so I am not 100% sure that this is what they did, but my understanding aligns with yours:
The first set of analyses is the overall treatment effect from a Cox model (assuming the proportional hazards assumption is not violated) fit to the survival data, i.e., as might be generated using the command 'stcox trt' after declaring the data to be survival data. No test for interaction is reported because there is only one variable in the model. The median survival might be generated using the command 'stci, by(trt)' or something equivalent.

The second set of analyses reports the hazard ratios for trt by ECOG status (as a binary variable of 0/1 [0] vs 2 [1]), which might be generated using the command 'stcox trt if ECOG==0' and then 'stcox trt if ECOG==1' (however, one could do this with three levels if ECOG is recorded as 0, 1, 2). The test for interaction (looking at whether the effect of treatment is modified by ECOG status) would be generated by fitting 'stcox i.trt##i.ECOG' and then running 'testparm i.trt#i.ECOG', and median survival for each combination of treatment and ECOG would be generated using 'stci, by(trt ECOG)'.

The third and fourth sets of results use the same approach as described for ECOG (so that the only terms in the model are trt and age, and then trt and sex).
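
The steps described above can be put together as a short do-file. This is only a sketch under stated assumptions: the trial data are not available, so the drugtr example dataset (already stset) stands in for them, with drug playing the role of trt and a made-up binary variable older playing the role of the subgroup factor. The cut-point of 56 is arbitrary.

```stata
* Stand-in data (assumption): drugtr from Stata's web examples, already stset.
* drug plays the role of trt; older is a hypothetical binary subgroup variable.
webuse drugtr, clear
generate byte older = age >= 56
label define older 0 "<56" 1 "56+"
label values older older

* Overall treatment effect and median survival by arm
stcox i.drug
stci, by(drug)

* Subgroup-specific hazard ratios, one subgroup at a time
stcox i.drug if older == 0
stcox i.drug if older == 1

* Test for treatment-by-subgroup interaction
stcox i.drug##i.older
testparm i.drug#i.older

* Median survival for each treatment-by-subgroup combination
stci, by(drug older)
```

Each subgroup factor would repeat the last three blocks, and the resulting hazard ratios, interaction p-values, and medians are what the requested forest plot would need to collect.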
Bruce Weaver

Join Date: May 2014

Posts: 1109
#296

15 Jun 2024, 10:23

Thank you for sharing the article link, Lili Bulfone. This excerpt is the only thing I have found so far describing what is shown in Figure 2:

In a planned subgroup analysis, no significant differences in the relative benefit of cetuximab were seen across subgroups defined on the basis of ECOG performance status at baseline, age, or sex (Figure 2).

After reading that, I suspect we have both guessed correctly what was done, but I am still not 100% sure. And if we are correct, I would have described this as a series of subgroup analyses. ;-)

Lili Bulfone

Join Date: Apr 2021

Posts: 11
#297

17 Jun 2024, 05:35

Hi again Bruce,

Originally posted by Bruce Weaver (#296)

Like you, I'm not 100% sure, but I am only a smidge off 100% that you are correct and that it is a SERIES of subgroup analyses being presented. Regulatory guidelines for the analysis of treatment effects in clinical trials of medicines typically suggest investigating whether the magnitude of the treatment effect is consistent or substantially different among patient subgroups. It is recommended that differences between subgroups be examined one factor at a time and, ideally, presented with the results of an appropriate test for interaction. Personally, I also like to see some reporting of absolute effect (e.g., median survival or the proportion of patients with the event of interest, as was done in the third and second examples below, respectively) to allow some insight into whether the factor is a prognostic variable (e.g., survival is better in a subgroup vs the complement regardless of the treatment, even though, on the relative scale, the treatment effect is the same or, more accurately, is not proven to be different).

In case it is helpful, and to show just how ubiquitous these types of subgroup analyses are, here are 3 figures from 3 highly regarded medical journals showing these types of analyses in just the last week. The first one is from The Lancet (https://www.thelancet.com/journals/l...736(24)00921-8), the second one is from the New England Journal of Medicine (https://www.nejm.org/doi/full/10.1056/NEJMoa2404245), and the third one is from Blood (https://ashpublications-org.virtual....referable-over). As I said in my original post, there is a way to generate these graphs using David Fisher's ipdover ado, but it is a bit of faffing around.

I am sure I'm not the only lazy person out there who wants to be able to press a button after running commands like 'stcox trt if sex==0' and then 'stcox trt if sex==1' and see the plot of the hazard ratios, the results of the test for treatment effect modification, and median survival or the proportion of patients with the event in each subgroup 😂.

Thanks again Bruce for being interested.

Fahad Mirza

Join Date: Sep 2018

Posts: 237
#298

20 Jun 2024, 01:16

Request for developers at Stata

When we use fonts that contain icons in a visualization, for example a country flag font (https://www.fontspace.com/flags-color-world-font-f41090), Stata overrides the colors of the flag with the default color. Is there a way to display the original colors of the font? If not, it would be nice to consider this for a future update.

Currently it looks like the image below:

Code used after installing font:

Code:

twoway (scatteri 78 74 `"{fontface flags color world:u}"', ms(i) mlabsize(50))
wbuchanan

Join Date: Mar 2014

Posts: 1361
#299

20 Jun 2024, 08:42

Originally posted by Silvia Gib

I really hope that Stata 19 will let us import parquet files; they are everywhere now. The only solution I found is a user-written command that only works on Linux.

Silvia Gib, I gave a talk several years ago on a Stata command that can ingest parquet files (see https://github.com/wbuchanan/readit). Brian Quistorff submitted a pull request not too long ago that extends the functionality to write parquet files as well.
wbuchanan

Join Date: Mar 2014

Posts: 1361
#300

20 Jun 2024, 08:52

1. Please please improve the performance of -reshape- or update the C API that -gtools- uses. With a moderately large number of variables to reshape and a large N, -greshape- will throw an error referencing an issue in the C API that limits the number of observations that can be referenced/assigned and reshape can easily take more than a day (one time it took roughly 31 hours for a task).
2. Improve the performance of the order command. When used with large N and large K, it can take multiple hours. I wonder if it would be possible to use pointers to the underlying variables to update the ordering without having to adjust the memory addresses of the data (I'm assuming that is what is happening under the hood, given the amount of time it was taking).