
  • I updated to version 18 a month ago and I like it but there are a few basic things I'm kind of shocked aren't already implemented.
    All of these requests are absolutely basic.
    If any of these are implemented already, please let me know how to do them. Thank you!
    Here's my short list:

    1) When importing a spreadsheet:
    - cannot ask for it to be transposed
    - cannot specify rows to be variable labels or variable notes as opposed to being data
    - in other words multiple columns of variable annotations
    - I'm trying to import gene expression data with 30K genes and what am I supposed to do, enter 30K labels by hand?

    2) When coloring by a variable, it's always treated as numeric. If it's categorical, why can't the legend use value labels?

    3) When editing using the data editor, there's no undo. In fact there's no undo generally in Stata. That seems like it would have been an oversight in Stata 1; how did we get to 18 without it?



    • what am I supposed to do, enter 30K labels by hand?
      Gregory Grant You can still do this (and the transpose) programmatically, though you need more steps than the import command. If you make a new thread and ask this as a question with a -dataex- example, I bet you'd get code quickly. Or you can search the forums since this has been asked about before.
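A minimal sketch of the programmatic route mentioned above, assuming the spreadsheet has already been read in as all-string data with variable names in row 1 and variable labels in row 2. The toy data, gene names, and labels are made up for illustration; with a real file, something like import excel with the allstring option would take the place of the input block.

```stata
clear
* Toy stand-in for an imported sheet: row 1 = names, row 2 = labels, rest = data
input str12 v1 str12 v2 str12 v3
"gene_a" "gene_b" "gene_c"
"Alpha gene" "Beta gene" "Gamma gene"
"1.5" "2.0" "0.7"
"3.1" "4.2" "1.9"
end

foreach v of varlist _all {
    local newname = strtoname(`v'[1])   // make row 1 a legal Stata name
    local lbl = `v'[2]                  // row 2 holds the label text
    label variable `v' `"`lbl'"'
    rename `v' `newname'
}

drop in 1/2            // remove the two annotation rows
destring, replace      // convert the remaining strings to numeric
describe
```

The same loop should scale to thousands of columns, since nothing is typed by hand. For purely numeric data, the official xpose command handles the transpose step.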



      • Originally posted by Daniel Schaefer

        Gregory Grant You can still do this (and the transpose) programmatically, though you need more steps than the import command. If you make a new thread and ask this as a question with a -dataex- example, I bet you'd get code quickly. Or you can search the forums since this has been asked about before.
        Thank you for this quick reply, I will post the questions directly. But why on earth would they make something this basic that hard to do? It doesn't make sense to me.



        • My request concerns the examples in the windows that pop up when one types help command. Take help regress as an example. Currently, the first part of the examples section looks like this:

          Setup
          . sysuse auto

          Fit a linear regression
          . regress mpg weight foreign

          Fit a better linear regression, from a physics standpoint
          . gen gp100m = 100/mpg
          . regress gp100m weight foreign

          Obtain beta coefficients without refitting model
          . regress, beta

          Suppress intercept term
          . regress weight length, noconstant

          Model already has constant
          . regress weight length bn.foreign, hascons

          I wish it looked like this so that users could copy, paste, and execute the code with no need to edit anything first:

          Code:
          // Setup
          sysuse auto, clear
          
          // Fit a linear regression
          regress mpg weight foreign
          
          // Fit a better linear regression, from a physics standpoint
          generate gp100m = 100/mpg
          regress gp100m weight foreign
          
          // Obtain beta coefficients without refitting model
          regress, beta
          
          // Suppress intercept term
          regress weight length, noconstant
          
          // Model already has constant
          regress weight length bn.foreign, hascons
          Notice too that I added a clear option to the sysuse command and replaced gen with generate. Although gen is probably clear enough, I think it is good practice to spell out commands fully for the benefit of Stata newbies at least. E.g., I remember one occasion where I had a heck of a time figuring out that di meant display. YMMV, of course.

          Thanks for considering.
          --
          Bruce Weaver
          Version: Stata/MP 18.5 (Windows)



          • It would be great to have a Polars integration in Stata, as there is for Python and R. Stata is a pain with large datasets. Gtools helps but covers only a few commands.
            Last edited by Henry Strawforrd; 19 Jan 2024, 03:56.



            • I am a big fan of the bulk selection methods in Stata e.g.
              Code:
              tab *
              Code:
              keep s*
              But this doesn't work when I want to tab variables whose names start/end/contain a certain character. My dream is a wild one that doesn't work:

              Code:
              tab ^a*
              This applies to other types of analyses and wrangling and cleaning names/labels as well. I have to switch to R to do the regex processing (the stringr and stringi packages are simply euphoric). And it'd be euphoric to have such features in Stata 19.
              Last edited by Sonnen Blume; 20 Jan 2024, 08:44.



              • I don't understand what is being requested in #201. It seems that what is a fact about the -tab- command is being confused with the workings of wildcards in Stata varlists.

                -tab- takes either one or two variables in its varlist, that is all. -tab a*- will, in fact, provide a cross tab of the two variables in the data set whose names start with a, if there are exactly two of them. If there is only one, you get a tabulation of that. If there are more than 2, then the command will fail because -tab- only allows two variables.

                If what is wanted is to be able to get a series of tabulations of the single variables all of which begin with a, that can easily be done, but with -tab1-, not -tab-.
                Code:
                tab1 a*
                will do that. To get a series of tabulations of the single variables all of which contain a somewhere, that's -tab1 *a*-. And for variables ending in a, -tab1 *a- does it.

                No need for regular expressions or resort to Python or other software to do this kind of thing.
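For name patterns that wildcards genuinely cannot express (a character class, say), one possible sketch stays inside Stata by looping over the variable names and filtering them with regexm(). The auto dataset and the pattern here are chosen only for illustration.

```stata
sysuse auto, clear

* Collect every variable whose name matches a regular expression
local pattern "^[mr]"         // illustrative: names starting with m or r
local matches
foreach v of varlist _all {
    if regexm("`v'", "`pattern'") {
        local matches `matches' `v'
    }
}

display "`matches'"
tab1 `matches'
```

The resulting macro can be passed to any command that accepts a varlist, not just -tab1-.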



                  • A simple request for a convenience feature: the ability to see line numbers in the Viewer, perhaps as a toggle somewhere in the menu. Sometimes I need a quick way to look at a file online:

                  Code:
                  view "http://somesite.com/data.csv"
                  Currently I have to download a temporary copy and then open it in the Do-file Editor (which doesn't support web files: error 632).



                  • Also a small request:

                    Have an option added to -export delimited- which allows for explicit printing of missing values to the output. The only behaviour right now (for system missing) is to print nothing, leaving two consecutive delimiters. Precedent for this exists in the older -outfile-, which will print a period for system missing values. A nice extension of this behaviour would be to allow a user-specified value for those missing values (e.g., -999).

                    This is useful for the minority of users that need to export datasets for use in programs like Mplus, which require explicit values/missing characters for every value.
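Until such an option exists, one possible workaround is to recode missing values just before exporting and then restore the data. This is only a sketch: the auto dataset, the output file name, and the -999 code are illustrative, and -mvencode- will stop with an error if the chosen code already occurs in a variable.

```stata
sysuse auto, clear

preserve
* Recode numeric missing values to an explicit code; string variables are ignored
mvencode _all, mv(-999)
export delimited using "auto_explicit_missing.csv", replace
restore
```

Because of the preserve/restore pair, the data in memory keep their original missing values after the export.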



                    • I would really need a command that gives me the coordinates of the edges of an object within a graph. For example, if I have a linear function with an upward slope, I would want to get the coordinates that the line occupies. This would radically increase the options to customize graphs.



                      • Originally posted by Ariel Linden
                        I would very much like to see:

                        (1) a suite of official machine learning tools (there are several user-written commands but only lasso and npregress are official commands, and they certainly don't represent the current standard).

                        (2) a way to speed up mixed models (all of them). I have projects in which mixed (which is certainly the "fastest" of the bunch) has taken 12 WEEKS (yes, weeks!) to complete. That's ridiculous. And I am using an extremely expensive version of Stata for 12-cores! The slowness of multilevel models leads to bad practices. If it takes me 12 weeks to run a single model using -mixed-, I may use -mixed- for a binary outcome because I know that using -melogit- may take twice as long. I am not sure why this cannot be performed as a parallel process to speed it up, or some other mechanism...

                        Fingers crossed!

                        Ariel
                        You get very little or no benefit from running -mixed- on multicore processors because the command hasn't been "parallelized". See https://www.stata.com/statamp/perfor...ort/report.pdf. But your point also brings up a related topic, which is the requirement to pay additional money for the privilege of running Stata MP on 4- or 8-core machines, which are just consumer-level, basic computers today. If someone was paying extra for a software version that would maximize the power, of say, a 64-core processor (which seems fairly exotic to me), I'd understand it. But 4 or 8 cores are in very basic computers today. There shouldn't be a premium for using all of the computing capacity of a consumer level device.



                        • #206 raises questions on several levels. We all know that StataCorp is a business and can have views on what is and what is not reasonably priced -- and that boils down to what we want and whether we or our employers are willing to pay.

                          But Anthony's post is seriously and factually wrong in important details. Stata/MP, for example, is about very much more than supporting more processors explicitly. You are getting support for much bigger datasets and much else. Look through

                          Code:
                          help limits 
                          to see many important differences. Coding to support those bigger datasets is immensely more challenging than just changing some constant within internal code. You're paying for what you get.



                          • Also in line with #207, we can all think of some alternative system that doesn't license by core (only physical CPU) but the costs are prohibitively greater for most. I would not wish to see the change suggested if it meant that Stata becomes more expensive.



                            • The X-13ARIMA-SEATS seasonal adjustment program and a built-in program for seasonal adjustment. Can't believe there is none!



                              • I know I'm just dreaming here but:

                                A do-file editor that underlines things like syntax mistakes or unrecognized commands in red. Better if it could make syntax suggestions in a tooltip on mouseover. Trace is good, but I'd love breakpoints with more tooltips showing me the current content of a macro or scalar on mouseover. Also, it would be great if I could manipulate and inspect state with the console while stopped on a breakpoint. Oh, and line numbers.

                                I've been spoiled by fully featured IDEs (and MATLAB) but I honestly love these features and would be ecstatic if they were (even partially) incorporated in the do-file editor. It doesn't have to be Visual Studio, but what about Visual Studio Code? (Please don't make me write my own VSC plugin).

