  • Strange scalar behavior in Stata 18.5

    Hello all, I am getting strange behavior from scalars in Stata 18.5 that I cannot explain, and I am hoping someone can help.

    I type something like:
    Code:
    scalar a = 1
    scalar b = -3
    scalar c = a - b 
    scalar d = c + 1 
    
    di a 
    di b
    di c
    di d 
    scalar list
    In StataMP 17.0, which I run on my personal computer, I get the following:

    Code:
    . di a
    1
    
    . di b
    -3
    
    . di c
    4
    
    . di d
    5
    
    . scalar list
             d =          5
             c =          4
             b =         -3
             a =          1
    That is, I get exactly what I expect. However, when I run the same code in StataNow/SE 18.5 (which I run on my work computer), I am getting:
    Code:
    . di a
    1
    
    . di b
    -3
    
    . di c
    0
    
    . di d
    0
    
    . scalar list
             d =          1
             c =          4
             b =         -3
             a =          1
    So there are two weird things here. First, -display- doesn't show the same contents for scalar c that -scalar list- shows. Second, the operations creating scalars c and d do not seem to be working correctly.

    Can anyone explain what might be going on? Thank you very much in advance!

  • #2
    I cannot replicate your problem on my setup:
    Code:
    . about
    
    StataNow/MP 18.5 for Windows (64-bit x86-64)
    Revision 18 Dec 2024
    Copyright 1985-2023 StataCorp LLC
    
    Total physical memory:       32.00 GB
    Available physical memory:   19.05 GB
    
    Stata license: Single-user 4-core , expiring 28 Apr 2025
    Serial number: REDACTED
      Licensed to: Clyde Schechter
                   Albert Einstein College of Medicine
    
    .
    . scalar a = 1
    
    . scalar b = -3
    
    . scalar c = a - b
    
    . scalar d = c + 1
    
    .
    . di a
    1
    
    . di b
    -3
    
    . di c
    4
    
    . di d
    5
    
    . scalar list
             d =          5
             c =          4
             b =         -3
             a =          1
    
    .
    end of do-file
    
    .
    On your work computer, restart your computer, and make sure that your Stata 18.5 is fully updated by running -update all, force-. If that does not solve the problem, I would uninstall Stata 18.5, reinstall it, and update that. If that does not work, then I think you need to contact Stata Technical Support.


    • #3
      I can't replicate in v18. I get the correct results.


      • #4
        What variables do you have in your datasets? Two rules could easily be biting here.

        1. Scalars and variables share the same name space.

        2. Forced to choose whenever either a scalar or a variable would make sense, Stata will work with the variable of that name.

        Yet further

        3. Unless you have disabled variable name abbreviation, that is also relevant.

        4. Asked to display a variable, whether by accident or design, Stata will use or show its value in the first observation.

        Rule 4 often comes as a big surprise even to experienced users of Stata.

        Although not a gambler of any kind, I am confident that this can be explained by checking variables in each dataset.
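
        The four rules above can be reproduced in a few lines. This is only a sketch under stated assumptions: the variables coretrt and disability are hypothetical stand-ins for whatever is in the dataset, chosen so that c and d are unambiguous abbreviations of them, and variable name abbreviation is switched on explicitly (it is on by default).

        ```stata
        clear all
        set varabbrev on          // the default; made explicit for the sketch
        set obs 3
        generate coretrt = 0      // hypothetical variable; "c" abbreviates it
        generate disability = 0   // hypothetical variable; "d" abbreviates it

        scalar a = 1
        scalar b = -3
        scalar c = a - b          // no variable matches a or b, so this is 1 - (-3) = 4
        scalar d = c + 1          // here c is read as coretrt[1] = 0, so d = 1

        display c                 // shows coretrt[1], that is 0
        display d                 // shows disability[1], that is 0
        scalar list               // the scalars themselves hold c = 4 and d = 1
        ```

        With no variables in memory, the same commands give c = 4 and d = 5.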


        • #5
          Three posts in 3 minutes! They are likely to be entirely consistent. Clyde and George have no variables (with conflicting names).


          • #6
            Nick Cox, you are correct: I have variables named coretrt and disability, so what I thought were scalars c and d were in fact those variables. I had not loaded that dataset on my personal machine, so the behavior was not replicated there. I was unaware that display would choose the variables. Thank you very much, that was extremely confusing and was indeed a surprise! Clyde and George, thank you as well for taking the time to check this.


            • #7
              See also Stata Tip 31: Scalar or Variable? The Problem of Ambiguous Names and its reference to documentation.

              You can insist on the scalar with scalar().
              Last edited by Nick Cox; 11 Feb 2025, 09:16.
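
              A minimal sketch of scalar() in use, assuming a hypothetical dataset in which coretrt is the only variable whose name starts with c. Wrapping the name in scalar() tells Stata to look only among scalars, both in -display- and inside expressions.

              ```stata
              clear all
              set varabbrev on           // the default; made explicit for the sketch
              set obs 3
              generate coretrt = 0       // hypothetical variable; "c" abbreviates it

              scalar c = 4
              display c                  // 0: the variable coretrt wins
              display scalar(c)          // 4: the scalar is requested explicitly

              scalar d = scalar(c) + 1   // uses the scalar, not coretrt[1]
              display scalar(d)          // 5

              set varabbrev off          // with abbreviation disabled, c no longer matches coretrt
              display c                  // so this should now find the scalar
              set varabbrev on
              ```

              Disabling abbreviation only removes clashes through abbreviation. A variable named exactly c would still take precedence, so scalar() is the safer habit.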


              • #8
                Funny. I plopped the scalars into an existing dataset and it gave me back all sorts of stuff. I then ran "clear all" and redid it to get the right results. I should have recommended "clear all", but Nick actually explained what's what.


                • #9
                  Nick's points are excellent.

                  I missed thinking about the possibility of a name clash because this kind of thing seldom arises in my workflow. All of my do-files begin* with -clear*- because I think it's always a good idea to start every computation with a "blank slate," so there is no interference from extraneous programs or data. It's just another aspect of information hiding, which is, I think, one of the most important principles in coding.

                  That said, I have never understood why StataCorp chose to have scalars and variables share the same namespace. Accordingly, I endorse daniel klein's post about this issue in the Wish List for Stata 19 thread (https://www.statalist.org/forums/for...tata-19/page28).

                  * Well, I mean that -clear*- is the first command that actually does something to active memory. Literally, my do-files start with commands to open a log file and set the version number. -clear*- comes right after that.
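
                  The do-file opening described above might look something like the following. It is a sketch, not the poster's actual file: the log name example_run and the version number are assumptions made for illustration.

                  ```stata
                  capture log close                    // close any log left open by an earlier run
                  log using example_run, replace text  // hypothetical log file name
                  version 18                           // assumed; use the version the code was written under

                  clear *                              // same as clear all: data, scalars, programs and more are dropped

                  * ... commands that do the real work go here ...

                  log close
                  ```

                  Because -clear *- empties memory before any scalar is defined, no leftover variable can capture a scalar name later in the file.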


                  • #10
                    From another point of view: I am reflecting on why in practice this doesn't usually bite me.

                    I guess it's because I rarely use scalars, except under temporary names, and if I do, I wouldn't use a name that was already in use for a variable. Scalars would have some descriptive name that would make less sense for a variable.

                    Even more important in principle is using local macros where some people might be tempted to use scalars.

                    If I had found myself being bitten by this, I would have made myself adopt some personal rule, such as always call scalars scalar_a or scalar_b or whatever.

                    That is really not intended to seem smug or patronising, just to reflect on practice in a way that might help.

                    The psychological or social dimension here might often lie in what other software or languages people have used before, or at least use often as well as using Stata. Once you have learned that

                    Code:
                    a = 1
                    is illegal in Stata (as compared with Mata) and that

                    Code:
                    gen a = 1
                    is wasteful in Stata then the next congenial step may be

                    Code:
                    scalar a = 1
                    Last edited by Nick Cox; 12 Feb 2025, 09:09.
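
                    Two of the habits mentioned in this post, temporary names and local macros, can be sketched as follows. The variable coretrt is a hypothetical stand-in for a variable that would otherwise capture the name c.

                    ```stata
                    clear all
                    set obs 3
                    generate coretrt = 0      // hypothetical variable; "c" abbreviates it

                    * Habit 1: a scalar under a temporary name
                    tempname c                // local macro c now holds a name generated by Stata
                    scalar `c' = 4
                    display `c' + 1           // 5: the generated name does not match coretrt

                    * Habit 2: a local macro instead of a scalar
                    local k = 4               // macros are not in the variable/scalar namespace
                    display `k' + 1           // 5
                    ```

                    Temporary names disappear when the do-file or program ends, so they also avoid leaving stray scalars behind.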


                    • #11
                      I was thinking along the same lines as Nick, and for a quick and dirty experiment, I decided to check the possible time savings from using scalars vs. locals. I wrote a loop of length 1e6 that assigned a random number to a local, and then added that value to a sum stored in a local, and then did the same computation with scalars. The local version took about 45 sec, and the scalar version took about 30 sec. Most times when I've used scalars, I've only done a few computations rather than millions, so I'd suspect there are no plausible use cases in which the time Stata spends dereferencing a local will matter.
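
                      The original timing code was not posted, so the following is only a guess at what such an experiment could look like, using -timer- to compare the two loops. The names n, x, sum, s_x and s_total are invented for this sketch, and timings will differ by machine.

                      ```stata
                      clear all                 // no variables in memory, so scalar names cannot clash
                      set seed 12345
                      local n = 1e6             // loop length from the post; lower it for a quicker run
                      timer clear

                      * Version 1: running sum held in a local macro
                      timer on 1
                      local sum = 0
                      forvalues i = 1/`n' {
                          local x = runiform()
                          local sum = `sum' + `x'
                      }
                      timer off 1

                      * Version 2: running sum held in a scalar
                      timer on 2
                      scalar s_total = 0
                      forvalues i = 1/`n' {
                          scalar s_x = runiform()
                          scalar s_total = s_total + s_x
                      }
                      timer off 2

                      timer list                // timer 1 = locals, timer 2 = scalars, in seconds
                      ```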
