
  • Stop Stata from recognising a variable by part of its name

    I searched for a similar question but couldn't find one, so I am posting a new one here. If I have a variable called "variable1" and run the command "sum variable", Stata shows the results of "sum variable1", as long as "variable1" is the only variable beginning with "variable" so that there is no ambiguity. However, I would like to disable this behaviour so that Stata returns results only if every variable in a command is spelled out in full. Could anyone please let me know how to change this default setting?

  • #2
    Code:
    help set varabbrev
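
    For illustration, a minimal sketch of what the setting does. It assumes the auto dataset that ships with Stata, and pri is just an example abbreviation of price, not something from the original question.

    ```stata
    sysuse auto, clear

    * With varabbrev on (the default), pri is accepted as an abbreviation of price
    set varabbrev on
    summarize pri

    * With abbreviation off, the same command fails: no variable is named pri
    set varabbrev off
    capture noisily summarize pri
    display _rc              // nonzero return code signals the error

    * The full name always works
    summarize price

    * To keep the setting across sessions, add the permanently option:
    * set varabbrev off, permanently
    ```

    The capture noisily prefix is only there so that a do-file continues past the expected error.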



    • #3
      Jacky Zh: When I first started using Stata, I, too, sought to disable the recognition of variables by partial names. But be careful. As you work with Stata, you may end up working with code written by others that relies on this feature of Stata. You may end up getting strange error messages thrown by programs that your code calls, and have difficulty understanding what is going on.

      Over the years, I have gotten used to Stata's acceptance of shortened variable names, and have even come to appreciate its convenience when working with other people's data sets that have very long variable names that are annoying to type. And, fortunately, Stata is very careful never to accept a shortened variable name if there is any ambiguity as to which variable is being referred to, so it is perfectly safe to use.



      • #4
        "And, fortunately, Stata is very careful never to accept a shortened variable name if there is any ambiguity as to which variable is being referred to, so it is perfectly safe to use."
        But, variables and scalars share name space ...
        Code:
        set varabbrev off , permanently
        should be default.



        • #5
          "But, variables and scalars share name space"
          True, and I agree that this is poor software design. But -set varabbrev off- does nothing about this:

          Code:
          . clear*
          
          . sysuse auto, clear
          (1978 Automobile Data)
          
          . set varabbrev off
          
          . scalar mpg = 999
          
          . display mpg
          22
          
          . display scalar(mpg)
          999
          Had I been the creator of Stata, it would not have allowed variable name abbreviation, and it would not have scalars and variables sharing the same namespace (especially since Stata allows variable names to be used in environments, like -display-, that are inherently oriented towards scalars). But that is not what StataCorp did, and at this point we are out to version 16.1. The best designed Stata programs avoid these issues by avoiding overloading any names (often by using -tempnames-, which are guaranteed to be unique). But not all do, and it just bears noting that if you -set varabbrev off- you cut yourself off from some existing Stata programs.
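
          One way to live with both camps, sketched here under assumptions: keep -set varabbrev off- for your own work, and switch abbreviation back on only around code that relies on it, restoring the previous setting afterwards. The -summarize pri- line is a hypothetical stand-in for a program that uses abbreviated names; old_va and rc are illustrative local names. The sketch is meant to be run as a do-file.

          ```stata
          sysuse auto, clear
          set varabbrev off                   // preferred setting for your own code

          * Remember the current setting ("on" or "off") before changing it
          local old_va "`c(varabbrev)'"

          set varabbrev on
          * Stand-in (hypothetical) for a program that relies on abbreviation
          capture noisily summarize pri
          local rc = _rc

          * Restore the earlier setting whether or not the call succeeded
          set varabbrev `old_va'
          if `rc' exit `rc'
          ```

          Capturing the return code before restoring the setting means an error in the called program does not leave abbreviation switched on by accident.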



          • #6
            Abbreviation of variable names and indeed of command names divides even experienced Stata users. I adopted it early on as a feature in my personal habits, but respect the argument that abbreviation is a bad idea. But what every user is likely to need to know if they stick around is that many people use abbreviation. If set varabbrev off breaks a script, you get to hear about it.



            • #7
              Thanks a lot, Nick, Clyde, Bjarte! Now I understand the choices to deal with this issue and the trade-off I need to make. I really appreciate your answers!



              • #8
                Originally posted by Clyde Schechter
                [...] often by using -tempnames-, which are guaranteed to be unique
                Unfortunately, temporary names are not guaranteed to be unique.

                Code:
                . capture program drop foo
                
                . 
                . program foo
                  1.     tempname x
                  2.     display "`x'"
                  3.     scalar `x' = 73
                  4.     display `x'
                  5.     display scalar(`x')
                  6. end
                
                . 
                . clear
                
                . input __000000
                
                      __000000
                  1. 42
                  2. end
                
                . 
                . display __00000
                42
                
                . 
                . foo
                __000000
                42
                73
                
                . 
                end of do-file
                It is just unlikely that someone has created variables (or scalars) with names like __000000. I also believe that StataCorp warns against using names that start with underscores, as those are often used for "internal" variables; I cannot remember where this warning is issued in the manuals.

                Personally, I find abbreviating variable names (and commands) convenient, especially when I work in the command line. However, the use of wildcards (~, *, ?) actually makes abbreviating variable names almost unnecessary; I do not see much difference between typing f and f~ when you want to refer to foo.
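
                A short sketch of the three wildcards, assuming the auto dataset that ships with Stata (the variable names below come from that dataset, not from this thread). Whether each wildcard is accepted under -set varabbrev off- is worth checking in -help varlist- for your version.

                ```stata
                sysuse auto, clear

                summarize for~       // ~ must match exactly one variable (here: foreign)
                describe t*          // * matches zero or more characters (trunk, turn)
                describe ?pg         // ? matches exactly one character (mpg)
                ```

                Unlike a bare abbreviation, each of these makes it visible in the code that the name is not spelled out in full.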
                Last edited by daniel klein; 29 Dec 2020, 08:17.



                • #9
                  Interesting. Thanks for pointing that out!



                  • #10
                    Interesting discussion. I personally avoid this problem by keeping my variable names as short as possible while maintaining proper variable labels. There is no need to have long variable names if we use variable labels properly.

