  • Is there a way to use the [if] syntax without checking that the named variable exists in the currently loaded dataset?

    The built-in command use can condition on an if or in qualifier when loading a new dataset. It would be nice to have this syntactic convenience available when writing user programs that pass the qualifier through to the use command, but unless I'm wrong, it appears that specifying [if] in syntax will always check whether the conditioning variable exists in the current dataset (or, in the case of [in], whether the range is out of bounds for the current dataset).

    Is this check baked into the if syntax and is there any way around it? I know I could hack similar behavior with an option(string) or something, but the if/in syntax is just so convenient.

  • #2
    I don't understand what you want. Why would you want to be able to specify an -if- condition that involved a non-existent variable? And given that the variable does not exist, how would you want Stata to interpret the condition, as true or as false? None of this makes sense to me. I suspect I'm missing something.


    • #3
      It appears you believe that Stata executes any if clause or in clause independently of the command it is attached to, so that the if or in would refer to the data in memory rather than the data in the using dataset.

      It is my belief from reading the output of help syntax that the if and in clauses are parsed and passed to the command to interpret appropriately.

      I'm short of time at the moment, but writing a simple program that does nothing other than display "hello, world", but includes an optional if clause in its syntax, and then testing it would seem a means to obtain a definitive answer.


      • #4
        Here is an illustration of the problem

        Code:
        program doesnotwork
            version 14
            syntax [ if ]
        end
        
        sysuse auto , clear
        doesnotwork if foo == 42
        produces

        Code:
        . sysuse auto , clear
        (1978 Automobile Data)
        
        . doesnotwork if foo == 42
        foo not found
        r(111);
        
        end of do-file
        Now imagine doesnotwork was use and imagine the dataset myfoodata.dta contained the variable foo, then

        Code:
        sysuse auto , clear
        use if foo == 42 using myfoodata.dta
        works, even though foo is not a variable in the (currently loaded) auto dataset. You may not want to do this often, but it would sure be nice if you could add some qualifier to the if statement in syntax that would prevent checking the following expression. You can program this using regular expressions (or other means). Whether it is worth the effort, I cannot tell.

        Edit: see approach in #5 below.

        Best
        Daniel
        Last edited by daniel klein; 10 Apr 2018, 07:27.


        • #5
          Actually, depending on what exactly Malcolm wants to do,


          Code:
          program mycmd
              version 14
              
              syntax anything(everything)
              
              parse anything
              ...
          end
          might be a good start. The anything(everything) specification causes any if, in, and using elements to be placed into anything.

          Best
          Daniel


          • #6
            Originally posted by daniel klein
            The anything(everything) specification causes any if, in, and using elements to be placed into anything.

            Best
            Daniel
            Thanks Daniel. This is exactly what I was getting at. The use command allows the use of if/in conditional on the dataset being loaded, but a user written program cannot ordinarily pass through the if/in using the [if] [in] syntax because it will check the current dataset for the variables and break out immediately.

            Thanks for the suggestion. I never did quite grasp what anything(everything) was doing until now. Consider the following

            Code:
            sysuse auto
            save auto
            
            program autouse
            syntax anything(everything)
            use auto `anything'
            end
            
            autouse if headroom<=3.5 in 1/10
            list headroom
            autouse in 1/1000
            autouse if foo ==1
            This properly passes if and in to the "use auto" statement without complaint and will then kick out an error only if the conditions fail to validate on the using dataset.
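            For comparison, here is a minimal sketch of the same wrapper written with the form of use that the help file documents for qualifiers, namely use [varlist] [if] [in] using filename. The program name autouse2 is hypothetical, and the sketch assumes that auto.dta can be written to the current working directory.

            ```stata
            capture program drop autouse2
            program autouse2
                version 14
                // collect everything typed after the command name,
                // including if and in, without validating it
                // against the data in memory
                syntax anything(everything)
                // hand the qualifiers to use, which evaluates them
                // against the using dataset
                use `anything' using auto , clear
            end

            sysuse auto , clear
            save auto , replace    // assumes a writable working directory

            autouse2 if headroom <= 3.5 in 1/10
            list headroom
            ```

            Because the qualifiers are only interpreted by use, a condition on a variable that is missing from auto.dta should still produce an error, but it comes from use rather than from syntax.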

            It's sort of too bad that syntax can't declare a separate varlist or namelist prior to the if/in declaration, but it's a start.


            • #7
              Originally posted by Malcolm Wardlaw
              It's sort of too bad that syntax can't declare a separate varlist or namelist prior to the if/in declaration, but it's a start.
              You would need to program this, parsing the anything statement with low-level parsing commands, such as gettoken and/or regular expressions. Here is one approach

              Code:
              program mycommand
                  version 14
                  
                  syntax anything(everything)
                  
                  while (`"`anything'"' != "") {
                      gettoken tok : anything
                      if (inlist(`"`tok'"', "if", "in")) {
                          continue , break
                          // NotReached
                      }
                      gettoken var anything : anything
                      local varlist `varlist' `var'
                  }
                  
                  display "varlist is : `varlist'"
                  display "anything is: `anything'"
              end
              
              mycommand foo bar if (foo == 42)
              gives

              Code:
              . mycommand foo bar if (foo == 42)
              varlist is : foo bar
              anything is:  if (foo == 42)
              You will probably need something more sophisticated for anything serious and/or in the public domain. For example, if(exp) (note missing blank after if) will not be parsed correctly. There are likely many more problems, which I leave for you to figure out.
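              One way to handle the if(exp) case, as a sketch only: peek at the next token with "(" added to the parsing characters, so that if(foo is split into if and the rest, while still consuming names on blanks alone. The program name mycommand2 is hypothetical. The sketch does not deal with using, options after a comma, or other edge cases.

              ```stata
              capture program drop mycommand2
              program mycommand2
                  version 14

                  syntax anything(everything)

                  while (`"`anything'"' != "") {
                      // peek only: "(" is a parsing character here,
                      // so "if(foo" yields the token "if"
                      gettoken tok : anything , parse(" (")
                      if (inlist(`"`tok'"', "if", "in")) {
                          continue , break
                      }
                      // consume the next name, splitting on blanks only
                      gettoken var anything : anything
                      local varlist `varlist' `var'
                  }

                  display "varlist is : `varlist'"
                  display `"anything is: `anything'"'
              end

              mycommand2 foo bar if(foo == 42)
              ```

              With this change the loop should stop at if even without the blank, leaving if(foo == 42) in anything, so the qualifier can be passed on to another command unchanged.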

              Best
              Daniel


              • #8
                Thanks again.

                I'd been playing around with tokenizing the input. My first impulse was to just parse it with regular expressions, because it's a hammer I know how to swing pretty well and I like having 2 problems.

                Honestly, the scripts are largely convenience functions for myself, so I can live with a little bit of edge case weirdness, but I wanted to thank you for the suggestion. Tokenizing the inputs and looping over them seems to be the preferred way of parsing Stata programs with complex inputs, and it's useful to see a custom example like this.

                Cheers,
                Malcolm
