  • capture command

    Hello!
    I am writing a default do-file to apply to similar but not identical datasets, so I need to use the -capture noisily- command to keep the program from breaking when a command is not applicable to a specific dataset.
    The problem is that I have to write -capture noisily- before every single command, which makes the do-file very messy and difficult to read. Is there another way to make it cleaner while still preventing the do-file from breaking whenever it doesn't find a variable?

    Thank you!

  • #2
    -capture- and -capture noisily- can also be used on blocks of consecutive lines of code:

    Code:
    capture {
        clear*
        invalid command produces no output and does not break
        display "This won't display"
    }
    capture noisily {
            display "But this will"
            display "And so will this"
            display "And everything until the closed brace"
            invalid command produces error message but does not break
            display "As shown by this line's displaying"
    }
    display "This will display"



    • #3
      Notwithstanding the above code, may I suggest a safer programming practice? You state that the difference between the files is that one of them contains a variable that the other doesn't. Let's call that variable extra. You can guard the blocks of code that refer to variable extra as follows:

      Code:
      capture confirm variable extra, exact
      if c(rc) == 0 {
           // INSERT BLOCK OF COMMANDS REFERRING TO
           // VARIABLE extra HERE
      }
      This is better programming style because if there is some other difference between the two data sets in addition to one having an extra variable, your original approach will let the code blunder on through error after compounding error and produce nonsense that may masquerade as legitimate output. The code here will skip over code that is irrelevant when the variable extra does not exist, but if there is some other issue in the data set that you may not be aware of, the code will break and you will be made aware of the problem.



      • #4
        Thanks for your suggestions! The problem with the second suggestion is that I have a lot of variables and my do-file is organized by task (label all variables, then label values for all variables, then create new variables from the old ones, ...). So the second suggestion would make it enormous.

        I have problems like the one below: the variable racacor may be a string in one dataset (if it takes the string values shown in the first line below) or numeric in another, so I duplicated the code. Is there a smarter way of doing this?

        capture noisily replace racacor = "98" if racacor == "06"|"09"|"1M"|"1G"|"DE"|"D"|"87"
        capture noisily replace racacor = 98 if racacor == 06|09|87


        Thank you again!



        • #5
          Well, those commands in #4 are perfect examples of why doing a lot of -captures- is a bad idea! Both of those commands are syntax errors and will simply not be executed. The substitution you seek will not occur. And you won't know until you work with the output and find that it isn't at all what you were looking for.

          Code:
          capture noisily replace racacor = "98" if inlist(racacor, "06", "09", "1M", "1G", "DE", "D", "87")
          capture noisily replace racacor = 98 if inlist(racacor, 6, 9, 87)
          is valid syntax that will do what I assume you intended.

          But the safer way would be:

          Code:
          capture confirm string var racacor
          if c(rc) == 0 { // CASE OF STRING VARIABLE
               replace racacor = "98" if inlist(racacor, "06", "09", "1M", "1G", "DE", "D", "87")
          }
          capture confirm numeric var racacor
          if c(rc) == 0 { // CASE OF NUMERIC VARIABLE
               replace racacor = 98 if inlist(racacor, 6, 9, 87)
          }
          Then there is another possibility, that there is no variable racacor at all. If that's OK, nothing more is needed. But if that indicates an error:
          Code:
          quietly des racacor // BREAKS IF racacor NOT FOUND; DOES NOTHING OTHERWISE
          If you are processing the file one variable at a time, you can just have blocks of code like that, one for each variable. This is a much safer way to proceed because you are likely to make mistakes when you type your code (we all do), and the indiscriminate use of -capture- will keep you from discovering them--you will end up with incompletely processed files, but you won't know that until you try to use them later and they don't live up to your expectations. If you head each block of code with a comment describing what variable is being processed, and if you leave a few lines of whitespace between each block, the do-file will be well organized, easily readable, and easy to understand.

          I notice belatedly that you say your do-file proceeds task-wise rather than variable-wise. That may be un-wise in this case. For example, the -label values- task makes no sense for non-numeric variables. I think you will find that the whole thing works better if you do it variable by variable, and you can include in each variable's block of code those tasks that are appropriate to the particular data type of the variable that you actually encounter.

          I would worry more about the correctness of the do-file, its ability to correctly carry out your intentions, than I would about its size. A compact do-file that gets it wrong isn't very useful.
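
One way to keep a variable-by-variable layout compact is to loop over the variable names and put the existence and type guards inside the loop. What follows is only a minimal sketch under stated assumptions: the toy data and the names idade and sexo are hypothetical stand-ins (only racacor comes from this thread), the display lines mark where the real per-variable commands would go, and -clear- drops whatever data is in memory.

```stata
* Toy data standing in for one dataset: racacor is string here,
* idade is numeric, and sexo is absent (idade and sexo are hypothetical names).
clear
set obs 3
generate str2 racacor = "06"
generate idade = 30

local skipped ""
foreach v in racacor idade sexo {
    * Guard 1: skip this variable's block if it is not in the dataset
    capture confirm variable `v', exact
    if c(rc) != 0 {
        local skipped `skipped' `v'
        continue
    }
    * Guard 2: branch on the storage type actually encountered
    capture confirm string variable `v'
    if c(rc) == 0 {
        display "`v' is string: string-specific commands go here"
    }
    else {
        display "`v' is numeric: numeric-specific commands go here"
    }
}
display "not found in this dataset: `skipped'"
```

Only the two -confirm- checks are captured, so any mistake in the commands placed inside the branches still stops the do-file.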

          Last edited by Clyde Schechter; 07 Jun 2015, 22:29.



          • #6
            Corrections to #5.

            -capture noisily replace racacor = "98" if racacor == "06"|"09"|"1M"|"1G"|"DE"|"D"|"87"- is a syntax error, as I said, but
            -capture noisily replace racacor = 98 if racacor == 06|09|87- is not. But it will not do what I think is intended.

            The -if- clause will be interpreted as follows:
            Stata will look to see if racacor == 06, and if so the expression is true. If not, it then interprets 09 not as another value to compare racacor with, but as a Boolean expression in its own right. In Stata, numbers and numeric variables can function as Boolean expressions and are interpreted as true if they are non-zero, false otherwise. Since 09 != 0, this disjunct evaluates to true, so the entire -if- condition evaluates to true. So, the command shown will replace racacor = 98 in all observations.
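
This can be checked on a toy dataset. The sketch below assumes made-up data in which racacor takes the values 1 to 5, so no observation equals 6, 9, or 87; the local names n_or and n_inlist are only for illustration, and -clear- drops whatever data is in memory.

```stata
* Made-up data: racacor = 1, 2, 3, 4, 5
clear
set obs 5
generate racacor = _n

* Parsed as (racacor == 6) | 9 | 87, which is true in every observation
quietly count if racacor == 06|09|87
local n_or = r(N)

* inlist() compares racacor with each listed value
quietly count if inlist(racacor, 6, 9, 87)
local n_inlist = r(N)

display "matched by == with |: `n_or'"
display "matched by inlist(): `n_inlist'"
```

The first condition matches all five observations and the second matches none, which is why the -replace- with the | form would overwrite every observation.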

            In my statement that you won't even know what went wrong, I over-reached a bit. The -capture noisily- commands will print out error messages when there are syntax errors or problems executing the command. So, in theory, you will see everything that went wrong. The problem is, I infer from your concerns about the length of the do-file, that this is going to be a very lengthy output file. And you are expecting many lines to be error messages, by design. Therefore, you are relying on your own ability to 1) spot every single error message in a long output, and 2) correctly classify those error messages as expected due to the design of your code vs an unanticipated problem. Few humans are capable of doing that with no errors. That, in fact, is one reason why the default behavior of Stata is to break whenever it finds a problem rather than just leave an error message and carry on. That default behavior is there to protect you from unrecognized mistakes, and you override it at your own risk.



            • #7
              Thank you very much for your attention and suggestions, Clyde! They were very useful!



              • #8
                I have a minor correction to #2: if one line within the braces fails, none of the subsequent lines up to the closing brace are executed. Thus, in the given example,
                "As shown by this line's displaying" will in fact not be displayed, as described in the technical note in the documentation of the capture command: "If any of the commands in the capture block fail, the subsequent commands in the block are aborted, but the program continues..."
                This is important because "capture noisily" with braces might in fact yield a very different output than "capture noisily" at the beginning of every line.
                Then again, this might just be another argument for the remarks in #3 and #5.
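
A small sketch of that behavior, assuming no variable named no_such_variable_xyz exists in memory (the name and the local macros reached and rc are hypothetical, chosen only for illustration):

```stata
local reached 0
capture noisily {
    display "this line runs"
    confirm variable no_such_variable_xyz   // hypothetical name; this command fails
    local reached 1                         // skipped: the block has already been aborted
    display "this line is never reached"
}
local rc = _rc   // return code of the command that failed inside the block
display "reached = `reached', return code = `rc'"
```

The macro reached stays at 0 and the return code is nonzero, whereas prefixing each line individually with -capture noisily- would have let the last two lines inside the braces run.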
