  • if statement vs if condition

    Here's something I've wondered about for a long time. I often find myself writing code like this:
    /* do something */ if x==0
    /* do something else */ if x==0

    It always seems to me that I should be able to get the same result like this:
    if x==0 {
    /* do something */
    /* do something else */
    }

    But I never do get the same result. Why not? What am I not understanding about the difference?

  • #2
    Many people have wondered about this, myself included, which is why someone wrote a FAQ
    on it: https://www.stata.com/support/faqs/p...-if-qualifier/

    But I must admit, it feels a bit strange.


    • #3
      OK, so is there any alternative to writing a million statements that all have the same if condition?


      • #4
        In the sense that you'd like a statement that does this:
        Code:
        if x==0 {
        /* do something */
        /* do something else */
        }
        But, for all observations and with commands that don't support the ', if' option?

        I'd say you could loop over your observations. For example, say you've got 8,000 observations:

        Code:
        forval i = 1/8000 {
            if var[`i'] == 2017 {
                display "I'm printing a statement here, but you can do whatever you like"
            }
        }
        But in most cases the if qualifier is the quickest, I'd say. Every time I've googled for things like 'loop over obs stata', there's a warning from some experienced Stata users to be careful with loops like that. I'm not exactly sure why, though.


        • #5
          Originally posted by paulvonhippel
          OK, so is there any alternative to writing a million statements that all have the same if condition?
          That really depends on what exactly you want to do. Perhaps

          Code:
          preserve
          keep if ...
          do stuff // no if qualifier needed
          restore
          will do?
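
          To make that pattern concrete, here is a minimal sketch. The auto dataset that ships with Stata, the condition foreign == 1, and the commands in the middle are assumptions chosen purely for illustration; substitute your own data and condition.

```stata
sysuse auto, clear
local n_all = _N                      // remember the full sample size

preserve
keep if foreign == 1                  // restrict the data once
summarize price                       // no if qualifier needed from here on
generate double lprice = log(price)
summarize lprice
restore                               // full dataset is back; lprice is gone

assert _N == `n_all'
```

          Note that restore discards everything done since preserve, including new variables, so this approach suits reporting and estimation on a subset rather than creating variables you want to keep.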

          Originally posted by Jesse Tielens
          In the sense that'd you like a statement that does this:
          Every time I've googled for things like 'loop over obs stata' there's a warning from some experienced Stata users to be careful with loops like that. I'm not exactly sure why though.
          It is because looping over observations is very slow and usually there are better alternatives.

          Best
          Daniel
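
          As a rough illustration of the point about loops, the sketch below does the same job twice on the auto dataset: once with an explicit loop over observations and once with a single statement using the if qualifier. The dataset, the variable names flag_loop and flag_qual, and the condition rep78 == 3 are assumptions for the example only.

```stata
sysuse auto, clear

* Explicit loop over observations: one replace per matching row
generate byte flag_loop = 0
forvalues i = 1/`=_N' {
    if rep78[`i'] == 3 {
        quietly replace flag_loop = 1 in `i'
    }
}

* Same result with the if qualifier: one statement, implied loop
generate byte flag_qual = 0
replace flag_qual = 1 if rep78 == 3

* Both approaches flag exactly the same observations
assert flag_loop == flag_qual
```

          Writing `=_N' instead of a hard-coded count also keeps the loop correct if the number of observations changes.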


          • #6
            is there any alternative to writing a million statements that all have the same if condition?
            What precisely is an example of the same if condition?

            There are two syntaxes because there are two quite distinct operations. The if command makes one decision based on truth or falsity of what is tested. That test need not make any reference to variables in the data. Thus many Stata programs have early syntax something like

            Code:
            marksample touse 
            count if `touse' 
            if r(N) == 0 error 2000
            The essence there is: count how many observations are available to do the thing the command is designed to do. If there are none, bail out with an error message. The if command works on the returned result r(N) left in the wake of count.

            In this example nothing happens if there are observations. Stata just proceeds to the next statement.
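
            For readers who have not seen those lines in context, here is a minimal sketch of a program built around them. The program name mymean and its behaviour are hypothetical, invented for this illustration; only marksample, count, error and summarize are standard Stata commands.

```stata
capture program drop mymean
program define mymean
    syntax varname [if] [in]
    marksample touse                  // 1 for usable observations, 0 otherwise
    quietly count if `touse'          // if qualifier: implied loop over observations
    if r(N) == 0 error 2000           // if command: one decision on a single number
    summarize `varlist' if `touse', meanonly
    display "mean = " r(mean)
end

sysuse auto, clear
mymean price if foreign == 1

* No car has a negative price, so this call should stop with error 2000
capture noisily mymean price if price < 0
```

            Both kinds of if appear side by side here: the qualifier selects observations, and the command then acts once on the returned result.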

            If the if command (note how these typographical distinctions can help) does make reference to a variable without a subscript then the value in the first observation is used. That assumption is a surprise to many. The FAQ cited has, in my view, the most common question backward. The most common question is why that if command produces puzzling results. It is usually because the first observation is being used.
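
            A small sketch of that behaviour, using three made-up values of x (the data are an assumption for illustration): the variable contains zeros, but not in the first observation, so the if command does not take the branch.

```stata
clear
input x
1
0
0
end

* if command: x is read as x[1], which is 1 here
if x == 0 {
    display "branch taken"
}
else {
    display "branch not taken: x[1] = " x[1]
}

* if qualifier: every observation is tested
count if x == 0
```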

            The if qualifier (not an option!) often makes reference to variables although not doing so is perfectly legal. When it does make reference to variables without subscripts then there is an implied loop over observations. Here is an example

            Code:
            gen neglog = log(x) if x > 0 
            replace neglog = 0 if x == 0 
            replace neglog = -log(-x) if x < 0
            Each of these statements is a loop over x[1], x[2], x[3], and so forth.

            This is a terrible example in the sense that experienced Stata programmers would often prefer to avoid if there, but no matter.
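
            One way to avoid if there, offered as an assumption about what such an alternative could look like rather than as the approach the post has in mind, is a single generate using the cond() function. The five values of x are made up for the check.

```stata
clear
input x
-10
-1
0
1
10
end

* Three statements with the if qualifier, as in the post
generate neglog = log(x) if x > 0
replace neglog = 0 if x == 0
replace neglog = -log(-x) if x < 0

* One statement with nested cond(), no if qualifier
generate neglog2 = cond(x > 0, log(x), cond(x < 0, -log(-x), 0))

assert neglog2 == neglog
```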

            What often confuses people is knowing other languages! In some, the if command is actually a loop over data. To people used to many languages, the if qualifier is very puzzling on first acquaintance.

            Mata is more like other languages in this and other respects!


            • #7
              If you do know other languages, an "if condition" of the form

              Code:
              replace y = 1  if x == 0
              is akin to working on a subset of the data. In pseudo-code
              Code:
              y[x == 0] = 1
              That is, "if x == 0" is similar to an array of True/False or 1/0 values in languages that support that type of subscripting of objects. (That's not always the case, of course, since Stata allows for some flexibility in this regard, but as a general rule I find this a helpful way to think about it.) By contrast, the "if statement" is just like if statements in any other language: It evaluates a single expression and then executes the code that follows if true, and ignores it if not.
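
              The mask analogy can be made explicit in Stata itself. In this sketch (the data and the variable names mask, y and y2 are assumptions for illustration), the 1/0 array is stored as an ordinary variable and then used as the condition.

```stata
clear
input x
0
3
0
7
end

generate byte mask = (x == 0)     // the 1/0 "array", one value per observation

generate y = .
replace y = 1 if x == 0           // condition written inline

generate y2 = .
replace y2 = 1 if mask            // same condition, via the stored mask

list
assert y == y2
```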

              I suspect what might be confusing is that Stata often interprets the variable name "x" as "x[1]", that is, the first value of variable "x". So

              Code:
              if x == 0 {
              disp "hi"
              }
              is actually

              Code:
              if x[1] == 0 {
              disp "hi"
              }
              And you can see that "x[1] == 0" compares a single number with a single other number. You can confirm this by typing "disp x" and seeing that it prints the first value of x and nothing more.
