  • forvalues loop 0.05(0.05)0.15 only returns 0.05 and 0.1

    Hi Statalist,

    The following issue can be worked around but I'd like to know how this happens regardless because of how odd it is.

    So, in the process of setting up some nested for-loops that iterate through various values within a grid-search, I noticed that specific values for a loop do not return what they should (or at least what I expected):

    Code:
    forvalues i = 0.05(0.05)0.15 {
        di `i'
    }
    returns 0.05 and 0.1

    For every other forvalues-loop I had been using thus far the max value of the interval is also returned. See this super simple example:

    Code:
    forvalues i = 1(1)3 {
        di `i'
    }
    returns 1, 2, and 3 as you'd expect

    Now, I tried to check if there was an issue with the specific values in 0.05(0.05)0.15 so I extended the interval to 0.2 and 0.25 but then the loop returns the values including the max value of the interval just like I'd expect it to:

    Code:
    forvalues i = 0.05(0.05)0.25 {
        di `i'
    }
    returns 0.05, 0.1, 0.15, 0.2, and 0.25

    For the loop that was supposed to iterate through the values 0.05, 0.1, and 0.15, I instead used this:
    Code:
    foreach i in 0.05 0.1 0.15 {
        di `i'
    }
    It works, but I feel like I shouldn't have to do this. I'm new to Stata (thus far I have mostly used Python and SQL), so I wonder whether this is somehow intended or a bug, because I have absolutely no clue why the first loop I posted doesn't work as intended.

    Using Stata 15.1 (up to date)

  • #2
    Hi Olaf
    What you are finding is really a problem of precision (I run into this odd behavior often).
    When working with decimals, Stata (like other software) converts back and forth between decimal and binary representations.
    As a consequence, loops like the one you posted sometimes won't produce what you want.
    Here are a couple of alternative examples:
    Code:
    . forvalues i = 0.05 (0.05) 0.15 {
      2.         display %20.17f `i'
      3. }
     0.05000000000000000
     0.10000000000000001
    
    . 
    . foreach i of numlist 0.05 (0.05) 0.15 {
      2.         display %20.17f `i'
      3. }
     0.05000000000000000
     0.10000000000000001
     0.14999999999999999
    
    forvalues i = 1 / 3  {
        local j = `i'*0.05
        display %20.17f `j'
    }
     0.05000000000000000
     0.10000000000000001
     0.14999999999999999
    It seems to me that the second option may produce what you want more often than not, but option 3 is also often recommended.
    HTH


    • #3
      What you should code is
      Code:
      forval i = 1/3 { 
            di `i' * 0.05 
      }
      as explained at https://www.stata-journal.com/articl...article=pr0051

      In a nutshell, note that this is a consequence of precision problems, i.e. the fact that computers use binary approximations is biting you.

      Code:
      .  mata : strofreal(0.05 * (1..3), "%23.18f")
                                1                      2                      3
          +----------------------------------------------------------------------+
        1 |  0.050000000000000003   0.100000000000000006   0.150000000000000022  |
          +----------------------------------------------------------------------+
      Most multiples of 0.05 can't be held exactly in binary. Here the computed third value, 3 × 0.05, comes out slightly more than the double stored for 0.15, and so lies beyond the top limit you gave for your loop.
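
      Since the original question mentions nested loops for a grid search, here is a minimal sketch of the same integer-counter idea applied to two nested loops. The macro names alpha and beta, the grid sizes (3 and 5) and the shared step of 0.05 are hypothetical and only serve as illustration.

      Code:
      * both loops run over integer counters, which are held exactly
      forvalues a = 1/3 {
          * derive the decimal grid value from the counter
          local alpha = `a' * 0.05
          forvalues b = 1/5 {
              local beta = `b' * 0.05
              display "alpha = " %4.2f `alpha' "   beta = " %4.2f `beta'
          }
      }
      Because the loop limits are integers, the number of iterations no longer depends on how the decimal endpoints are rounded.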


      • #4
        Thanks guys, the quick responses, info, alternatives and linked article, which I'll take a look at right now, are all very much appreciated!

        Btw, I can recreate this in Python as well (unsurprisingly): 0.05*3 returns 0.15000000000000002. It seems odd that I never encountered this issue until now, but I presume it won't be the last time.
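
        For what it's worth, the same check can be sketched in Stata itself. This assumes nothing beyond -display- and ordinary double arithmetic. Whether -forvalues- builds its steps by repeated addition or by multiplication is not stated in this thread, so the sum below is only an illustration.

        Code:
        * value obtained by adding 0.05 three times, shown to 18 decimals
        display %23.18f 0.05+0.05+0.05
        * the double stored for the literal 0.15
        display %23.18f 0.15
        * shows 1 if the summed value overshoots the literal upper limit
        display (0.05+0.05+0.05)>0.15
        If the last line shows 1, the third step already lies beyond the upper limit, which is consistent with the loop stopping at 0.1.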

        Have a good one!


        • #5
          In addition, Asjad Naqvi has explained it recently in his blog: https://medium.com/the-stata-guide/t...n-f66a68c99bfc

          I have to say after reading this blog entry I might go back to this blog entry every time I start a new data project. It is a good reminder of what can go wrong....


          • #6
            One safe approach (to me, and contrary to the manual advice) is to use -foreach- as shown below:

            Code:
            foreach i of numlist 0.05(0.05)0.15 {
                di `i'
            }
            The manual (see -help foreach-) advises against this because Stata must first expand and store the numeric list, which becomes slow when the list contains many equally spaced numbers. For many regularly spaced values, -forvalues- is the better approach because each value is computed on the fly.
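
            To see what "expand and store" means in practice, here is a small sketch using the -numlist- command, which performs the same kind of expansion and returns the elements in r(numlist). It is meant only as an illustration of the up-front expansion, not as a description of how -foreach- is implemented internally. The local name n is hypothetical.

            Code:
            * expand the numlist once, up front
            numlist "0.05(0.05)0.15"
            * the expanded elements are returned in r(numlist)
            display "`r(numlist)'"
            * count how many elements were generated
            local n : word count `r(numlist)'
            display `n'
            For a short list like this the cost is negligible. It only matters when the list has very many elements.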

            Your initial approach stumbles because of how computers store decimal numbers: most decimal fractions have no exact finite representation in binary, so some approximation is required. Let's look at how the values of 0.05, 0.15 and 3*0.05 are stored, using hexadecimal notation:

            Code:
            . di %21x .05
            +1.999999999999aX-005
            
            . di %21x .15
            +1.3333333333333X-003
            
            . di %21x 3*.05
            +1.3333333333334X-003
            
            . di %21x `=3*.05'
            +1.3333333333333X-003
            Stata interprets literal (bare) numbers as double precision. In hexadecimal representation, 3*0.05 carries slightly more rounding error than 0.15 does: compare the final hex digit of the mantissa, 4 versus 3. The last example first evaluates 3*0.05 inside a macro expansion and then displays the hexadecimal representation of the substituted result, showing that it matches the second example.

            So what can one do, especially if there are many equally spaced values? You can add a small number to the upper limit of your numeric range. The number must be smaller than your increment, or else you have just added another iteration. In this example you could add 0.001 (say), but it is usually clearer to add the smallest positive number (epsilon) that, when added to 1 and stored as a double, does not equal 1. This is a constant in Stata called -c(epsdouble)-.

            Code:
            forval i = 0.05(.05)`=0.15 + c(epsdouble)' {
                di `i'
            }
            Result

            Code:
            . forval i = 0.05(.05)`=0.15 + c(epsdouble)' {
              2.     di `i'
              3. }
            .05
            .1
            .15
            Edit: crossed with #2, #3 and #4.
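
            A variation on the same idea, sketched under the assumption that the range endpoints are held in local macros: pad the upper limit by half the increment, which is always smaller than the increment itself. The macro names start, step, stop and upper are hypothetical.

            Code:
            * hypothetical range definition
            local start = 0.05
            local step  = 0.05
            local stop  = 0.15
            * pad the limit by half a step so rounding cannot drop the last value
            local upper = `stop' + `step'/2
            forvalues i = `start'(`step')`upper' {
                display `i'
            }
            Half a step is a much coarser pad than -c(epsdouble)-, so it is less sensitive to how large the endpoints are, but it is still too small to admit an extra iteration.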
