  • Bug with strofreal()

    Hello,

    I have encountered an issue while trying to convert reals to strings.

    Here is my initial issue: I have a program that stores different values of a variable, does some processing in Mata, and then uses these values in a Stata command.
    • I have the exact value as a real in Mata
    • I convert it to a string using the strofreal() function
    • I build my Stata command with the sprintf() function
    I know that floating-point numbers have limited precision because the computer stores them in base 2. I tested and deduced that Stata/Mata considers two values equal if their first 15 decimals are the same.
    Code:
    1.0  + 10^-15  == 1.0     // false
    1.0  + 10^-16  == 1.0     // true
    0.1  + 10^-17  == 0.1     // false
    0.1  + 10^-18  == 0.1     // true
    0.01  + 10^-17  == 0.01   // false
    0.01  + 10^-18  == 0.01   // true
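The thresholds above shift with the size of the number because the gap between neighbouring doubles is relative, not a fixed count of decimals. A minimal sketch in Python, not Mata, on the assumption that both store reals as IEEE 754 doubles; is_absorbed is a hypothetical helper name, and math.ulp requires Python 3.9 or later:

```python
import math


def is_absorbed(x: float, delta: float) -> bool:
    """Return True when x + delta rounds back to x in double precision."""
    return x + delta == x


# math.ulp(x) is the gap between x and the next larger double.
# An increment well below half of that gap is rounded away.
for x, delta in [(1.0, 1e-15), (1.0, 1e-16), (0.1, 1e-17), (0.1, 1e-18)]:
    print(f"x={x} delta={delta} ulp(x)={math.ulp(x):.3e} "
          f"absorbed={is_absorbed(x, delta)}")
```

The gap at 1.0 is 2^-52 and the gap at 0.1 is 2^-56, which is why the cutoff moves by roughly one decimal place between the two.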
    For values with no leading digit, I deduced that the smallest power of ten that is still distinguished from 0 is 10^-307:
    Code:
    10^-307 == 0        // false
    10^-308 == 0        // true
    The next step was to check what format in strofreal() allowed me to get back the right value after the conversion.
    Code:
    strtoreal(strofreal(10^-307, "%22.0g")) == 10^-307    // false
    strtoreal(strofreal(10^-307, "%23.0g")) == 10^-307    // true
    But after brute-forcing it, it seems that a width of 24 handles most cases:
    Code:
    for (i = -308; i <= 308; i++) {
        x_num = 10^i
        x_str = strofreal(x_num, "%24.0g")
        x_back = strtoreal(x_str)
        are_same = (x_back == x_num)
        if (are_same == 0) {
            i
            x_num
            x_str
            x_back
            x_num - x_back
        }
    }
    But there is still a difference for 10^19 and 10^20, and no width seems to correct it.

    Worse:
    • this does not happen on every run
    • with larger widths, other values such as 10^21, 10^22 and 10^23 no longer round-trip either
    When I look at the result for 10^19, for example, it does not make sense:
    Code:
    strofreal(10^19, "%23.0g")    //    1.0000000000000000e+19
    strofreal(10^19, "%24.0g")    //    1000000000000000000.\x17
    strofreal(10^19, "%25.0g")    //    1000000000000000000.\x17
    strofreal(10^19, "%26.0g")    //    1000000000000000000.\x17
    
    sprintf("%s", strofreal(10^19, "%23.0g"))    //    1.0000000000000000e+19
    sprintf("%s", strofreal(10^19, "%24.0g"))    //    1.00000000000000000e+19
    sprintf("%s", strofreal(10^19, "%25.0g"))    //    1.000000000000000000e+19
    sprintf("%s", strofreal(10^19, "%26.0g"))    //    1000000000000000000.8\x0c\x1c
    Hence, I have a few questions:
    1. Does my approach of injecting the value back into the Stata command make sense?
    2. Is there a known width that would suffice?
    3. What is the issue with 10^19 and 10^20?
    4. How can I handle it?
    Best regards,
    Mael Astruc--Le Souder

  • #2
    I have not read everything so I am not even sure why you need the conversion from numeric to string. You could probably store the values in a (temporary) scalar and pass that to Stata. Anyway, the solution to your problems is %21x (there is a second part to that blog entry).

    Code:
    for (i = -308; i <= 308; i++) {
        x_num = 10^i
        x_str = strofreal(x_num, "%21x") // <- use %21x format here
        x_back = strtoreal(x_str)
        are_same = (x_back == x_num)
        if (are_same == 0) {
            i
            x_num
            x_str
            x_back
            x_num - x_back
        }
    }
    Last edited by daniel klein; 07 Aug 2024, 14:19.
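A hexadecimal format writes out the binary significand and exponent of the double as they are stored, so nothing is rounded on the way out or back in. A minimal sketch of the same round trip in Python, not Mata, on the assumption that both use IEEE 754 doubles. float.hex() and float.fromhex() only stand in for strofreal(x, "%21x") and strtoreal(); the text they produce is not Stata's %21x output, and the helper names are hypothetical:

```python
def roundtrip_hex(x: float) -> float:
    """Convert x to a hexadecimal string and parse it back."""
    return float.fromhex(x.hex())


def roundtrip_decimal(x: float, digits: int) -> float:
    """Convert x to a decimal string with `digits` significant digits and parse it back."""
    return float(format(x, f".{digits}g"))


# Same brute-force check as in the thread, over powers of ten.
hex_failures = [i for i in range(-307, 308)
                if roundtrip_hex(10.0 ** i) != 10.0 ** i]
print("hex round-trip failures:", hex_failures)

# A decimal string with too few significant digits loses information.
x = 0.1 + 0.2
print(roundtrip_decimal(x, 15) == x)  # 15 significant digits
print(roundtrip_decimal(x, 17) == x)  # 17 significant digits
```

In this sketch the hexadecimal round trip never fails, while the decimal one depends on how many significant digits the format actually prints.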



    • #3
      Thank you very much, it seems I skipped this part of the manual.

      I tried with "%21x" and it works perfectly. The blog post was also very instructive, thanks for sharing it.

      I still don't understand this weird formatting issue with 10^19, but if it works, that's good enough for me.



      • #4
        Mael Astruc Beyond 2^53, a double can no longer represent every integer exactly, as you can see here:

        Code:
        strofreal(2^53,     "%24.0g") // correct
        strofreal(2^53 + 1, "%24.0g") // wrong
        strofreal(2^53 + 2, "%24.0g") // correct
        strofreal(2^53 + 3, "%24.0g") // wrong
        I will grant you that it's odd that 10^19 and 10^20 in particular trigger this bug, so there's probably something off in strofreal() that's causing the problem. However, at those magnitudes, you shouldn't expect the conversion to work. %21x stores the underlying representation of the number, so it's lossless. Btw, when you noted

        Originally posted by Mael Astruc
        I know that floats have a precision issue due to the underlying base 2 of the computer. I tested and deduced that Stata/Mata considers two values equal if the first 15 decimals are the same.
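        you're close to being right. The so-called "machine epsilon" of a double is 2^-52 (about 2.2e-16), the relative gap between adjacent doubles. A difference smaller than half of that, 2^-53 (about 1.1e-16) relative to the size of the number, can be rounded away, so two such numbers can be "the same" in the eyes of the computer.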
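The 2^53 limit can be checked directly with exact integer arithmetic. A minimal sketch in Python, not Mata, on the assumption that both use IEEE 754 doubles; is_exact is a hypothetical helper name:

```python
import sys


def is_exact(n: int) -> bool:
    """Return True when the integer n survives conversion to a double unchanged."""
    return int(float(n)) == n


LIMIT = 2 ** 53

# Above 2^53 only every second integer is representable.
for offset in range(4):
    print(f"2^53 + {offset}: exact = {is_exact(LIMIT + offset)}")

# Larger integers can still be exact when they contain enough factors of 2:
# 10^19 = 2^19 * 5^19, and 5^19 is below 2^53.
for power in (19, 20, 22, 23):
    print(f"10^{power}: exact = {is_exact(10 ** power)}")

# Relative gap between adjacent doubles (machine epsilon).
print(sys.float_info.epsilon == 2.0 ** -52)
```

In this sketch 10^19 and 10^20 are exactly representable as doubles, which seems consistent with the suspicion that the odd output for those values comes from the formatting step and not from the stored number.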



        • #5
          Mauricio Caceres thank you for your response. I saw that the precision was defined by the 53 bits, but I didn't connect the dots with the 10^-16; I thought 2^-53 was much larger. Thank you for highlighting it, everything makes much more sense now.

          The "%g" format approach was flawed from the start, so I will stick with "%21x" for now. Ideally, I will rewrite the code using daniel klein's suggestion of a temporary scalar, but right now my issue is that I have to re-inject an unknown number of scalars into the command, which is a bit tricky and will require a deeper rewrite using a vector and indices.
