  • how to set the storage of number automatically to be double rather than float?

    When using float, sometimes 1.1 becomes 1.10000000001, which causes problems. For example, a and b below should be exactly the same. How can I make b equal to a? (The round function doesn't work for b.)
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(a b)
    85.3 85.30000000000001
    83.3 83.30000000000001
    end
    Last edited by Fred Lee; 08 Jan 2022, 02:00.

  • #2
    Fred:
    Code:
    . format b %12.1f
    
    . list
    
         +-------------+
         |    a      b |
         |-------------|
      1. | 85.3   85.3 |
      2. | 83.3   83.3 |
         +-------------+
    
    .
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      If you actually do want your variables to be double, use -set type double-; this can be done for particular data sets or for all; see
      Code:
      help generate



      • #4
        The question is a little confused. First, nothing whatever will make Stata store the decimal 1.1 as the exact binary equivalent of 1.1, because there is no such (finite) binary equivalent. Much the same applies generally, as resources found by

        Code:
        search precision
        explain at great length.

        Also the assertion that sometimes 1.1 becomes 1.10000000001 makes no sense without a context, but a best guess is that you are comparing the same thing displayed with different formats, or comparing different things that you expect to be the same, but are different for other reasons.
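
        As a minimal sketch of the "same thing displayed with different formats" case (the formats chosen here are only illustrative), the same literal can look exact or inexact depending on how many digits are requested, and float() shows what the value becomes after rounding to float precision:

        ```stata
        * Same value, short format: looks exact
        display %4.1f 1.1
        * Same value, long format: the binary approximation becomes visible
        display %20.18f 1.1
        * Rounded to float precision first: a coarser approximation
        display %20.18f float(1.1)
        ```

        None of these commands changes what is stored; only the display differs.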

        Backing up:

        All a double can do given most numbers with fractional parts is yield better approximations.

        Code:
        . clear
        
        . set obs 1
        Number of observations (_N) was 0, now 1.
        
        . gen float foo_float = 1.1
        
        . gen double foo_double = 1.1
        
        . format foo* %23.18f
        
        . l
        
             +---------------------------------------------+
             |            foo_float             foo_double |
             |---------------------------------------------|
          1. | 1.100000023841857910   1.100000000000000089 |
             +---------------------------------------------+
        
        . format foo* %21x
        
        . l
        
             +-----------------------------------------------+
             |             foo_float              foo_double |
             |-----------------------------------------------|
          1. | +1.19999a0000000X+000   +1.199999999999aX+000 |
             +-----------------------------------------------+
        That said, if you want to see e.g. 1.1 in displays then as Carlo Lazzaro explains that is a matter of display format.

        Otherwise -- to expand on Carlo's other point ---

        Code:
        set type double
        is seemingly what you seek, as

        Code:
        help generate
        explains.
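
        A short sketch of what that setting changes, meant to be run from a do-file (the variable names x_float and x_double are made up for illustration):

        ```stata
        clear
        set obs 1
        * Make the shipped default explicit, then create a variable under it
        set type float
        generate x_float = 1.1
        * Switch the default storage type for new variables
        set type double
        generate x_double = 1.1
        describe x_float x_double
        * The literal 1.1 is evaluated in double precision, so only x_double matches it
        assert x_float != 1.1
        assert x_double == 1.1
        assert x_float == float(1.1)
        * Restore the shipped default for the rest of the session
        set type float
        ```

        Adding the permanently option, as in -set type double, permanently-, keeps the setting across sessions.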

        There's a longstanding ding-dong debate among users about this. Some think double should be the default for numeric variable types; others think that this is usually wasteful of storage and is rarely justifiable scientifically. I am in the second camp. The StataCorp position, as just implied, is that you can choose double as your own default by all means, but the default default is float. The context is that while even limited laptop computers have more and more memory, so also datasets are often getting bigger and bigger too.


        The question needs a cross-reference to the concurrent thread https://www.statalist.org/forums/for...-are-different, which raises related but different issues.
        Last edited by Nick Cox; 08 Jan 2022, 07:17.



        • #5
          Nick Cox Thanks, Nick! As you said, the related thread is clearer: https://www.statalist.org/forums/for...-are-different

          In that post, the values of b and a are expected to be the same. I need to compare a and b for further use:
          Code:
          gen c= a==b
          How do I deal with b to make it exactly the same as a? For example, the round function doesn't work.
          Thanks a ton!



          • #6
            I didn't say the other thread was clearer. In fact I think it's even harder to follow what you seek there than in this one.

            I have posted in both threads now and don't think I can add any more. If you keep asking the same question without adding context, then sorry, but the answers will remain the same.

            See the other thread for a comment about round().



            • #7
              There is already a lot of good information here despite the confusion. Stata always converts numbers to double precision internally before performing calculations, which matters if you need to assert equality where precision issues may arise. You can use the -float()- function to round those numbers to float precision instead. In this way, you can assert equality up to that level of precision. The same strategy can also be used to check equality up to some arbitrary (small) number of decimal places.

              You should really really consider whether this is important to do at all. Your other thread says you are trying to check Excel calculations with those of Stata. This suggests you trust Excel to hold your raw data, but don't trust it to perform a calculation, which are contradictory beliefs. If you trust Excel, there should be no reason to check or replace the calculation, or else you should perform all calculations in Stata and disregard those from Excel. I don't see what is to be gained by comparing Excel results to Stata results to an (unrealistically) large degree of precision.

              Code:
              clear
              input double(a b) float c
              1.1 1.10000000000001 1.1
              3.3465433535931 3.3465433535931 3.3465433535931
              end
              list
              
              // values are not equal -- failures
              cap noi assert a==b
              cap noi assert a==c
              cap noi assert b==c
              
              // values are equal to float precision -- successes
              assert float(a)==float(b)
              assert float(a)==float(c)
              assert float(b)==float(c)
              
              // values are equal to rounded digit within float precision -- successes
              assert round(float(a), 0.001) == round(float(b), 0.001)
              assert round(float(a), 0.001) == round(float(c), 0.001)
              assert round(float(b), 0.001) == round(float(c), 0.001)
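
              Applied to the Excel check described in the other thread, the same idea can flag only the rows whose sums genuinely disagree. This is a sketch under assumptions: the variables x1, x2 and excel_sum and their values are hypothetical stand-ins for the real data, and the 1e-8 tolerance is an arbitrary choice.

              ```stata
              clear
              input double(x1 x2 excel_sum)
              40.1 45.2  85.3
              40.1 43.2  83.30000000000001
              10.5 20.25 30.85
              end
              * Recompute the sum in Stata at double precision
              generate double stata_sum = x1 + x2
              * Compare at float precision so that tiny representation noise is ignored
              generate byte mismatch = float(stata_sum) != float(excel_sum)
              * Alternative: compare with a relative tolerance instead of float()
              generate byte mismatch_tol = reldif(stata_sum, excel_sum) > 1e-8
              list if mismatch
              ```

              Only the third row, where the recorded sum is off by 0.1, should be listed; the first two rows differ from the recomputed sum by rounding noise at most.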



              • #8
                Great, Leonardo Guizzetti, thanks so much for your comments! This "float(a)==float(b)" solved my problem.

                The reason I need to compare the Excel and Stata calculations is that the Excel data are raw data, and someone modified the sums, so some of the sums are wrong. I therefore want to recompute the sums in Stata so that I can correct the Excel raw data. Because of the precision issue, when I browse the data with "a~=b", a lot of observations appear, which makes it hard to locate the wrong sums in Excel.

                Thanks, again! Your suggestion solved my problem!

