  • Why does this happen?

    pid and pid2 should be identical, but they are not.
    Could you explain why it happens?

    Code:
    import delimited "https://raw.githubusercontent.com/jayjeo/public/master/EarlyRetirementsUSA/question.csv", varnames(1) clear
    format pid %20.3f
    gen pid2=pid
    sort pid
    keep pid pid2
    format pid2 %20.3f

  • #2
    Try
    Code:
    generate double pid2 = pid
    assert pid2 == pid
    For the second time today: Which of the other major statistical software providers (including noncommercial, e.g., R, Python, Julia) has not made double precision the default numerical data type?


    • #3
      Originally posted by Joseph Coveney
      Try
      Code:
      generate double pid2 = pid
      assert pid2 == pid
      For the second time today: Which of the other major statistical software providers (including noncommercial, e.g., R, Python, Julia) has not made double precision the default numerical data type?
      Your suggestion works!
      So... basically double and float cannot be used together? This is really weird...


      • #4
        Originally posted by Jay Jeong
        So... basically double and float cannot be used together? This is really weird...
        It's not so weird when you consider that the data are pretty big integers and so were imported as double precision. (Check pid's datatype with describe.) When you try to copy numbers that require double precision into a variable that by default is only single precision, you'll end up in trouble.

        Depending upon what your objective is, you might want to consider importing and working with long ID-type data as strings.
        Code:
        import delimited "https://raw.githubusercontent.com/jayjeo/public/master/EarlyRetirementsUSA/question.csv", ///
            varnames(1) stringcols(1) clear
        generate pid2 = pid // optionally specify -str- or even -strL- as datatype
        assert pid2 == pid
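
        To see the precision issue in isolation, here is a minimal sketch. It assumes a made-up 14-digit ID (not a value from the linked file) and requests the float type explicitly, so it does not depend on whatever -set type- is in effect. A float holds consecutive integers exactly only up to 16,777,216 (2^24), so an ID of this size cannot survive the copy.

        ```stata
        clear
        set obs 1
        generate double id = 20190100000101      // hypothetical ID, for illustration only
        generate float  id_float  = id           // single precision: the value is rounded
        generate double id_double = id           // double precision: the value is kept exactly
        display %20.0f id_float[1]               // show what the float copy actually holds
        assert id_double == id                   // the double copy matches
        assert id_float  != id                   // the float copy does not
        assert float(16777217) == 16777216       // first integer a float cannot represent
        ```

        Running -describe- after the import, as suggested above, shows which storage type each variable actually received.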


        • #5
          Are these really identifiers? Why 3 decimal places?


          • #6
            Originally posted by Joseph Coveney
            It's not so weird when you consider that the data are pretty big integers and so were imported as double precision. (Check pid's datatype with describe.) When you try to copy numbers that require double precision into a variable that by default is only single precision, you'll end up in trouble.
            Yes, that is what I was saying. So calculations that mix double and float variables are not possible? That sounds absurd... Is it really true?
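
            A note on that question, as a minimal sketch with made-up variable names (x, y, total): Stata evaluates expressions in double precision whatever the storage types involved, so float and double variables can be mixed freely. The rounding happens when a result is stored in a float, which is why an equality comparison between a float and a double can fail.

            ```stata
            clear
            set obs 1
            generate float  x = 0.1
            generate double y = 0.1
            generate double total = x + y    // mixing storage types in an expression is allowed
            assert x != y                    // 0.1 stored as float differs from 0.1 stored as double
            assert x == float(y)             // rounding the double to float precision makes them equal
            ```

            So the storage type matters for what a variable can hold, not for whether variables of different types can appear in the same calculation.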


            • #7
              Originally posted by Nick Cox
              Are these really identifiers? Why 3 decimal places?
              Yes, these are CPSID, a person id record, from CPS-IPUMS data in the USA.
              There is no reason for 3 decimal places. It was an accident.

              Best regards,
              Jay Jeong.
