
  • 🐛 Stata silently failing on conversion of timestamps

    I am working with data generated using high-resolution clocks. It records the durations of certain events, measured in fractions of a second.
    I find that when I convert these values to a time format, Stata produces a missing value where I do not expect one.

    The following minimal example illustrates what is going on:
    Code:
    clear all
    version 15.0
    
    display clock("19:46:01","hms")
    display clock("19:46:01.1","hms")
    display clock("19:46:01.12","hms")
    display clock("19:46:01.123","hms")
    display clock("19:46:01.1234","hms")
    display clock("19:46:01.12345","hms")
    This results in the following execution log in Stata for Windows v15.0:
    Code:
    . clear all
    
    . version 15.0
    
    . 
    . display clock("19:46:01","hms")
    71161000
    
    . display clock("19:46:01.1","hms")
    71161100
    
    . display clock("19:46:01.12","hms")
    71161120
    
    . display clock("19:46:01.123","hms")
    71161123
    
    . display clock("19:46:01.1234","hms")
    .
    
    . display clock("19:46:01.12345","hms")
    .
    Milliseconds are more than enough for my analysis, so the storage type is fine. But I believe that if Stata can't handle precision finer than a millisecond, it should round the value up before the conversion rather than return a missing value for the whole timestamp.


    Thank you, Sergiy

  • #2
    I don't know what the discussion was at StataCorp about what should happen. But I would lean the other way: if input finer than milliseconds is allowed, I then would expect rounding down, which is what digital displays do in general, or so I believe.

    Either way, Stata's silent hint is that you need to decide on your own rounding for these stamps to be acceptable.
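
    A minimal sketch of doing the rounding down yourself, by truncating the string before it reaches clock(). It assumes the stamps sit in a string variable with a single decimal point; the names ts, dot, ts_ms and t are made up for illustration.

    ```stata
    clear
    input str20 ts
    "19:46:01"
    "19:46:01.123"
    "19:46:01.12345"
    end

    * position of the decimal point; 0 if the stamp has no fractional part
    generate int dot = strpos(ts, ".")

    * keep at most three digits after the point (rounds down by truncation)
    generate str20 ts_ms = cond(dot > 0, substr(ts, 1, dot + 3), ts)

    * datetime values need double storage
    generate double t = clock(ts_ms, "hms")
    format t %tcHH:MM:SS.sss
    list ts ts_ms t
    ```

    Truncating the string avoids any floating-point arithmetic on the fraction, so the result is exactly what clock() returns for the first three digits.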



    • #3
      Yes, and now that I know what to beware of, I will do the rounding. But this will be a surprise for a novice user who tries to process high-frequency or high-accuracy data.
      (see a somewhat related discussion here)

      We have seen a lot of questions about precision already, like in the following code:
      Code:
      clear all
      version 15.0
      
      set obs 1
      generate t=clock("2001-10-10 19:46:06","YMDhms")
      format t %tc
      list
      
      recast double t
      replace t=clock("2001-10-10 19:46:06","YMDhms")
      list

      Results in the following output:
      Code:
      . list
      
           +--------------------+
           |                  t |
           |--------------------|
        1. | 10oct2001 19:46:55 |
           +--------------------+
      
      . 
      . recast double t
      
      . replace t=clock("2001-10-10 19:46:06","YMDhms")
      (1 real change made)
      
      . list
      
           +--------------------+
           |                  t |
           |--------------------|
        1. | 10oct2001 19:46:06 |
           +--------------------+
      In the example above, the clock function knows nothing about where its result will be stored. In the case I am describing, however, it knows both of its arguments and should be able to decide how to make the most use of its inputs.

      Best, Sergiy



      • #4
        "it knows both of its arguments and should be able to decide how to make most use of its inputs."
        where preferably "make most use" coincides with the user's expectations? My preference is the missing value, rather than to allow me to think that the difference between two SIF times converted to SIF from strings with microseconds will be accurate to the microsecond. I think datetimes have a special place in precision discussions, because their values are of interest as differences more so than as a specific instant in history, and it behooves Stata to be sensitive to that.

        I could see an optional fourth argument to clock() - "round", "floor", "ceil", "missing" - instructing it precisely how to make the most of its input if that input is a timestamp more precise than datetime values support. That would be a real improvement over having to hack the string yourself. I suppose someone could even write such a function.
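
        As a rough sketch of what those four modes might compute, here is the same logic done outside clock() with ordinary generate statements. This is not an existing clock() option; the variable names are hypothetical, and the code assumes at most six fractional digits so that the fraction can be held as a whole number of microseconds.

        ```stata
        clear
        input str20 ts
        "19:46:01"
        "19:46:01.12345"
        "19:46:01.9996"
        end

        * position of the decimal point; 0 if there is no fractional part
        generate int dot = strpos(ts, ".")

        * whole seconds go through clock(); the fraction is handled separately
        generate double whole = clock(cond(dot > 0, substr(ts, 1, dot - 1), ts), "hms")

        * fraction as integer microseconds (assumes at most six fractional digits)
        generate double us = cond(dot > 0, round(1e6 * real("0" + substr(ts, dot, .))), 0)

        generate double t_floor = whole + floor(us / 1000)
        generate double t_round = whole + round(us / 1000)
        generate double t_ceil  = whole + ceil(us / 1000)
        * "missing" mode: reject anything finer than a millisecond
        generate double t_miss  = cond(mod(us, 1000) == 0, whole + us / 1000, .)

        format t_* %tcHH:MM:SS.sss
        list ts t_floor t_round t_ceil t_miss
        ```

        Converting the fraction to integer microseconds first keeps floor() and ceil() from being thrown off by binary representation error in values such as 0.123.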
        Last edited by William Lisowski; 05 Nov 2018, 13:54.

