Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • elapsed time between two dates based on clock(time1, "YMD hms")

    Dear reader,

    I've been struggling for months to find a solution to this kind of simple issue. Now I thought clock() was it. Well, it seems, it's not.
    goal:
    calculate difference of elapsed time (preferably in minutes) between two dates (startdate, datestamp - like "2016-12-14 19:04:35").

    method:
    Code:
    clear
    set obs 5
    gen id = _n
    
    //create date strings
    gen str20 startdate = ""
    gen str20 datestamp = ""
    
    //interview start
    replace startdate = "2016-12-14 19:04:35" if id == 1
    replace startdate = "2016-11-08 16:40:04" if id == 2
    replace startdate = "2016-11-24 19:47:30" if id == 3
    replace startdate = "2016-11-15 14:15:46" if id == 4
    replace startdate = "2016-11-25 21:01:22" if id == 5
    
    //interview end
    replace datestamp = "2016-12-14 19:04:53" if id == 1
    replace datestamp = "2016-11-08 16:40:17" if id == 2
    replace datestamp = "2016-11-24 19:48:03" if id == 3
    replace datestamp = "2016-11-15 14:16:19" if id == 4
    replace datestamp = "2016-11-25 21:04:54" if id == 5
    
    //convert to numeric vars
    gen startnum = clock(startdate, "YMD hms")
    fre startnum
    gen endnum = clock(datestamp, "YMD hms")
    fre endnum
    
    //calculate the actually wanted difference
    gen timediff = endnum - startnum
    fre timediff
    format timediff %tc
    list id timediff
    result:
    id 1-3 with timediff = 0 though there is a difference of some seconds!!!
    id 4 scores better (> 0), but not correct, nonetheless (actually < 1 min instead of 02:11 min)!!!
    id 5 should score 03:32 min, not just 02:11 min!!!

    Some idea, what's wrong?

    Thanks a lot for tips on that
    Franz
    Thank you for reading (and some reply)
    Using Stata 16.1
    Extractions (-dataex-) of the data I'm working with is impossible, sorry!

  • #2
    Nick Cox will probably chime in here at some point as this is related to one of his favourite/most hated topics: precision.

    By default, Stata stores numbers as floats (~7 digits). However, the number of elapsed milliseconds since 1960 Jan 1st 0:00 (I think?) is quite hard to capture in such a format. That's why dates and times should always be stored as doubles (~15 digits). You can do this by starting your dofile with set type double, or simply adding double between gen and the variable name.

    Code:
    gen double start = clock(startdate, "YMDhms")
    gen double end = clock(datestamp,"YMDhms")
    
    format start end %tc
    
    gen double timeElapsed = end - start
    gen double timeElapsedSeconds = timeElapsed/1000
    The code below gives me a time elapsed of resp. 18, 13, 33, 33 and 212 seconds. The same code without the double precision returns 0, 0, 0, 131.xx and 131.xx. Note also that the displayed start and end times are correct when you specify double, but are only approximately correct when you do not.

    Comment


    • #3
      so great, so easy. thank you Jesse (and Nick for insisting in precision - I thought, that much of it wouldn't be necessary at this point and Stata choses the right one out of the different numeric types)!! lesson learned
      Thank you for reading (and some reply)
      Using Stata 16.1
      Extractions (-dataex-) of the data I'm working with is impossible, sorry!

      Comment


      • #4
        I too have fallen for the float trap many times... (once I even merged completely unrelated firms to each other because their firm id had become identical due to precision errors)

        Comment


        • #5
          firm id had become identical due to precision errors
          ok, I see, there's some kernel of evil (in the imprecision, ugh)
          Thank you for reading (and some reply)
          Using Stata 16.1
          Extractions (-dataex-) of the data I'm working with is impossible, sorry!

          Comment

          Working...
          X