Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Categorizing time into time-of-day categories

    Hi. I have a time variable in a 24-hour HH:MM format and want to categorize it as - morning: <12:00; afternoon >=12:00 & <17:00; and, evening: >=17:00. However, my code below fails. Any help in doing so will be much appreciated

    Code:
    *code
    gen byte timeofday=.
    replace timeofday=1 if appt_time_of_day<12:00
    replace timeofday=2 if appt_time_of_day>=12 & appt_time_of_day<17
    replace timeofday=2 if appt_time_of_day>=17 
        
    
    *Data
    input double appt_time_of_day
    30600000
    32400000
    43200000
    53700000
    47700000
    29700000
    38700000
    35100000
    33300000
    39600000
    42300000
    37800000
    41400000
    34200000
    36900000
    28800000
    45000000
    36000000
    40500000
    35100000
    37800000
    39900000
    36000000
    42600000
    34200000
    36900000
    51300000
    50400000
    31500000
    40200000
    29700000
    28800000
    47700000
    52200000
    48600000

  • #2
    Despite its variable name, your variable appt_time_of_day is a date-time with units ms (milliseconds) measured from the start of 1960.

    This and almost everything else you need to know here follows from reading
    Code:
    help datetime
    I often just work from first principles here, but there are convenience functions documented there that cut down on typing.

    I added one more observation to your data example for a reason that will become apparent.

    Code:
    clear 
    input double appt_time_of_day
    30600000
    32400000
    43200000
    53700000
    47700000
    29700000
    38700000
    35100000
    33300000
    39600000
    42300000
    37800000
    41400000
    34200000
    36900000
    28800000
    45000000
    36000000
    40500000
    35100000
    37800000
    39900000
    36000000
    42600000
    34200000
    36900000
    51300000
    50400000
    31500000
    40200000
    29700000
    28800000
    47700000
    52200000
    48600000
    end 
    
    insobs 1
    replace appt_time_of_day = clock("22:00:00", "hms") in L 
    
    
    gen double time_of_day_ms = mod(appt_time_of_day, 24 * 60 * 60000) 
    
    format *time* %tcHH:MM:SS 
    
    gen wanted = cond(time_of_day_ms < 12 * 60 * 60000, 1, cond(time_of_day_ms < 17 * 60 * 60000, 2, 3)) if time_of_day_ms < . 
    
    gen WANTED = cond(hh(appt_time_of_day) < 12, 1, cond(hh(appt_time_of_day) < 17, 2, 3)) if time_of_day_ms < . 
    
    sort time_of_day_ms 
    
    list, sepby(wanted)
    
     
    
    
         +---------------------------------------+
         | appt_t~y   time_o~s   wanted   WANTED |
         |---------------------------------------|
      1. | 08:00:00   08:00:00        1        1 |
      2. | 08:00:00   08:00:00        1        1 |
      3. | 08:15:00   08:15:00        1        1 |
      4. | 08:15:00   08:15:00        1        1 |
      5. | 08:30:00   08:30:00        1        1 |
      6. | 08:45:00   08:45:00        1        1 |
      7. | 09:00:00   09:00:00        1        1 |
      8. | 09:15:00   09:15:00        1        1 |
      9. | 09:30:00   09:30:00        1        1 |
     10. | 09:30:00   09:30:00        1        1 |
     11. | 09:45:00   09:45:00        1        1 |
     12. | 09:45:00   09:45:00        1        1 |
     13. | 10:00:00   10:00:00        1        1 |
     14. | 10:00:00   10:00:00        1        1 |
     15. | 10:15:00   10:15:00        1        1 |
     16. | 10:15:00   10:15:00        1        1 |
     17. | 10:30:00   10:30:00        1        1 |
     18. | 10:30:00   10:30:00        1        1 |
     19. | 10:45:00   10:45:00        1        1 |
     20. | 11:00:00   11:00:00        1        1 |
     21. | 11:05:00   11:05:00        1        1 |
     22. | 11:10:00   11:10:00        1        1 |
     23. | 11:15:00   11:15:00        1        1 |
     24. | 11:30:00   11:30:00        1        1 |
     25. | 11:45:00   11:45:00        1        1 |
     26. | 11:50:00   11:50:00        1        1 |
         |---------------------------------------|
     27. | 12:00:00   12:00:00        2        2 |
     28. | 12:30:00   12:30:00        2        2 |
     29. | 13:15:00   13:15:00        2        2 |
     30. | 13:15:00   13:15:00        2        2 |
     31. | 13:30:00   13:30:00        2        2 |
     32. | 14:00:00   14:00:00        2        2 |
     33. | 14:15:00   14:15:00        2        2 |
     34. | 14:30:00   14:30:00        2        2 |
     35. | 14:55:00   14:55:00        2        2 |
         |---------------------------------------|
     36. | 22:00:00   22:00:00        3        3 |
         +---------------------------------------+
    .
    Your code failed for these reasons:

    0. Possibly you are being misled by someone's poor choice of variable name.

    1. Stata can't divine your intent of extracting time of day information without explicit syntax.

    2. In any case comparing units of hours and units of ms needs a conversion factor or functions or commands devised for the purpose.

    3. Colon separators within numeric values are illegal.

    I hope that helps.

    Comment


    • #3
      Since the cutpoints you want to use to divide the time of day into these three parts are all exactly on an hour, the easiest way to do this is:
      Code:
      gen timeofday = 1 if hh(appt_time_of_day) < 12
      replace timeofday = 2 if inrange(hh(appt_time_of_day), 12, 16)
      replace timeofday = 3 if hh(appt_time_of_day) >= 17
      Alternatively, an approach that could be used with cutpoints that don't exactly fall on an hour, and is more in the spirit of what you tried to do:
      Code:
      gen timeofday = 1 if appt_time_of_day < tc(12:00)
      replace timeofday = 2 if appt_time_of_day >= tc(12:00) & appt_time_of_day < tc(17:00)
      replace timeofday = 3 if appt_time_of_day >= tc(17:00)

      Your code doesn't work because the numerical representations of 12:00 and 17:00 in Stata are not 12 and 17, and because, by itself, 12:00 is not legal syntax in your first -replace- command. When working with specified dates and times, although it is possible to refer to them in code using the actual numerical representations (which, for 12:00 and 17:00 are 43200000, and 61200000, respectively), it is difficult to do so. It requires calculating or knowing the numbers needed, and the resulting code is, in any case, unreadable to people who are not immediately able to translate those numbers back to dates and times.) So it is always better to refer to specific dates and times using Stata's numerous datetime functions, which are easily understandable and accessible to humans but provide the correct numerical values to Stata.

      Comment


      • #4
        On this occasion it seems that I posted about 30000 ms before Clyde Schechter

        Comment


        • #5
          Clyde Schechter and Nick Cox - Thank you, both. These greatly helped me understand the date-time functions in stata

          Comment

          Working...
          X