Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Unifying and formatting multiple date formats

    Hi all,
    I always have trouble with dates in Stata and I am wondering if someone can help me unify them and create variables for month, day, year, hour, and minute.

    Here are examples of the multiple formats--

    2019-10-09 23:09:01
    5/16/2021 21:52
    Tue, 27 Apr 2021 09:28:55

    Here is the code I have been trying but does not work--

    gen date_obs = clock(time, "MD20Yhm")
    replace date_obs = clock(time, "MD20Yhm")
    format date_obs %16.0f
    gen month = month(dofc(date_obs))
    gen day = day(dofc(date_obs))
    gen year = year(dofc(date_obs))
    gen hour = hh(date_obs)
    gen minute = mm(date_obs)
    gen sec = ss(date_obs)
    tab month year
    tab hour
    sort time

    Any help would be much appreciated! Thank you!

  • #2
    There is no white magic beyond having code that copes with all the formats you have. You're going to need code that copes with all the formats you have, with a simple strategy that if a pattern doesn't work. you try another.


    Code:
    clear 
    input str42 probdate 
    "2019-10-09 23:09:01"
    "5/16/2021 21:52"
    "Tue, 27 Apr 2021 09:28:55"
    end 
    
    gen time = word(probdate, -1)
    
    gen sdate  = trim(subinstr(probdate, time, "", .)) 
    
    foreach dow in `c(Wdays)' `c(Weekdays)' { 
        replace sdate = subinstr(sdate, "`dow'", "", .)
    }
    
    replace sdate = subinstr(sdate, ",", "", .) 
    
    gen ndate = daily(sdate, "YMD") 
    replace ndate = daily(sdate, "MDY") if missing(ndate)
    replace ndate = daily(sdate, "DMY") if missing(ndate)
    
    format ndate %td 
    
    gen double datetime = ndate * 60000 * 60 * 24 + clock(time, "hms")
    replace datetime = ndate * 60000 * 60 * 24 + clock(time, "hm") if missing(datetime)
    
    format datetime %tc 
    
    list time-datetime
    
         +----------------------------------------------------------+
         |     time          sdate       ndate             datetime |
         |----------------------------------------------------------|
      1. | 23:09:01     2019-10-09   09oct2019   09oct2019 23:09:01 |
      2. |    21:52      5/16/2021   16may2021   16may2021 21:52:00 |
      3. | 09:28:55    27 Apr 2021   27apr2021   27apr2021 09:28:55 |
         +----------------------------------------------------------+

    Comment

    Working...
    X