Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • standardizing dates across datasets using capture command

    I have a handful of datasets, all with a variable called "date_of_survey", which is a string, and I need to convert them all to date variables. However, some of them are formatted as 2022-03-24 and some are 24mar2022. Rather than pulling each dataset and using a date() command with the proper arguments, I am trying to learn how to use the capture command and the _rc scalar (which is equal to 109 following a type mismatch), to expedite this process. I am trying to execute something like this:
    HTML Code:
    foreach sur in survey1 survey2 survey3 {
       use `sur', clear
       capture gen date = date(date_of_survey, "YMD")
       if c(rc) == 109 {
          gen date = date(date_of_survey, "DMY")
       }
       else if c(rc) == 0 {
          gen date = date(date_of_survey, "YMD")
       }
    }
    I continue to get type mismatch errors when I run the loop. Also, a few of my attempts have resulted in a "date variable already defined" message, which might be because I generate the date with the third line of code and then generate it again in the else if statement? Not sure. I can always just drop the date in the else if statement or leave the statement out altogether, but ultimately, that isn't my main problem.

    Any help would be greatly appreciated! I'm relatively comfortable working with dates, but the capture and c(rc) business is a new (but quite helpful) trick for me.
    Last edited by Brice Karickhoff; 24 Mar 2022, 09:43.

  • #2
    capture gen date = date(date_of_survey, "YMD")
    if c(rc) == 109 {
    Why would you expect generate to issue an error code in this instance? If the date is DMY and you specify YMD, a missing value will be generated. Presumably, you want something like:

    Code:
    input str15 date_of_survey
    "2009-05-27"
    "2012-09-01"
    end
    cap assert !missing(date(date_of_survey[1], "DMY"))
    local format= cond(!c(rc), "DMY", "YMD")
    gen date = date(date_of_survey, "`format'")
    Res.:

    Code:
    . gen date = date(date_of_survey, "`format'")
    
    . l
    
         +--------------------+
         | date_of_~y    date |
         |--------------------|
      1. | 2009-05-27   18044 |
      2. | 2012-09-01   19237 |
         +--------------------+
    I continue to get type mismatch errors when I run the loop.
    The type mismatch errors often relate to mixing strings and reals. The date function expects a string input.

    Comment

    Working...
    X