  • Extracting month and year from a date variable

    Hello Statalist,

    I imported an Excel dataset that had a timestamp variable (de_date), and Stata seems to have read it as a date variable: the storage type is double and the format is %tcnn/dd/ccYY_hh:MM. I am trying to extract the year and month, but the new variables contain nothing but missing values. I'm sure I'm missing an easy yet crucial step.

    my code:
    gen de_month = month(de_date)
    gen de_year = year(de_date)

    Thanks in advance for your help.

  • #2
    Jolene,

    For reasons that are not entirely clear, the Stata month() and year() functions do not work with datetime variables, just date variables (you have a datetime variable). To solve that problem you can convert the datetime variable to a date variable:

    Code:
    gen de_month = month(dofc(de_date))
    gen de_year = year(dofc(de_date))
    Alternatively, you can use the dofc() function to create a new date variable and use that thereafter.
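
A minimal sketch of that alternative, assuming the datetime variable is named de_date as in post #1. The two-observation toy dataset and the name de_day are made up for illustration only.

```stata
* Toy data standing in for the imported Excel file (illustrative values)
clear
set obs 2
gen double de_date = clock("15mar2014 10:30:00", "DMYhms") in 1
replace de_date = clock("02nov2015 08:05:00", "DMYhms") in 2
format de_date %tc

* de_day is a hypothetical name for the new daily date variable
gen de_day = dofc(de_date)
format de_day %td

* month() and year() now receive a daily date, as they expect
gen de_month = month(de_day)
gen de_year = year(de_day)
list
```

Keeping de_day around is handy if other daily-date functions are needed later, since the conversion is then done only once.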

    Regards,
    Joe

    • #3
      month() and year() are for extracting month and year from daily dates, including daily date variables. Any other application would require each function to guess what the user intends (in this case, that conversion from clock date-times to daily dates is needed first), which is not good design for a function. That was the original intention, and it remains unchanged by the introduction of date-times in Stata 10.
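
To see why the original code returned missing values, it may help to look at the numbers involved. The sketch below uses one made-up timestamp; the scalar name t is hypothetical.

```stata
* Illustrative value only: 15 March 2014, 10:30 as a clock date-time
scalar t = clock("15mar2014 10:30:00", "DMYhms")

* A clock value counts milliseconds since 01jan1960 00:00:00
display %20.0f t

* dofc() converts it to a daily date: days since 01jan1960
display dofc(t)
display %td dofc(t)

* year() reads its argument as a count of days; a millisecond count is far
* outside the range of valid daily dates, so the result is missing
display year(t)

* After conversion, both functions return what was wanted
display year(dofc(t))
display month(dofc(t))
```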

      However, StataCorp sets a bad example, IMO, by repeatedly referring to date variables, tout court, as if the qualifier "daily" were redundant. The defence for this would, I imagine, be twofold:

      1. "date" in common parlance usually, or most frequently, means "daily date".

      2. Historically daily dates were the first kind of date to be given special treatment in Stata.

      Against that, I don't think it's fair to expect new users to know the history of Stata or older users to remember it. Further, who can be confident what is evident from an English word? Much confusion would be avoided if StataCorp were always to specify "daily date", not "date", when that is meant. I suspect that often users are imagining that date variable is a generic term and that functions and commands adjust according to the kind of date involved.

      A related point is that in calculations (as opposed to presentations) the display format is typically (*) ignored by Stata. How things are displayed is cosmetic only, just as I am not defined by wearing a reddish jumper at this instant. That is, at least in this case, Stata will pay absolutely no attention to the display format %tc.

      (*) The exceptions don't affect the story here.
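
The point about display formats can be checked directly. This is only a sketch on a one-observation toy dataset; the variable names t, d, y_wrong, y_td and y_raw are made up for the illustration.

```stata
clear
set obs 1
gen double t = clock("15mar2014 10:30:00", "DMYhms")
format t %tc

* Giving the clock variable a daily format changes nothing that is stored:
* t still holds milliseconds, so year() still returns missing
format t %td
gen y_wrong = year(t)

* Conversely, a genuine daily date works whatever its display format
gen d = dofc(t)
format d %td
gen y_td = year(d)
format d %9.0g
gen y_raw = year(d)

list y_wrong y_td y_raw
```

In other words, what matters to year() is the stored number and its units, not the format attached to the variable.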
