  • Convert string variable to a date variable

    Dear Stata Users

    I have a string variable yearmonth: for Jan 2004 it's stored as "200401". I need to convert it to a readable year/month format. I have used the following code:

    Code:
    generate month = date( yearmonth ,"YM")
    format %tm month
    The code above produces a strange date variable.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 yearmonth
    "200401"
    "200402"
    "200403"
    "200404"
    "200405"
    "200406"
    "200407"
    "200408"
    "200409"
    "200410"
    "200411"
    "200412"
    end
    How can I alter the code to get a meaningful year/month variable?
    Thank you.

  • #2
    Dear Olena, You can use Nick's (ssc install) numdate command.
    Code:
    numdate monthly ym = yearmonth, p(YM)
    But, I am curious why the following code doesn't work?
    Code:
    gen ym = monthly(yearmonth, "YM")
    format ym %tm
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)


    • #3
      River

      Thank you. numdate resolved the problem.


      • #4
        But, I am curious why the following code doesn't work?
        Code:
        gen ym = monthly(yearmonth, "YM")
        format ym %tm
        That is an excellent question. The definitive discussion would be in the output of help datetime_translation and, of course, the linked PDF documentation. That tells us:

        Translating run-together dates, such as 20060125

        The translation functions will translate dates and times that are run together, such as
        20060125, 060125, and 20060125110215 (which is 25jan2006 11:02:15). You do not have to do
        anything special to translate them:

        . display %d date("20060125", "YMD")
        25jan2006

        . display %td date("060125", "20YMD")
        25jan2006

        . display %tc clock("20060125110215", "YMDhms")
        25jan2006 11:02:15
        Yet, with that said:
        Code:
        . generate sdate = "200401"
        
        . generate date = monthly(sdate,"YM")
        (1 missing value generated)
        
        . replace date = mofd(daily(sdate+"01","YMD"))
        (1 real change made)
        
        . format date %tm
        
        . list
        
             +-----------------+
             |  sdate     date |
             |-----------------|
          1. | 200401   2004m1 |
             +-----------------+
        I would have expected both to work, or neither.

        To me this seems to be a bug.
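
        For the data in #1, here is a minimal sketch of three workarounds that avoid passing a run-together string to monthly(). The variable names ym1, ym2, and ym3 are made up for illustration, and the third approach assumes that monthly() parses the string once a space separates year and month.

        ```stata
        * Sketch only: three ways to turn a run-together "YYYYMM" string into a monthly date
        clear
        input str6 yearmonth
        "200401"
        "200402"
        "200412"
        end

        * 1. Append a day, parse as a daily date, then convert to a monthly date
        generate ym1 = mofd(daily(yearmonth + "01", "YMD"))

        * 2. Split the string into numeric year and month and combine them with ym()
        generate ym2 = ym(real(substr(yearmonth, 1, 4)), real(substr(yearmonth, 5, 2)))

        * 3. Insert a space so that the string is no longer run together
        generate ym3 = monthly(substr(yearmonth, 1, 4) + " " + substr(yearmonth, 5, 2), "YM")

        format ym1 ym2 ym3 %tm
        list
        ```

        The first approach is the one shown in the log above; the second relies only on string and numeric functions, so it does not depend on how any translation function handles run-together input.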


        • #5
          Dear William, Thanks for the explanation/example.
          Ho-Chuan (River) Huang
          Stata 19.0, MP(4)


          • #6
            monthly() is not as smart as daily(), as often mentioned here.

            Note that the central fallacy in #1 is using date() -- which is explicitly for producing daily dates, whereas monthly dates are desired here.

            I wish StataCorp would undocument date() in favour of daily(). It is too easy to imagine that date() is a generic function for producing any kind of date.

            Or, date() is made truly general with three arguments allowed.

            date(input string, format [, desired date kind if not daily date])

            which is what numdate is trying to do with command syntax.
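
            To illustrate why a date() result looks strange under %tm, here is a small sketch. The scalar name dailydate is made up for illustration. date() returns days since 01jan1960, whereas %tm interprets its argument as months since 1960m1, so the same number means something quite different under each format.

            ```stata
            * Sketch only: the same number read as a daily date and as a monthly date
            scalar dailydate = date("01jan2004", "DMY")   // days since 01jan1960

            display %td dailydate          // read correctly as a daily date: 01jan2004
            display %tm dailydate          // the day count misread as a count of months
            display %tm mofd(dailydate)    // converted to a monthly date first: 2004m1
            ```

            If the conversion with mofd() is omitted, the %tm display lands many centuries in the future, which is one way a "strange date variable" can arise.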


            • #7
              Originally posted by Nick Cox
              monthly() is not as smart as daily(), as often mentioned here.
              Agreed, and that was going to be my response in post #4, but when I looked for documentation to support that assertion, I instead found the material in help datetime translation which suggested otherwise.

              The translation functions will translate dates and times that are run together, ...
              I have suggested to Stata Technical Services the following.

              That is at best misleading, although I would consider "misleading documentation" to be a bug - the software may work as intended, but for the user, it does not do what it is documented to do.
              • By lack of any qualification, "The translation functions" (note the plural) suggests to the reader "all" translation functions, not "Some translation functions".
              • Common usage in English is that a "monthly date" is a "date". Indeed, -help datetime- tells us a "monthly date" is a "type of date", and it also tells us "date" is a "type of date".
              Is it any wonder the reader looking for help is confused as to whether a run-together "date" refers to a representation of any "type of date" or instead is limited to a "date" "type of date"?

              The documentation for run-together values should be clarified - this is an ongoing source of confusion on Statalist.
              "The translation functions date(), daily(), clock(), and Clock() will translate dates and times that are run together, .... Other translation functions do not do so."
              Of course Nick will recognize this as in part a point he has long since raised with StataCorp. With "yearly date" and "monthly date" etc. available, it makes no sense - in the English language - to have the term "date" used for something other than the superset of all types of dates. Yet reading their documentation I can say that Stata tells us "date is a particular type of date" - which I think risks running afoul of Russell's paradox.

              The right solution, as Nick has often pointed out here, is to qualify "date" as "daily date" and preferably to change documentation examples - at least for help datetime and its derivatives - to use daily() where they now use date(). Then they could write "The translation functions will translate daily dates and times that are run together, ..." although I think it best to explicitly list those that do.
