Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Conversion of literally written dates into date format

    Conversion of literally written dates into date format
    Am using STATA 16 on Windows.

    I would like to convert

    date
    DEC-06-2018
    11TH MAY 2019
    23RD JANUARY 2018

    into date format in STATA

    I tried to 1s use
    gen date2 = subinstr(date, "JANUARY", "JAN",.)
    replace date2 = subinstr(date, "FEBRUARY", "FEB",.)
    to shorten the months

    but the 1st works but the second reverts JANUARY to JAN

    and I have no clue as how to drop the TH or RD from 11th or 23rd respectively

  • #2
    The reversion of JANUARY to JAN with your second command is because you told Stata to -replace- date2 with something calculated from date. What you need to do is
    Code:
    replace date2 = subinstr(date2, "FEBRUARY", "FEB", .)
    That said, I don't quite get why you are trying to do this. Ultimately, dates represented as string variables in Stata are pretty much useless. If you are going to do any date calculations, or any operations that require chronological ordering of the data, then you need to convert the string into a Stata internal format numeric date variable. To do that, there is no need to abbreviate JANUARY to JAN. The -daily()- function that carries out this function works equally well with full month names and with the standard abbreviations. The only obstacle to the conversion here is the occurrence of ordinal suffixes ("TH", "RD") on the day portion. So I would handler the whole thing this way:
    Code:
    // REMOVE ORDINAL SUFFIXES
    foreach os in ST ND RD TH {
        replace date = subinstr(date, "`os'", "", .)
    }
    
    // CREATE NUMERIC DATE VARIABLE
    gen _date = daily(date, "MDY")
    replace _date = daily(date, "DMY") if missing(_date)
    format _date %td
    drop date
    rename _date date
    list, noobs clean
    In the future, when showing data examples, please use the -dataex- command to do so, a I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      This works for your data example. You may need more cleaning for your full dataset.

      Code:
      lear
      input str42 messydate
      "DEC-06-2018"
      "11TH MAY 2019"
      "23RD JANUARY 2018"
      end
      
      foreach bad in st nd rd th {
          replace messydate = subinstr(lower(messydate), "`bad'", "", .)
      }
      
      gen ddate = daily(messydate, "MDY")
      replace ddate = daily(messydate, "DMY") if missing(ddate)
      
      format ddate %td
      
      list
      
           +-----------------------------+
           |       messydate       ddate |
           |-----------------------------|
        1. |     dec-06-2018   06dec2018 |
        2. |     11 may 2019   11may2019 |
        3. | 23 january 2018   23jan2018 |
           +-----------------------------+
      PS See #2

      If he's Clyde, am I Bonnie? [works better in Scottish]

      Comment

      Working...
      X