Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • transforming date variable

    Hello! I am working with a dataset that includes a date variable with the data stored as a string variable in the following format: MM-DD-YYYY 0:00. I am interested in dropping the timestamp and converting to a numeric variable that will allow me to sort the date fields. Can anyone share code or helpful tips to assist with this process? Thanks!

  • #2
    Code:
    gen wanted = daily(substr(string_date_variable, 1, 10), "MDY")
    format wanted %td

    Comment


    • #3
      Thank you! I noticed that this code didn't transform some of the date variables for dates that only include one number for the month field in the original string variable (i.e., 4/8/2025). Any advice?
      Last edited by Lauren Leviton; 08 Apr 2025, 18:54.

      Comment


      • #4
        OK, this is why instead of answering your post, I should have asked you to post back with example data. Your description was inaccurate: 4/8/2025 does not adhere to the formatting you claimed: MM-DD-YYYY. Consequently, when confronted with "4/8/2025 0:00", the -substr(string_date_variable, 1, 10)- expression picked up 4/8/2025 plus the following space plus the first 0 of 0:00, hence "4/8/2025 0". That does not parse as "MDY", so Stata returns a missing value there. Apparently, your dates are actually somewhat free form, the only consistency being the order MDY. So the code has to account for that:
        Code:
        gen breakpoint = strpos(trim(string_date_variable), " ")
        gen wanted = daily(substr(trim(string_date_variable), 1, breakpoint-1), "MDY")
        format wanted %td
        Note: Given that the string date variable does not actually adhere to any single format, I have also considered the possibility that there may be leading blanks in some of the values--that would also break the original code. That is why -trim()- has made its appearance. The code shown above should work with any string date that is a valid calendar date in month-day-year order regardless of the number of digits in the month and day, and possibly with extra blanks at the beginning or end, so long as there is some delimiter other than a space separating the month, day, and year from each other.

        Even when verbal descriptions of data are accurate, they are often incomplete or ambiguous, and people rarely think to provide the metadata, which is sometimes critical to getting the code right. So for future posting, be sure to read the Forum FAQ for excellent advice on how to get the most out of your Statalist experience. Whenever you need help with code, post example data. Use the -dataex- command to do that. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        Comment


        • #5
          Thank you, Clyde. I will follow your advice for future posts. Appreciate the support.

          Comment


          • #6
            Code:
             
             gen wanted = daily(word(string_date_variable, 1), "MDY") 
            might work too,

            Comment


            • #7
              As might
              Code:
              gen wanted = daily(string_date_variable, "MDY#")
              As an example:

              Code:
              clear
              input str20 date_str
              "03-15-2015 4:50"
              "3/15/2019 0:00"
              "12-3-2018 00:00"
              end
              
              gen wanted = daily(date_str, "MDY#")
              format %td wanted
              produces:

              Code:
              . list, noobs
              
                +-----------------------------+
                |        date_str      wanted |
                |-----------------------------|
                | 03-15-2015 4:50   15mar2015 |
                |  3/15/2019 0:00   15mar2019 |
                | 12-3-2018 00:00   03dec2018 |
                +-----------------------------+
              Last edited by Hemanshu Kumar; 09 Apr 2025, 01:29.

              Comment

              Working...
              X