Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help---convert numbers to time & generate a "date time" variable

    Dear all,

    It's a pleasure to exchange good ideas with you through the forum. It's really helpful!

    Recently, I am trying to figure out two problems related to date and time using STATA 14.0MP. First, currently, all the hours in my dataset are shown as "1,2, ....24" since I used "reshape" to deal with them. And I want to replace 1 with 00:00:00, 24 with 23:00:00. I tried the "replace" but it doesn't work since it always hints me with "type mismatch". Second, I have another three columns showing the date data (year, month, day). I also want to generate a new variable and it looks like 2010-01-01 00:00:00. I tried "generate datetime=clock("Year-Month-Date Time", "YMD hms") but it still doesn't work.

    If someone can help me with the two problems, I will be really grateful. Thank you in advance!

    Best,

    Mengrou

  • #2
    Please read and act on https://www.statalist.org/forums/help#stata and show a data example using dataex

    Comment


    • #3
      Mengrou WANG ,

      I never use hours and minutes in Stata, so you will either need to look those up or someone else can come along, but to combine 3 numeric variables to create a date can be accomplished like this:

      Code:
      * I created some example data
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int year byte(month day)
      2007  5  1
      2009  8  2
      2003  9 14
      2006  1 21
      2008  5 22
      2008  5 29
      2013  1 13
      2013  8 16
      2000 11 27
      2000 12 10
      2004 11 14
      end
      Code:
      * This assumes month, day, & year are *numeric* variables 
      gen date = mdy(month, day, year)  // this is what actually combines the 3 variables into a single date
      format date %td
      clonevar date2 = date
      format date2 %tdCY-N-D  // format as 2007-05-01
      gen date3 = date   // note that date3 is not formatted as a date.
      
      . desc
      
      Contains data
        obs:            11                          
       vars:             6                          
       size:           176                          
      ---------------------------------------------------------------------------------------------------------------------------------------------
                    storage   display    value
      variable name   type    format     label      variable label
      ---------------------------------------------------------------------------------------------------------------------------------------------
      year            int     %8.0g                 Year
      month           byte    %8.0g                 Month
      day             byte    %8.0g                 Day
      date            float   %td                  
      date2           float   %tdCY-N-D            
      date3           float   %9.0g                
      ---------------------------------------------------------------------------------------------------------------------------------------------
      
      
      . list, noobs
      
        +-----------------------------------------------------+
        | year   month   day        date        date2   date3 |
        |-----------------------------------------------------|
        | 2007       5     1   01may2007   2007-05-01   17287 |
        | 2009       8     2   02aug2009   2009-08-02   18111 |
        | 2003       9    14   14sep2003   2003-09-14   15962 |
        | 2006       1    21   21jan2006   2006-01-21   16822 |
        | 2008       5    22   22may2008   2008-05-22   17674 |
        |-----------------------------------------------------|
        | 2008       5    29   29may2008   2008-05-29   17681 |
        | 2013       1    13   13jan2013   2013-01-13   19371 |
        | 2013       8    16   16aug2013   2013-08-16   19586 |
        | 2000      11    27   27nov2000   2000-11-27   14941 |
        | 2000      12    10   10dec2000   2000-12-10   14954 |
        |-----------------------------------------------------|
        | 2004      11    14   14nov2004   2004-11-14   16389 |
        +-----------------------------------------------------+
      For help with adding the hour to the date, see add hours to date variable
      You might also take a look at the user-contributed numdate (SSC install numdate)

      Comment


      • #4
        First, currently, all the hours in my dataset are shown as "1,2, ....24" since I used "reshape" to deal with them. And I want to replace 1 with 00:00:00, 24 with 23:00:00.
        Well, as Stata has complained to you, "23:00:00" and the like are string variables, so you cannot -replace- numbers like 1 through 24 with those values. Nor should you want to. A string variable with values like "01:00:00" through "23:00:00" will be close to useless for any kind of calculations you want to do. What you need instead is a Stata internal time variable that is numerically coded and, for your optical convenience, given a display format that looks like "hh:mm:ss" Assuming your existing variable that goes 1 through 24 is called hr, this will do it:
        Code:
        gen double time_variable = msofhours(hr-1)
        format time_variable %tcHH:MM:SS
        Second, I have another three columns showing the date data (year, month, day). I also want to generate a new variable and it looks like 2010-01-01 00:00:00. I tried "generate datetime=clock("Year-Month-Date Time", "YMD hms") but it still doesn't work.
        This can be done with
        Code:
        gen double datetime = mdyhms(month, day, year, hr-1, 0, 0)
        format datetime %tc
        N.B.: The -double- is absolutely critical. If you omit it, your datetime variable will be incorrect.

        If dealing with dates and times is not something you will be doing in the future, it is fine to just appropriate this code and use it. But if this sort of thing will pop up again in your line of work, you should invest time in running -help datetime- and then clicking on the blue link to the PDF manual near the top of that page. Read the whole thing! It's long and complicated, but you will never become proficient at working with dates and times unless you plough through it. Don't worry if you don't remember every detail after you have read it. The key is to understand how Stata represents dates and times, the various functions for creating them from various types of date and time data that you may find in your data set, ways of extracting components from Stata date time variables, and how to format them in ways that humans can read. Most important: never confuse the display formatting (what you see) with the actual internal representation of the date and time (what Stata works with).

        Finally, a terminology point: Stata data sets do not have rows and columns. They have observations and variables, respectively. Rows and columns are ways to think about spreadsheets. Stata is not a spreadsheet, and the sooner you get out of the habit of thinking about Stata as if it were, the sooner you will be able to understand Stata's distinctive approach to data management and analysis and use Stata effectively. You can speed up that process by deliberately avoiding the use of spreadsheet terminology when thinking and talking about Stata data sets.

        Comment


        • #5
          Dear Nick, David and Clyde,

          Sorry for the late response and many thanks for your reminding and explanation!. They are very helpful and they work!
          Actually, I also read some instructions and found that I can use the following codes to deal with the two problems.

          First, to destring 00:00:00 and replace it with 1, I can use:
          gen double Time=clock(time,"hms")
          format Time %tcHH:MM:SS
          egen hour=group(Time)

          Second, to generate the DateTime, I can use:
          gen Date=mdy(month,day,year)
          format Date %td
          gen double DateTime=dhms(Date,hh(Time),mm(Time),ss(Time))
          format DateTime %tcCCYY.NN.DD_HH:MM:SS ///format it as 2010.01.01 00:00:00

          Thank you all! It's really nice and helpful to consult you from this forum! Have a great holiday!

          Best regards!

          Mengrou

          Comment


          • #6
            Dear Cylde and David how then do I convert a date like the following : 2007-05-01 to only year. I want to keep only the year. Thank you.

            Comment


            • #7

              #6

              If you have a string variable then

              Code:
              gen year = real(substr(whatever, 1, 4))
              may help. Naturally you need to give your variable name if it is not -whatever-.

              If that doesn’t help please read FAQ Advice #12 and show us a data example using -dataex-.

              Comment


              • #8
                This is so cool,..was struggling with the same thing, and combined the insights given in these solutions and worked extremely well. In my case i had three separate variables, each recording only one element of time: Day, month and year, so i used the following code

                gen byte time_var = mdy(month_var, day_var, year_var)
                format time_var %td

                And it worked perfectly well

                Thanks everyone

                Comment

                Working...
                X