Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Dates and Time calculation

    Hi All,

    Below is the data I got from a collaborator, column 3 is the combined data with both dates and time., however, it's a string. I want to calculate the length of stay by using dischargetime - combinedarrivaltime. I have trouble with this because those 2 columns are in different formats. Stata gives me some weird number. I'm not familiar with dates formats in stata. Could anyone give me some clue? Thanks so much!
    Click image for larger version

Name:	Capture.JPG
Views:	1
Size:	29.8 KB
ID:	1452102

  • #2
    Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

    All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

    Perhaps if you post usable data, others will be more likely to provide usable code for you. Please be sure to use the dataex command to show your example data. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use dataex.

    Comment


    • #3
      Please use dataex in the future - it helps people help you.

      I tryed creating your dataset:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float arrivaldate str4 arrivaltime str12 combarrival str18 dischargetime
      19783 "4:27" "3/1/14 16:27" "07mar2014 14:29:59"
      19724 "5:59" "1/1/14 17:59" "04jan2014 18:40:00"
      19726 "3:49" "1/3/14 03:49" "31jan2014 14:26:00"
      19726 "4:54" "1/3/14 04:54" "06jan2014 19:36:00"
      19727 "1:03" "1/4/14 01:03" "10jan2014 18:14:59"
      19728 "1:40" "1/5/14 01:40" "07jan2014 13:38:00"
      end
      format %td arrivaldate
      However, when converting your discharge/arrival time to the appropriate format, I get slightly different data. I won't spend time trying to fix it. If you really want to find a fix for it, type - help datetime - on your Stata. Going to your question, which was how to calculate length of stay.

      Code:
      gen discharge = clock(dischargetime, "DMY hms")
      format discharge %tc
      gen arrival = clock(combarrival, "MD20Y hm")
      format arrival %tc
      gen daysdif = hours(discharge - arrival)/24
      Please consider sharing data with dataex in the future.

      Best;

      Comment


      • #4
        From help datetime:
        Warning: To prevent loss of precision, datetime SIFs must be stored as doubles

        Comment


        • #5
          Good to know Robert! Thanks

          Comment


          • #6
            Thanks everyone! That works perfectly. Also thanks for the suggestions for useful learning materials and how to post code on Statalist.

            Comment


            • #7
              I have a data that looks like below, here variable start=survey starttime , variable end= survey endtime. I need to generate a variable which will give survey duration in hours.

              input str3 queationnair_no str22(start end)
              "310" "17/11/2019 03:15:56 PM" "17/11/2019 03:58:03 PM"
              "302" "16/11/2019 11:02:43 AM" "16/11/2019 11:37:47 AM"
              "303" "16/11/2019 11:46:43 AM" "16/11/2019 12:20:38 PM"
              "312" "17/11/2019 04:52:14 PM" "17/11/2019 05:19:46 PM"
              "311" "17/11/2019 04:10:12 PM" "17/11/2019 04:39:19 PM"
              "309" "17/11/2019 12:45:46 PM" "17/11/2019 01:07:02 PM"
              "306" "17/11/2019 11:05:34 AM" "17/11/2019 11:29:21 AM"
              "301" "16/11/2019 09:25:32 AM" "16/11/2019 10:42:29 AM"

              After using following code

              gen double start_time1 = clock(start, "DMYhms")
              format start_time1 %tc

              gen double end_time1 = clock(end, "DMYhms")
              format end_time1 %tc

              gen double survey_time1=hours(end_time1-start_time1)

              gen double survey_time=end_time1-start_time1
              format survey_time %tc

              The output I get is as follows;

              survey_time1 survey_time
              1.125 01jan1960 01:07:30
              1.2341667 01jan1960 01:14:03
              .84472222 01jan1960 00:50:41
              1.1219444 01jan1960 01:07:19
              .75888889 01jan1960 00:45:32

              Here the survey_time1 gives me values like 0.84 which should be 0.50 mins, survey_time shows accurate duration but I cannot use it for analysis due to datetime format, how can I fix this so that I get accurate time values in hours ?
              Last edited by Abha Indurkar; 18 Nov 2019, 06:35.

              Comment


              • #8
                #7 Your small punishment is that you got what you asked for. The difference between two date-times is not itself a date-time, just as the difference between two dates is not a date. Stata uses a time origin of the first possible date or time in 1960 (except for years). But the difference between say yesterday and today is 1 day, not 2 January 1960, which what is shown by di %td 1 .

                All you need to do is use a display format that ignores the part of the default that has no bearing on your data.

                Code:
                . di %tcHH:MM:SS   clock("17/11/2019 03:58:03 PM", "DMY hms") - clock("17/11/2019 03:15:56 PM", "DMY hms")
                00:42:07
                Things might get messier if your survey lasted longer than 24 hours, which I guess wildly is unlikely.

                Comment


                • #9
                  Thank you so much, Nick.

                  Comment

                  Working...
                  X