Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • mean and sd for hour and minutes

    Hi all,

    I am trying to get mean and sd for a time variable with hour and minutes , but I got differents numbers that I expected:

    Example:

    Code:
    clear all
    set more off
    
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte patid str14 discharge_t
    1 "20110629 10:27"
    2 "20110326 2:15"
    3 "20110409 19:35"
    4 "20120216 2:15"
    5 "20120802 11:59"
    end
    
    generate hora = clock(discharge_t, "YMDhm")
    format hora  %tcHH:MM
    
    sum hora
    
    . sum hora
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
            hora |          5    1.63e+12    1.87e+10   1.62e+12   1.66e+12
    
    
    When I tried to get mean only for 1 first observation I should get 10.27
    
     sum hora in 1
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
            hora |          1    1.62e+12           .   1.62e+12   1.62e+12
    Please any comments I would grateful
    Regards
    Rodrigo

  • #2
    Stata stores date-times as (large) integer numbers, and you are seeing the output of summarize in terms of those numbers. I would suggest two changes:

    When you are trying to summarize, use the option that displays results in terms of the display format you have selected:

    Code:
    . sum hora in 1, format
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
            hora |          1       10:26           .      10:26      10:26
    As you can see, this is closer to what you expected but not entirely correct. And that is because of precision issues. By default, Stata generates numerical variables as float, whereas you need double precision for date-times. So if you did:

    Code:
    gen double hora_new = clock(discharge_t, "YMDhm")
    format hora_new %tcHH:MM
    you would get the expected result:
    Code:
    . sum hora_new in 1, format
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
        hora_new |          1       10:27           .      10:27      10:27
    You might also want to check out
    Code:
    help precision

    Comment


    • #3
      One further note: beyond the issues addressed above, note that your hora variable is actually storing the entire date-time, even though you are choosing to display only the time of day. As such, taking the mean, standard deviation etc may produce results you are not expecting because those statistics are being calculated over the date-time and not just the time of day. For instance, even if all your observations were for the exact same time, say 9:00, but on different days, the standard deviation would be non-zero.

      Comment


      • #4
        Thanks Hemanshu Kumar for your full explanation.

        Comment


        • #5
          Apart from these technical issues, you need to take into account that time of day is circular data: 23:59 and 0:00 are only one minute apart but if you compute the mean they will be treated as very far apart. So computing a mean for time of day or direction on a compas is different from computing a mean for other variables. You can read more here: https://ideas.repec.org/p/boc/asug01/5.3.html

          The specialist on this topic is Nick Cox , who I tag here in case he has something useful to add.
          Last edited by Maarten Buis; 31 May 2024, 03:54.
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------

          Comment


          • #6
            Maarten Buis Thanks for the mention with special reference to circular data.

            Otherwise my only qualifications here are more experience than some and a willingness to read the documentation.

            To expand on already helpful answers:

            In essence, datetimes are milliseconds from the start of 1960, which means for recent datetimes values in trillions. The use of milliseconds as a unit is driven by the fact that many users have financial data at that time resolution, while conversely physicists and other scientists with need for e.g. nanosecond resolution are unlikely to be using Stata and if they are using Stata they should be smart enough to invent their own machinery.

            I have a hard time imagining that standard deviations of such date-times are useful and interesting and an even harder time imagining that means of such date-times are useful and interesting, but if they are then it's more likely that you need to step outside the Stata framework and use your own variables with sensible units e.g. hours or minutes.

            I'd rather alert Rodrigo Badilla to expand on what he wants here. I see in #1 a little dataset with just one observation for each distinct identifier. so the only means and SDs that can be calculated are across individuals.

            Time of day is as Maarten points out a circular scale.

            What I've seen more often are means and SDs of durations or lengths of time, such as the time spent in hospital or a waiting room. Such durations are differences between datetimes, and so calculated in principle by subtraction and in practice by working with time units that make sense downstream, most often hours or minutes. Durations such as length of employment seem in practice to be associated with daily or monthly dates.

            Comment


            • #7
              Thanks for your comments. I used dataex and data: use https://www.stata-press.com/data/r18/visits (Fictional hospital visit data) only to show my problem getting the mean with format.

              Thanks Nick Cox for your advise. In my #1 I used:

              sum in 1

              Only to check that my mean its displayed properly, when I ask for the mean for only one observation, Its not my purpose get the mean for repeated measure for patient 1. Its was only for show my problem.

              My real data is the time that one patient is connected to mechanical ventilation. I dont get date, only time, then I need the mean for time connected to mechanical ventilation.

              Sorry if in my example#1 I not delete the date.

              My real data look like:

              Code:
              clear
              input byte patid str14 time_MV
              1 "10:27:00"
              2 "2:15:00"
              3 "19:35:00"
              4 "2:15:00"
              5 "11:59:00"
              end
              
              generate double horas = clock(time_MV, "hms")
              
              format horas  %tcHH:MM
              
              sum horas, format
              
                  Variable |        Obs        Mean    Std. dev.       Min        Max
              -------------+---------------------------------------------------------
                     horas |          5   09:18    2.63e+07  02:15  19:35
              I dont know if my data have the problem with circular data and I dont know if this format or Clock function its used properly for my real data.

              I will read the Nick Cox paper.

              Thanks
              Rodrigo
              Last edited by Rodrigo Badilla; 31 May 2024, 09:03.

              Comment


              • #8
                This is a case where Stata's datetime machinery creates as many problems as it solves. Just build your own code directly.

                Some time ago, I wrote a bundle of things for dealing with times -- but in most respects it was rendered instantly obsolete by Stata's introduction of that machinery not long after.

                Nevertheless https://journals.sagepub.com/doi/pdf...867X0600600109 is on record. Rather than use any of those commands. here is what I would do:

                Code:
                clear
                input byte patid str14 time_MV
                1 "10:27:00"
                2 "2:15:00"
                3 "19:35:00"
                4 "2:15:00"
                5 "11:59:00"
                end
                
                split time_MV, gen(t) destring parse(:)
                
                gen wanted = 60 * t1 + t2 + t3/60 
                
                l 
                
                su wanted
                .
                Code:
                . l 
                
                     +------------------------------------------+
                     | patid    time_MV   t1   t2   t3   wanted |
                     |------------------------------------------|
                  1. |     1   10:27:00   10   27    0      627 |
                  2. |     2    2:15:00    2   15    0      135 |
                  3. |     3   19:35:00   19   35    0     1175 |
                  4. |     4    2:15:00    2   15    0      135 |
                  5. |     5   11:59:00   11   59    0      719 |
                     +------------------------------------------+
                
                . 
                . su wanted 
                
                    Variable |        Obs        Mean    Std. dev.       Min        Max
                -------------+---------------------------------------------------------
                      wanted |          5       558.2    438.5284        135       1175
                I chose minutes as the unit. If you want to use hours, seconds, or other any time unit, you know how to do that.

                In the event that someone is on ventilation for more than 24 hours, I can imagine that the time might be recorded in the form days: hours: min : sec in which case how to extend the approach above should be clear. If the hours just go higher than 24, nothing is broken thereby.

                Comment


                • #9
                  Thanks Nick Cox for your reply, solution my problem.

                  Comment


                  • #10
                    Hello I have a difficulty with the data I want to calculate the dates and the time in hours to generate new variables for how many hours it takes for the surgery and to resume with the treatment:

                    (Date of last dose administred before surgery + Time of last anticogulant dose BEFORE surgery )
                    (Surgery Date + Start Time of surgery)
                    (Surgery Date + End Time of surgery)
                    (Date of resuming+ Time of resuming anticogulant post surgery)

                    dataex SurgeryDate StartTimeofsurgery Dateoflastdoseadministredbe TimeoflastanticogulantdoseB SurgeryDate EndTimeofsurgery Dateof
                    > resuming Timeofresuminganticogulantpo

                    ----------------------- copy starting from the next line -----------------------
                    Code:
                    * Example generated by -dataex-. For more info, type help dataex
                    clear
                    input int SurgeryDate double StartTimeofsurgery int Dateoflastdoseadministredbe double(TimeoflastanticogulantdoseB EndTimeofsurgery) str11 Dateofresuming double Timeofresuminganticogulantpo
                    21331      -1893419340000 21331      -1.8934236e+12      -1893413280000 "29may2018"      -1.8934236e+12
                    21376      -1893424920000 21375      -1.8933804e+12 -1893420659999.9998 "12jul2018"      -1.8933786e+12
                    22992      -1893419340000     .                   .      -1893414420000 "13dec2022"      -1.8933729e+12
                    22202       -1.893426e+12 22201      -1.8933804e+12      -1.8934206e+12 "15oct2020"      -1.8933912e+12
                    22727      -1.8934053e+12 22726      -1.8934236e+12      -1.8933987e+12 "24mar2022"      -1893421680000
                    22944      -1893414060000     .                   .      -1.8933969e+12 "27oct2022"      -1.8934341e+12
                    22225      -1893370380000     .                   .      -1.8934476e+12 "07nov2020"      -1893412920000
                    22928      -1893417240000 22927      -1.8933984e+12      -1.8933825e+12 "11oct2022"      -1.8933984e+12
                    22172 -1893443999999.9998     .                   .      -1.8934203e+12 "17sep2020"      -1.8933804e+12
                    22257 -1893412260000.0002 22987      -1.8934542e+12      -1893400260000 "09dec2020"      -1.8934236e+12
                    22858      -1893418080000     .                   . -1893408120000.0002 "02aug2022" -1893455460000.0002
                    22270      -1893412020000 22269 -1893422580000.0002      -1893407520000 "21dec2020"      -1.8933786e+12
                    23032      -1893384060000 23032      -1.8934215e+12      -1893376860000 "23jan2023"      -1893421260000
                    22945 -1893422399999.9998 22944      -1893405060000      -1893402840000 "27oct2022" -1893370800000.0002
                    23097      -1893419280000     .                   . -1893405900000.0002 "29mar2023"      -1893427860000
                    22892      -1893419040000 22891      -1893380520000 -1893393360000.0002 "05sep2022"      -1893435060000
                    22871      -1893420120000 22870      -1893408480000      -1.8934164e+12 "15aug2022"      -1893377280000
                    21988      -1893438960000     .                   .      -1.8934149e+12 "15mar2020"      -1893377580000
                    21452      -1893417420000 21451      -1893370740000      -1893404580000 "26sep2018"       -1.893456e+12
                    21522      -1893408960000 21521      -1.8933768e+12      -1.8933912e+12 "05dec2018"      -1893413460000
                    21572 -1893426059999.9998 21571      -1.8933804e+12 -1893422100000.0002 "23jan2019"      -1.8933804e+12
                    21663 -1893397320000.0002 21662      -1.8933804e+12      -1893396540000 "24apr2019"      -1893376260000
                    21754      -1.8934104e+12 21753      -1.8933768e+12      -1893391560000 "25jul2019"      -1893438480000
                    21929      -1.8934134e+12 21928      -1893378660000      -1.8934038e+12 "15jan2020"      -1.8933798e+12
                    21354      -1893421320000 21353      -1.8934236e+12 -1893397320000.0002 "20jun2018"      -1.8934149e+12
                    21360      -1.8934128e+12     .                   .      -1893392280000 "26jun2018"      -1.8934485e+12
                    21369      -1893424920000 21368      -1893378540000      -1893412920000 "04jul2018"      -1.8933768e+12
                    21394 -1893421919999.9998 21393      -1.8933804e+12      -1893414540000 "29jul2018"      -1.8933768e+12
                    21395      -1893405780000     .                   .      -1893399960000 "30jul2018"      -1.8933804e+12
                    21592      -1.8934263e+12 21591      -1893378060000      -1893422160000 "12feb2019"      -1.8933804e+12
                    21740      -1.8933753e+12     .                   .      -1893369720000 "11jul2019"      -1.8934236e+12
                    21557      -1.8934074e+12 21556      -1.8933774e+12      -1.8933837e+12 "01sep2019" -1893436560000.0002
                    21481      -1893426240000 21480      -1.8933804e+12      -1893420420000 "24oct2018"      -1893370920000
                    21754      -1893422520000 21753      -1.8934092e+12      -1893420420000 ""                            .
                    23072      -1.8933966e+12     .                   .      -1893405780000 "03may2023"      -1893394980000
                    21443      -1893409140000 21442 -1893378900000.0002      -1893399240000 "16sep2018"      -1.8933732e+12
                    21831      -1893425820000     .                   .      -1.8934131e+12 "09oct2019"      -1.8933732e+12
                    21830      -1.8934125e+12 21829      -1.8933804e+12 -1893403200000.0002 "09oct2019"      -1893438060000
                    21744 -1893408600000.0002 21742      -1893375480000      -1.8933915e+12 "14jul2019"      -1893372420000
                    21605      -1.8934026e+12 21604      -1.8934257e+12      -1893399780000 "26jul2019"      -1893418980000
                    21632 -1893420360000.0002 21631      -1893386280000      -1893397080000 "25mar2019" -1893441480000.0002
                    21627      -1893419520000 21626      -1.8933804e+12 -1893408600000.0002 "19mar2019"       -1.893378e+12
                    21466      -1.8934155e+12     .                   .      -1.8933807e+12 "10oct2018"      -1893435060000
                    21751 -1893422399999.9998 21750      -1.8933894e+12      -1893418140000 "21jul2019" -1893373319999.9998
                    21650      -1893409320000 21648      -1893379920000      -1893389760000 "12apr2019"      -1.8934455e+12
                    21635      -1893426360000     .                   .      -1.8934182e+12 "28mar2019"      -1.8933912e+12
                    21657      -1893420240000     .                   .      -1.8934026e+12 "19apr2019"      -1.8933804e+12
                    21664      -1893411480000     .                   .      -1893393120000 "26apr2019"      -1893419040000
                    21739      -1893415920000 21738      -1893379140000      -1.8933855e+12 "10jul2019"      -1.8933753e+12
                    21325      -1.8934131e+12 21324      -1893379080000      -1.8933993e+12 "22may2018"      -1.8933705e+12
                    21331      -1893415320000 21330      -1.8933804e+12      -1893390060000 "28may2018"      -1893415680000
                    21474      -1893412020000 21473      -1.8934191e+12      -1893409020000 "19oct2018"      -1893422340000
                    21515      -1.8934188e+12     .                   . -1893414000000.0002 "27nov2018" -1893375240000.0002
                    21544      -1893419340000     .                   .      -1893412920000 "26dec2018"      -1.8933804e+12
                    21601      -1893414420000     .                   .      -1893387840000 "22feb2019" -1893374759999.9998
                    22740      -1893403920000     .                   .      -1.8934002e+12 "05apr2022"      -1.8933747e+12
                    22601 -1893423359999.9998 22600      -1893402660000      -1893416880000 "18nov2021"      -1.8934092e+12
                    22783 -1893407640000.0002 22782      -1.8933885e+12 -1893390660000.0002 "19may2022"      -1.8934452e+12
                    22544 -1893403200000.0002 22543      -1.8933768e+12      -1893401640000 "21sep2021"      -1.8933768e+12
                    22448      -1.8934218e+12 22447      -1.8933786e+12 -1893419400000.0002 "18jun2021"      -1.8934398e+12
                    23052      -1893415380000     .                   .       -1.893408e+12 "13feb2023"      -1.8933804e+12
                    22462       -1.893426e+12     .                   .      -1893419040000 "01jul2021"      -1.8933771e+12
                    22179      -1893412680000 22178      -1893426120000 -1893395399999.9998 "22sep2020"      -1893414660000
                    22164 -1893425579999.9998 22163      -1893379320000      -1893422040000 "07sep2020" -1893454799999.9998
                    23001 -1893425760000.0002 23000      -1.8934071e+12      -1893420060000 "23dec2022"      -1893421380000
                    22431      -1.8934203e+12     .                   . -1893392400000.0002 "01jun2021"      -1.8933804e+12
                    23093      -1.8934128e+12 23092      -1.8933804e+12       -1.893411e+12 "24mar2023" -1893379380000.0002
                    22573      -1893403380000 22572      -1.8933804e+12 -1893400980000.0002 "21oct2021"      -1.8933966e+12
                    22476 -1893395580000.0002 22474      -1893381480000      -1.8933831e+12 "18jul2021"       -1.893438e+12
                    22879 -1893423840000.0002 22878      -1.8933984e+12      -1893421680000 "22aug2022"      -1.8933984e+12
                    22385      -1893418620000 22383      -1.8933984e+12 -1893405900000.0002 "16apr2021"      -1893451260000
                    22186      -1893416340000 22184      -1.8933912e+12      -1.8934104e+12 "28sep2020"      -1.8933768e+12
                    22328 -1893422399999.9998 22327      -1893389760000      -1893379620000 "18feb2021" -1893424139999.9998
                    23084 -1893419880000.0002 23083      -1893392640000 -1893389999999.9998 "19mar2023"      -1893379440000
                    22994      -1893405360000 22993      -1.8933774e+12      -1893393120000 "20dec2022"       -1.893456e+12
                    22844       -1.893411e+12 22843 -1893376979999.9998      -1893406560000 "18jul2022" -1893373500000.0002
                    22930 -1893408899999.9998 22929      -1893375480000      -1.8934068e+12 "12oct2022" -1893370800000.0002
                    22815      -1893408720000 22814      -1.8933906e+12      -1893401880000 "20jun2022"       -1.893447e+12
                    21623 -1893416220000.0002 21622 -1893387960000.0002      -1893412860000 "16mar2019"      -1.8933894e+12
                    22237      -1893411360000 22236      -1893391080000      -1893400680000 "18nov2020"      -1893373140000
                    22223      -1893426120000 22222      -1.8933822e+12      -1.8934188e+12 "04nov2020"      -1.8933768e+12
                    22234 -1893417479999.9998 22233      -1.8933882e+12 -1893395100000.0002 "16nov2020"      -1.8933804e+12
                    21823 -1893422580000.0002 21822      -1.8933804e+12      -1.8934188e+12 "01oct2019"      -1.8933804e+12
                    22103      -1893410520000 22102      -1.8933876e+12      -1.8934044e+12 "07jul2020" -1893370800000.0002
                    22161      -1.8934017e+12     .                   .      -1893394680000 "03sep2020"      -1.8933732e+12
                    22165      -1893424380000 22164      -1.8933912e+12      -1.8934203e+12 "07sep2020"      -1.8933771e+12
                    22892 -1893414960000.0002 22891      -1893403260000      -1893409620000 "04sep2022" -1893379380000.0002
                    22158      -1.8934047e+12 22157      -1.8933786e+12       -1.893396e+12 "31aug2020"      -1893382260000
                    22040       -1.893372e+12     .                   .      -1.8934524e+12 "05may2020"      -1.8933813e+12
                    22453       -1.893411e+12     .                   . -1893405239999.9998 "23jun2021"      -1893444360000
                    22895      -1893399420000     .                   . -1893395399999.9998 "08sep2022"      -1.8934395e+12
                    22543 -1893421439999.9998 22542      -1.8933804e+12      -1.8933864e+12 "21sep2021"      -1.8933702e+12
                    22343      -1893399480000 22342      -1.8934506e+12 -1893396060000.0002 "04mar2021"      -1.8933804e+12
                    22439 -1893425579999.9998     .                   .      -1.8934092e+12 "08jun2021"      -1.8933714e+12
                    22901      -1893405840000     .                   .      -1893393720000 "14sep2022"       -1.893438e+12
                    22878 -1893424800000.0002     .                   .      -1.8934212e+12 "21aug2022"       -1.893372e+12
                    22329      -1893423420000     .                   . -1893414299999.9998 "18feb2021"      -1.8933702e+12
                    22225      -1893388920000     .                   .      -1893384660000 "07nov2020"      -1893400920000
                    22517      -1893415860000 22516       -1.893378e+12 -1893410639999.9998 "26aug2023"      -1.8934488e+12
                    23126      -1893391020000     .                   .      -1893387060000 "27apr2023"       -1.893402e+12
                    end
                    format %td SurgeryDate
                    format %tchH:MM StartTimeofsurgery
                    format %td Dateoflastdoseadministredbe
                    format %tcHH:MM:SS TimeoflastanticogulantdoseB
                    format %tchH:MM EndTimeofsurgery
                    format %tchH:MM Timeofresuminganticogulantpo
                    ------------------ copy up to and including the previous line ------------------




                    Thank you

                    Comment

                    Working...
                    X