Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Date in scientific number format

    Dear Team,

    I tried many options for formatting my dates, but they only appear as a scientific number. I tried routing through converting them to normal digits, but i still cannot format them as dates. What i see in browse mode is 2,0e+7 for each entry. When i click on it i see changing numbers for each entry in the format 20060531. I read through the help file for date & time, & tried many commands with no success. My dates are type Long & Format %8.0g in Stata 13.
    I would very much appreciate it if you can help me get them in HRF "YMD", which I can manipulate as dates, since I need to then extract specific dates & months.

    Thank you in advance.
    Best,
    Milena

  • #2
    Look at this:
    Code:
    . clear
    . set obs 1
    . gen long date1 = 20060531
    . gen date2 = string(date1, "%9.0f")
    . gen date3 = date(date2, "YMD")
    . gen date4 = date3
    . format date4 %td
    . list, clean
    
              date1      date2   date3       date4 
      1.   20060531   20060531   16952   31may2006

    Comment


    • #3
      Thank ypu

      Comment


      • #4
        Dear Svend,

        I've been struggling with my dates in Stata for a while now, and your post almost completely helped me. I've got one small problem though, and I hope you will be able to guide me in the right direction.

        My data is entered as 7112010 being 7 November 2010, and I've followed the code you've set above to get this to a SIR format. The only problem I encounter is that when there is only 1 number for a date or month (so 74 being 7 April), between date2 and date3 Stata decided it's a missing value. Let me illustrate this with the code I've used, which you can find in the attached log file.

        I hope you're able to help me.

        Best wishes,

        Hannah

        Converting data.smcl

        Comment


        • #5
          I don't think this problem is solvable, Hannah. Coding dates in this way makes them inherently impossible to unambiguously parse. What would 1112010 be? Would it be 1 November or 11 January? There is no way to know. Stata has no way to figure out which digits are the month and which are the day. You need to go back to the source of your data and fix it there.

          Comment


          • #6
            I think that's a little pessimistic. Here's a solution for Hannah's example data. The strategy is to peel off the last 4 characters, then the next last 2, and then what is left is the day. I've thrown in also an example with numdate (SSC) (but numdate applied to the starting string stumbles on the shorter dates.

            Code:
            . clear 
            
            . input long mydate 
            
                       mydate
              1.       .a 
              2. 18122011 
              3.  7112010 
              4.       .a 
              5.       .a 
              6. 26012012 
              7.  6102010 
              8. 30032010 
              9.  9082011 
             10. 13102010 
             11.       .a 
             12. 15112010 
             13. 23012010 
             14. 27092011 
             15. 20062010 
             16. end 
            
            . 
            . gen work = string(mydate, "%9.0f") 
            
            . gen year = real(substr(work, -4,4)) 
            (4 missing values generated)
            
            . replace work = substr(work, 1, length(work) - 4) 
            (15 real changes made)
            
            . gen month = real(substr(work, -2, 2)) 
            (4 missing values generated)
            
            . gen day = real(substr(work, 1, length(work) - 2)) 
            (4 missing values generated)
            
            . gen nmydate = mdy(month, day, year) 
            (4 missing values generated)
            
            . format nmydate %td 
            
            . 
            . numdate daily nmydate2 = day month year, pattern(DMY) 
            (4 missing values generated)
            
            . 
            . list 
            
                 +--------------------------------------------------------------+
                 |   mydate   work   year   month   day     nmydate    nmydate2 |
                 |--------------------------------------------------------------|
              1. |       .a             .       .     .           .           . |
              2. | 18122011   1812   2011      12    18   18dec2011   18dec2011 |
              3. |  7112010    711   2010      11     7   07nov2010   07nov2010 |
              4. |       .a             .       .     .           .           . |
              5. |       .a             .       .     .           .           . |
                 |--------------------------------------------------------------|
              6. | 26012012   2601   2012       1    26   26jan2012   26jan2012 |
              7. |  6102010    610   2010      10     6   06oct2010   06oct2010 |
              8. | 30032010   3003   2010       3    30   30mar2010   30mar2010 |
              9. |  9082011    908   2011       8     9   09aug2011   09aug2011 |
             10. | 13102010   1310   2010      10    13   13oct2010   13oct2010 |
                 |--------------------------------------------------------------|
             11. |       .a             .       .     .           .           . |
             12. | 15112010   1511   2010      11    15   15nov2010   15nov2010 |
             13. | 23012010   2301   2010       1    23   23jan2010   23jan2010 |
             14. | 27092011   2709   2011       9    27   27sep2011   27sep2011 |
             15. | 20062010   2006   2010       6    20   20jun2010   20jun2010 |
                 +--------------------------------------------------------------+

            Comment


            • #7
              But Nick, that presumes that 1112010 is 1 November. If the actual date is supposed to be 11 January, it will be mis-coded as 1 November. She won't have missing values, but she'll have some incorrect ones instead.

              Comment


              • #8
                The problem, Nick, is that Hannah does not always have two digits for the month. In her example, 74 is 7 April.

                Comment


                • #9
                  Good point. That case doesn't arise in the example data, but the code can be elaborated: I added "742015".

                  Code:
                  . clear 
                  
                  . input long mydate 
                  
                             mydate
                    1.       .a 
                    2. 18122011 
                    3.  7112010 
                    4.       .a 
                    5.       .a 
                    6. 26012012 
                    7.  6102010 
                    8. 30032010 
                    9.  9082011 
                   10. 13102010 
                   11.       .a 
                   12. 15112010 
                   13. 23012010 
                   14. 27092011 
                   15. 20062010 
                   16. 742015 
                   17. end 
                  
                  . 
                  . gen work = string(mydate, "%9.0f") 
                  
                  . gen year = real(substr(work, -4,4)) 
                  (4 missing values generated)
                  
                  . replace work = substr(work, 1, length(work) - 4) 
                  (16 real changes made)
                  
                  . gen month = real(substr(work, -2, 2)) 
                  (4 missing values generated)
                  
                  . gen day = real(substr(work, 1, length(work) - 2)) 
                  (5 missing values generated)
                  
                  . gen nmydate = mdy(month, day, year) 
                  (5 missing values generated)
                  
                  . format nmydate %td 
                  
                  . 
                  . replace month = real(substr(work, -1, 1)) if missing(nmydate)  
                  (1 real change made)
                  
                  . replace day = real(substr(work), 1, 1) if missing(nmydate) 
                  (1 real change made)
                  
                  . replace nmydate = mdy(month, day, year) if missing(nmydate) 
                  (1 real change made)
                  
                  . 
                  . numdate daily nmydate2 = day month year, pattern(DMY) 
                  (4 missing values generated)
                  
                  . 
                  . list 
                  
                       +--------------------------------------------------------------+
                       |   mydate   work   year   month   day     nmydate    nmydate2 |
                       |--------------------------------------------------------------|
                    1. |       .a             .       .     .           .           . |
                    2. | 18122011   1812   2011      12    18   18dec2011   18dec2011 |
                    3. |  7112010    711   2010      11     7   07nov2010   07nov2010 |
                    4. |       .a             .       .     .           .           . |
                    5. |       .a             .       .     .           .           . |
                       |--------------------------------------------------------------|
                    6. | 26012012   2601   2012       1    26   26jan2012   26jan2012 |
                    7. |  6102010    610   2010      10     6   06oct2010   06oct2010 |
                    8. | 30032010   3003   2010       3    30   30mar2010   30mar2010 |
                    9. |  9082011    908   2011       8     9   09aug2011   09aug2011 |
                   10. | 13102010   1310   2010      10    13   13oct2010   13oct2010 |
                       |--------------------------------------------------------------|
                   11. |       .a             .       .     .           .           . |
                   12. | 15112010   1511   2010      11    15   15nov2010   15nov2010 |
                   13. | 23012010   2301   2010       1    23   23jan2010   23jan2010 |
                   14. | 27092011   2709   2011       9    27   27sep2011   27sep2011 |
                   15. | 20062010   2006   2010       6    20   20jun2010   20jun2010 |
                       |--------------------------------------------------------------|
                   16. |   742015     74   2015       4     7   07apr2015   07apr2015 |
                       +--------------------------------------------------------------+

                  Comment


                  • #10
                    That's fine as far as it goes, but it still doesn't correct the inherent ambiguity of 1112010 as 11 January or 1 November 2010. It will always parse this as 1 November, even though the correct date could be either 1 November or 11 January.

                    Comment


                    • #11
                      I said a little pessimistic; you said not solvable. Those attitudes can perhaps stretch to touch.

                      Really careful code would have to flag all the ambiguous cases. Perhaps there is other evidence on dates in the data that could help.

                      Comment


                      • #12
                        Fair enough!

                        Comment


                        • #13
                          Dear Nick, Clyde & Eric,

                          Thank you so much for your help. This has now turned a row of numbers into an SIR date, which is great. I will still have to check that all dates are correct, and will have to work out ambiguities.

                          Again, many thanks I could never have come up with this myself!

                          Have a good weekend,

                          Hannah

                          Comment


                          • #14
                            Dear all,


                            Currently I have a problem with a shared database and the dates contain

                            They are spoused to be %tcnn/dd/ccYY_hh:MM, but for some reason is in scientific notation (%10.0g)


                            Code:
                            * Example generated by -dataex-. For more info, type help dataex
                            clear
                            input double(Arri_lab_hemat_Hb_datetime Arri_lab_bg_Hb_datetime Arri_lab_chem_ALAT_datetime)
                                2001071940000                 .     2001071940000
                            2006535479999.998 2006536919999.998 2006535479999.998
                            2010404159999.998 2010404280000.002 2010404159999.998
                                1993064220000     1993064640000     1993064220000
                                2012730180000 2012731740000.002     2012730180000
                            1993116899999.998     1993116540000 1993116899999.998
                            2005770660000.002     2005771320000 2005770660000.002
                                1997207340000 1997208120000.002     1997207340000
                            2001762480000.002     2001762960000 2001762480000.002
                            2004551399999.998 2004551519999.998 2004551399999.998
                                2020958940000     2020959540000     2020958940000
                                1.9984137e+12     1998414060000     1.9984137e+12
                            1998823019999.998     1998823320000 1998823019999.998
                                2000542440000     2000543280000     2000542440000
                                2010840960000     2010840480000     2010840960000
                            1999536659999.998     1.9995381e+12 1999536659999.998
                                2.0092023e+12     2009202720000     2.0092023e+12
                                2002446660000 2002447679999.998     2002446660000
                                2013261540000 2013263039999.998     2013261540000
                            2013276240000.002 2013284760000.002 2013276240000.002
                                1997702820000 1997703299999.998     1997702820000
                                2005567440000 2005569600000.002     2005567440000
                            2009095680000.002     2009096160000 2009095680000.002
                            2007590880000.002 2007591599999.998 2007590880000.002
                                2015965680000     2015965560000     2015965680000
                            2012137680000.002     2012137860000 2012137680000.002
                                2007197460000     2007205980000     2007197460000
                                2002206180000                 .     2002204020000
                                2020499820000     2020500960000     2020499820000
                                 2.017665e+12                 .      2.017665e+12
                                1996051020000     1996052040000     1996051020000
                                2014217640000     2014218060000     2014217640000
                                2.0160126e+12 2016014100000.002     2.0160126e+12
                                2000220240000     2000220360000     2000220240000
                            2020076700000.002 2020077540000.002 2020076700000.002
                            2020107959999.998     2020109280000 2020107959999.998
                                2003924460000     2003925540000     2003924460000
                            1993674959999.998     1993675860000 1993674959999.998
                                2011490340000     2.0114901e+12     2011490340000
                                2.0003796e+12     2000380680000                 .
                                            .                 .                 .
                            2013603000000.002 2013603780000.002 2013603000000.002
                                 2.011608e+12     2011612260000      2.011608e+12
                            2008751940000.002     2.0087532e+12 2008751940000.002
                                2019662820000     2019664380000                 .
                                1.9930356e+12     1993035960000     1.9930356e+12
                            1997179139999.998 1997179680000.002 1997179139999.998
                            2017475039999.998 2017475879999.998 2017475039999.998
                                2.0036529e+12 2003653500000.002     2.0036529e+12
                            2006625720000.002                 . 2006625720000.002
                                2017485180000     2017485720000     2017485180000
                            2018872380000.002 2018872380000.002 2018872380000.002
                                2.0133285e+12     2013329040000     2.0133285e+12
                            2001150960000.002     2001151320000 2001150960000.002
                                2016441720000 2016442920000.002     2016441720000
                                2.0103255e+12 2010326580000.002     2.0103255e+12
                                1999249440000     1999250820000     1999249440000
                                2010067140000 2010067499999.998     2010067140000
                                1998110460000                 .     1998110460000
                            2017309019999.998 2017313640000.002 2017309019999.998
                            1998311040000.002                 . 1998311040000.002
                                2.0121363e+12 2012136900000.002     2.0121363e+12
                                2016020940000     2016016860000     2016020940000
                                1994335380000     1994335680000     1994335380000
                                1998721620000     1998721860000     1998721620000
                            1989485700000.002     1989487560000 1989485700000.002
                                2.0072997e+12                 .     2.0072997e+12
                                1996777020000     1996777920000     1996777020000
                            2002792439999.998     2002793220000 2002792439999.998
                                2017912260000 2017914659999.998     2017912260000
                            1993814519999.998     1993817520000 1993814519999.998
                                2019756780000     2019758340000     2019756780000
                                2002927320000     2.0029275e+12     2002927320000
                                2009117580000 2009118240000.002     2009117580000
                            2015757960000.002     2015758920000 2015757960000.002
                                2013353760000                 .     2013353760000
                                2017574820000     2017574580000     2017574820000
                                            .                 .                 .
                            1998326999999.998 1998327059999.998 1998326999999.998
                                 2.017125e+12     2017125540000      2.017125e+12
                            2012950980000.002 2012950440000.002 2012950980000.002
                            2009906939999.998 2009907359999.998 2009906939999.998
                            1998137219999.998     1.9981377e+12 1998137219999.998
                                2006082660000 2006082600000.002     2006082660000
                            2016976379999.998 2016977520000.002 2016976379999.998
                                2019928440000     2019928680000     2019928440000
                                2.0187792e+12     2.0187798e+12     2.0187792e+12
                                1996244040000     1996248660000 1996253399999.998
                                1.9966068e+12     1996607640000     1.9966068e+12
                            1988384520000.002 1988391899999.998 1988384520000.002
                            1994944680000.002     1994945280000 1994944680000.002
                                2011720740000     2011721520000     2011720740000
                                2005736760000 2005737719999.998     2005736760000
                            2012994299999.998 2012995440000.002 2012994299999.998
                                2007648060000      2.007645e+12     2007648060000
                                2018804460000     2.0188083e+12     2018804460000
                            1994335619999.998     1994336520000 1994335619999.998
                                2009873880000     2.0098752e+12     2009873880000
                            2001591540000.002     2.0015928e+12 2001591540000.002
                                2002712520000     2002712640000     2002712520000
                            end


                            Thanks in advance
                            Last edited by Carolina Hincapie; 28 Nov 2024, 07:10.

                            Comment


                            • #15
                              They all look good to me once you have applied that format. The values with decimal parts .002 or .998 perhaps should be rounded to the nearest integer.

                              Comment

                              Working...
                              X