Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Extract year and month of date

    Hi

    I have dates like this 1997m7 (formatted as %tm). Now I would like to get the year and the month in seperate columns. I tried this by using
    Code:
    gen year=year(date) 
    gen month=month(date)
    The problem is that the generated numbers are not correct. E.g. for 1997m7 Stata generates year=1961 and month=3.

    What can I do about this?

    Thanks

  • #2
    http://www.ats.ucla.edu/stat/stata/f...s_ym2mandy.htm

    Comment


    • #3
      This is all documented at help dates and times. year() and month() are for extraction from daily date variables. You must convert monthly dates to daily dates before you use those functions. Alternatively, the fact that monthly dates have origin 0 in January 1960 yields direct calculations:

      [CODE]
      . di floor(ym(2015,1)/12)
      55

      . di floor(ym(2015,12)/12)
      55

      . di floor(ym(2014,1)/12)
      54

      . di floor(ym(2014,12)/12)
      54

      . di mod(ym(2014,12), 12)
      11

      . di mod(ym(2014,1), 12)
      0
      [CODE]

      Hence if date is a monthly date variable 1960 + floor(date/12) yield the year and 1 + mod(date, 12) yields the month.

      Comment


      • #4
        I know this is an old conversation but Basil the same thing happens with my data. Did you find a solution? I changed the format to "Daily Dates" as Nick suggested but the same problem persisted.

        Comment


        • #5
          The link in #2 gives a solution, as does Nick's advice in #3. But probably simpler than either of these is:

          Code:
          gen yr = yofd(dofm(original_monthly_date))
          gen mnth = mofd(dofm(original_monthly_date))

          Comment


          • #6
            Note that nowhere did I suggest that changing the format would be a solution. On the conflation of changing formats and changing stored values, two quite different acts, see e.g. http://www.stata-journal.com/sjpdf.h...iclenum=dm0067

            Comment


            • #7
              Hi Clyde,

              I'm facing the same problem as Basil and tried using the commands that you suggested -- I've managed to extract the year, but my month variable starts from 541 and counts upwards all the way to 660, at which it restarts at 541. Is there any way to fix this such that it extracts the month properly?

              The code I used was:

              Code:
              gen yr = yofd(dofm(mdate))
              gen mnth = mofd(dofm(mdate))
              where mdate is my original time variable formatted as %tm, e.g. 2007m1
              Last edited by Claire James; 08 Feb 2018, 20:34.

              Comment


              • #8
                The code for year is correct. To extract the month (i.e. 2, as opposed to Feb2005) you want:

                Code:
                gen mtnh = month(dofm(mdate))

                Comment


                • #9
                  The code worked, thanks so much Clyde!

                  Comment


                  • #10
                    #3 already gave another solution for Claire James. Further, consider

                    Code:
                    . clear
                    
                    . set obs 2 
                    number of observations (_N) was 0, now 2
                    
                    . gen mdate = real(word("541 660", _n))  
                    
                    . l
                    
                         +-------+
                         | mdate |
                         |-------|
                      1. |   541 |
                      2. |   660 |
                         +-------+
                    
                    . format mdate %tm
                    
                    . l
                    
                         +--------+
                         |  mdate |
                         |--------|
                      1. | 2005m2 |
                      2. | 2015m1 |
                         +--------+
                    
                    . extrdate month month = mdate 
                    
                    . 
                    . list 
                    
                         +----------------+
                         |  mdate   month |
                         |----------------|
                      1. | 2005m2       2 |
                      2. | 2015m1       1 |
                         +----------------+
                    Here extrdate is part of the numdate package on SSC, i.e.

                    The doubling of month is not a typo here. extrdate wants to be told which element you want extracted and what variable name you want to use for a new variable, and it's convenient to use the same name.



                    . ssc desc numdate

                    ----------------------------------------------------------------------------------------
                    package numdate from http://fmwww.bc.edu/repec/bocode/n
                    ----------------------------------------------------------------------------------------

                    TITLE
                    'NUMDATE': module to generate date-times or components

                    DESCRIPTION/AUTHOR(S)

                    numdate is for generating a new Stata numeric date-time variable
                    from one or more existing variables containing date or
                    date-time or time information. convdate is for
                    converting an existing numeric date-time variable,
                    generating a new variable of another date-time type.
                    extrdate is for generating a variable containing a date or time
                    component from an existing date-time variable.

                    KW: dates
                    KW: times
                    KW: generate
                    KW: timeseries calendar

                    Requires: Stata version 12

                    Distribution-Date: 20170925

                    Author: Nicholas J. Cox, Durham University
                    Support: email [email protected]


                    INSTALLATION FILES (type net install numdate)
                    numdate.ado
                    numdate.sthlp
                    ../c/convdate.ado
                    ../c/convdate.sthlp
                    ../e/extrdate.ado
                    ../e/extrdate.sthlp
                    ----------------------------------------------------------------------------------------
                    (type ssc install numdate to install)

                    .
                    See https://www.statalist.org/forums/for...date-variables especially #5 for the announcement of extrdate

                    Comment


                    • #11
                      Dear Clyde,
                      I also face a similar problem. I have dates like this 20041231 and want to extract the month in the format 12.
                      Unfortunately, the code you suggest creates missing values. Can you help me and tell me what's the problem?
                      Many thanks

                      Comment


                      • #12
                        Dominik: You're replying to Clyde Schechter in #8. But #10 already pointed you to a solution, the numdate package from SSC.

                        Code:
                        clear 
                        input long date 
                        20141231 
                        20180316
                        end 
                        
                        format date %8.0f 
                        
                        numdate daily newdate=date, pattern(YMD) 
                        extrdate month month=newdate 
                        
                        list 
                        
                             +------------------------------+
                             |     date     newdate   month |
                             |------------------------------|
                          1. | 20141231   31dec2014      12 |
                          2. | 20180316   16mar2018       3 |
                             +------------------------------+

                        Comment


                        • #13
                          Hi,
                          I have a similar problem regarding extracting year and month. My original data looked like this. It's a string variable but "2016/5" actually means May 2016. Now I wish there is a way to convert "2016/5" into May2016, through either changing the way of displaying or generating a new variable.
                          Code:
                          clear
                          input str7 getin
                          "2016/5" 
                          "2016/1" 
                          "2016/4" 
                          "2016/6" 
                          "2016/2" 
                          "2016/11"
                          end
                          I realized that the "/" in the data can be troublesome. So I tried
                          Code:
                          gen inyear = substr( getin, 1,4 )
                          gen inmonth = substr( getin, 6,2 ) 
                          gen getin1=inyear+inmonth
                          destring getin1, replace force float
                          It would result into a format like 20165. I then tried to use -format getin1 %tdCCYYNN- and -format getin1 %tmCCYYnn-, both produced incorrect results though. Would you please help me with this?

                          Many thanks,
                          Ginny

                          Comment


                          • #14
                            Ginny in #13. There is no shortcut here that avoids reading and understanding help datetime. There is a quick solution to your problem based on that reading:

                            Code:
                            clear
                            input str7 getin
                            "2016/5" 
                            "2016/1" 
                            "2016/4" 
                            "2016/6" 
                            "2016/2" 
                            "2016/11"
                            end
                            
                            gen mdate = monthly(getin, "YM") 
                            format mdate %tm 
                            
                            list, sep(0) 
                            
                                 +-------------------+
                                 |   getin     mdate |
                                 |-------------------|
                              1. |  2016/5    2016m5 |
                              2. |  2016/1    2016m1 |
                              3. |  2016/4    2016m4 |
                              4. |  2016/6    2016m6 |
                              5. |  2016/2    2016m2 |
                              6. | 2016/11   2016m11 |
                                 +-------------------+
                            The rest of the post is about what was wrong with your code.

                            There are several difficulties with what you tried. If the issue were getting rid of the slash then

                            Code:
                            gen getin1 = subinstr(getin, "/", "", .)
                            is more direct than your code. But it is not the issue. The slash is helpful and lets monthly() do its job!

                            20165, for example, is not a daily date to you; it's readable by people as a monthly date. Assigning it a daily date format will only produce incorrect results:

                            Code:
                            . di %td 20165
                            18mar2015
                            
                            . di %td 19605
                            04sep2013
                            The first example is wrong but not amazingly wrong, but the second example shows that the idea is wrong. A daily date is a count of days since 1 January 1960. If you give Stata an integer and say "treat this as a daily date" there is no intelligence in there to correct you and say "But 20165 surely means May 2016!".

                            Note that a monthly date format would produce even worse results:

                            Code:
                            . di %tm 20165
                             3640m6
                            
                            . di %tm 19605
                            3593m10
                            A monthly date is a count of months since January 1960. 20165 in those terms is a long way into the future.

                            It's a fallacy that changing the display format will convert one date (or date-like value) into another. For more on that see https://www.stata-journal.com/sjpdf....iclenum=dm0067

                            Comment


                            • #15
                              Thanks Mr. Nick Cox in #14 for the solution and the explanation. I finally got 2016May after using -format mdate %tmCCYYMon-.

                              I used the date() function to convert dates in string format to the "daily date" before but I failed to understand the logic of "monthly date". I now realized that if we want STATA to deal with time we have to convert the string variable representing date into "STATA internal form" (count of seconds, days, months or years from 1960) first and then if necessary, change the format of "human readable form" (%td, %tm, etc.).

                              I noticed that the "Also see" part of -help datetime- is important too:
                              [D] datetime translation String to numeric date translation functions
                              [D] datetime display formats Display formats for dates and times

                              The STATA journal link is great since that article actually explained what happened to "monthly date" and how to use "%tm". Most explanations, like https://stats.idre.ucla.edu/stata/fa...can-recognize/ , tend to deal with "date" data explain how to use "%td".

                              Comment

                              Working...
                              X