Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Converting yyww and yyyymm values to stata serial number using first day of week and month, respectively

    Hi all

    I have a date-variable 'x_yyww' taking on values where it is showing year and week in the format yyww. For example, when the variable takes the value 1450 it is week 50 in the year 2014. I want to convert this variable, so it instead shows the stata serial number for the first day of the week (with mondays beginning the week). Continuing the example the value 1450 should be converted to the stata serial number for the date 8/12/2014 (in the dd/mm/yyyy format).

    I have another variable 'x_yyyymm' taking on values where it is showing the year and month in the format yymmm. For example, when the variable takes the value 201412 it is december 2014. Again, I want to convert this variable, so in instead shows the stata serial number for the first day of the month. For the value 201412 this would be the stata serial number for 1/12/2014.

    Dataexample for the month variable:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte pnr long x_yyyymm byte arbsted
    1 201809 1
    1 201810 0
    1 201811 1
    2 201809 0
    2 201810 1
    2 201811 1
    4 201809 0
    4 201810 1
    4 201811 0
    5 201809 1
    5 201810 1
    5 201811 1
    7 201809 1
    7 201810 1
    7 201811 0
    8 201809 0
    8 201810 0
    8 201811 1
    end
    Data example for the week variable:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte pnr int x_yyww
    1 1836
    1 1837
    1 1838
    1 1839
    1 1840
    1 1841
    1 1842
    1 1843
    1 1844
    1 1845
    1 1846
    1 1847
    1 1848
    2 1836
    2 1837
    2 1838
    2 1839
    2 1840
    2 1841
    2 1842
    2 1843
    2 1844
    2 1845
    2 1846
    2 1847
    2 1848
    4 1836
    4 1837
    4 1838
    4 1839
    4 1840
    4 1841
    4 1842
    4 1843
    4 1844
    4 1845
    4 1846
    4 1847
    4 1848
    5 1836
    5 1837
    5 1838
    5 1839
    5 1840
    5 1841
    5 1842
    5 1843
    5 1844
    5 1845
    5 1846
    5 1847
    5 1848
    7 1836
    7 1837
    7 1838
    7 1839
    7 1840
    7 1841
    7 1842
    7 1843
    7 1844
    7 1845
    7 1846
    7 1847
    7 1848
    8 1836
    8 1837
    8 1838
    8 1839
    8 1840
    8 1841
    8 1842
    8 1843
    8 1844
    8 1845
    8 1846
    8 1847
    8 1848
    end
    Last edited by Emil Alnor; 16 Mar 2022, 11:38.

  • #2
    Monthly dates

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte pnr long x_yyyymm byte arbsted
    1 201809 1
    1 201810 0
    1 201811 1
    2 201809 0
    2 201810 1
    2 201811 1
    4 201809 0
    4 201810 1
    4 201811 0
    5 201809 1
    5 201810 1
    5 201811 1
    7 201809 1
    7 201810 1
    7 201811 0
    8 201809 0
    8 201810 0
    8 201811 1
    end
    
    gen mdate = ym(floor(x_yyyymm/100), mod(x_yyyymm, 100))
    
    format mdate %tm
    
    list
    
         +------------------------------------+
         | pnr   x_yyyymm   arbsted     mdate |
         |------------------------------------|
      1. |   1     201809         1    2018m9 |
      2. |   1     201810         0   2018m10 |
      3. |   1     201811         1   2018m11 |
      4. |   2     201809         0    2018m9 |
      5. |   2     201810         1   2018m10 |
         |------------------------------------|
      6. |   2     201811         1   2018m11 |
      7. |   4     201809         0    2018m9 |
      8. |   4     201810         1   2018m10 |
      9. |   4     201811         0   2018m11 |
     10. |   5     201809         1    2018m9 |
         |------------------------------------|
     11. |   5     201810         1   2018m10 |
     12. |   5     201811         1   2018m11 |
     13. |   7     201809         1    2018m9 |
     14. |   7     201810         1   2018m10 |
     15. |   7     201811         0   2018m11 |
         |------------------------------------|
     16. |   8     201809         0    2018m9 |
     17. |   8     201810         0   2018m10 |
     18. |   8     201811         1   2018m11 |
         +------------------------------------+
    What is your definition of the week? Is week 53 allowed. for example?

    Comment


    • #3
      You can get the first day of each month by dofm() but for handling monthly dates in Stata that is usually a big step in the wrong direction. For example the spacing of such dates varies between 28 and 31 and as declared to tsset or xtset any such dataset is mostly gaps and time series operators just don't work as you want.

      The definition of week used by the data supplier is crucial and I bet that it is **not** the definition used by Stata. The example data don't specify, so far as I can see,

      1. What day of week starts a week

      2. What happens at the beginning of the year if it starts with an incomplete week

      3. What happens at the end of the year if it ends with an incomplete week (e.g. is there ever a week 53?)

      I've seen schemes in which the first few days of a year were included in the last week of the previous year;

      Comment


      • #4
        Originally posted by Nick Cox View Post

        What is your definition of the week? Is week 53 allowed. for example?
        Thanks for the provided code!

        The weeks begin with mondays, week 1 is the first full week of the year (e.g. week 1 2021 begins with 4/1/2021) and week 53 is allowed (e.g. 28/12/2020-3/1/2021).

        By the way, this is - what I think - is the first step in merging my monthly and weekly data using the procedure you specified in this post: https://www.statalist.org/forums/for...h-monthly-data. I am working with a definition where the weeks belong to the month if they have a monday in the month. E.g. in the year 2021, week 14, 15, 16 and 17 belong to april, and week 18, 19, 20, 21 and 22 belong to may.



        Comment


        • #5
          Thanks. Let's see how far we can get with that.

          My strong advice is to avoid week numbers but to represent each week by the day it starts.

          Then declare delta(7) to tsset or xtset if you need that.

          I took your data and added some more to push harder at my code.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input int x_yyww
          1836
          1837
          1838
          1839
          1840
          1841
          1842
          1843
          1844
          1845
          1846
          1847
          1848
          1801
          1852
          1853
          1901
          1952
          end
          
          
          * the year just comes from dividing by 100 and rounding down 
          . gen year = 2000 + floor(x_yyww/100) 
          
          * the week comes from the remainder on dividing by 100 
          . gen week = mod(x_yyww, 100) 
          
          * the first day of the year _might_ be a Monday 
          . gen firstday = mdy(1, 1, year)  
          . gen firstMonday = firstday 
          
          * but if it is a Sunday we need to wait one day 
          . replace firstMonday = firstday + 1 if dow(firstday) == 0 
          
          * if it is Tuesday ... Saturday dow 2 3 4 5 6 we need to wait 6 5 4 3 2 days -- so 8 minus day of week 
          . replace firstMonday = firstday + 8 - dow(firstday) if dow(firstday) > 1 
          
          * the daily date for each week is then some multiple of 7 days after the first Monday 
          . gen wanted = firstMonday + 7 * (week - 1) 
          
          . 
          . format firstday-wanted %tddd_Mon_CCYY  
          
          . 
          . list, sepby(year) 
          
               +--------------------------------------------------------------+
               | x_yyww   year   week     firstday   firstMon~y        wanted |
               |--------------------------------------------------------------|
            1. |   1836   2018     36   1 Jan 2018   1 Jan 2018    3 Sep 2018 |
            2. |   1837   2018     37   1 Jan 2018   1 Jan 2018   10 Sep 2018 |
            3. |   1838   2018     38   1 Jan 2018   1 Jan 2018   17 Sep 2018 |
            4. |   1839   2018     39   1 Jan 2018   1 Jan 2018   24 Sep 2018 |
            5. |   1840   2018     40   1 Jan 2018   1 Jan 2018    1 Oct 2018 |
            6. |   1841   2018     41   1 Jan 2018   1 Jan 2018    8 Oct 2018 |
            7. |   1842   2018     42   1 Jan 2018   1 Jan 2018   15 Oct 2018 |
            8. |   1843   2018     43   1 Jan 2018   1 Jan 2018   22 Oct 2018 |
            9. |   1844   2018     44   1 Jan 2018   1 Jan 2018   29 Oct 2018 |
           10. |   1845   2018     45   1 Jan 2018   1 Jan 2018    5 Nov 2018 |
           11. |   1846   2018     46   1 Jan 2018   1 Jan 2018   12 Nov 2018 |
           12. |   1847   2018     47   1 Jan 2018   1 Jan 2018   19 Nov 2018 |
           13. |   1848   2018     48   1 Jan 2018   1 Jan 2018   26 Nov 2018 |
           14. |   1801   2018      1   1 Jan 2018   1 Jan 2018    1 Jan 2018 |
           15. |   1852   2018     52   1 Jan 2018   1 Jan 2018   24 Dec 2018 |
           16. |   1853   2018     53   1 Jan 2018   1 Jan 2018   31 Dec 2018 |
               |--------------------------------------------------------------|
           17. |   1901   2019      1   1 Jan 2019   7 Jan 2019    7 Jan 2019 |
           18. |   1952   2019     52   1 Jan 2019   7 Jan 2019   30 Dec 2019 |
               +--------------------------------------------------------------+
          Much more at (do the search yourself to get clickable links)



          Code:
          search week, sj
          Search of official help files, FAQs, Examples, SJs, and STBs

          SJ-12-4 dm0065_1 . . . . . Stata tip 111: More on working with weeks, erratum
          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
          Q4/12 SJ 12(4):765 (no commands)
          lists previously omitted key reference

          SJ-12-3 dm0065 . . . . . . . . . . Stata tip 111: More on working with weeks
          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
          Q3/12 SJ 12(3):565--569 (no commands)
          discusses how to convert data presented in yearly and weekly
          form to daily dates and how to aggregate such data to months
          or longer intervals

          SJ-10-4 dm0052 . . . . . . . . . . . . . . . . Stata tip 68: Week assumptions
          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
          Q4/10 SJ 10(4):682--685 (no commands)
          tip on Stata's solution for weeks and on how to set up
          your own alternatives given different definitions of the
          week

          For more on weeks in months. there is a piece in Stata Journal 22(1) (may appear in about 2 weeks' time)

          Comment


          • #6
            Thanks a lot for providing this extensive material! I will look into it and hopefully see that it solves my problem.

            Comment

            Working...
            X