Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Changing months into weeks

    Hello,

    I have 5 million + rows with year variables that include 15 years of data and month variables with 12 months in each year. I want to generate and assign 4-5 correct weeks codes for each month in all 15 years. From data ex a sample sample:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str6 ind float(year month)
    "M00647" 2007  7
    "M00647" 2007  8
    "M00647" 2007  9
    "M00647" 2007 10
    "M00647" 2007 11
    "M00647" 2007 12
    "M00647" 2008  1
    "M00647" 2008  2
    "M00647" 2008  3
    "M00647" 2008  4
    "M00647" 2008  5
    "M00647" 2008  6
    "M00887" 2007  7
    "M00887" 2007  8
    "M00887" 2007  9
    "M00887" 2007 10
    "M00887" 2007 11
    "M00887" 2007 12
    "M00887" 2008  1
    "M00887" 2008  2
    "M00887" 2008  3
    "M00887" 2008  4
    "M00887" 2008  5
    "M00888" 2008  6
    "M00888" 2008  7
    "M00888" 2008  8
    "M00888" 2008  9
    "M00888" 2008 10
    "M00888" 2008 11
    "M00888" 2008 12
    end
    I want to assign each row 4 or 5 new variables that correspond to the 4 or 5 weeks within each historical year and month.

    For now, I don't know what day the first Monday fell in any of the months, so I had to arbitrarily assign the first day of the month as the start of the first week.

    What I did is:

    g day = 1
    g week1 = mdy(month , day , year)
    format %9.0f week1
    replace week1 = (week1-2182)/7
    drop day year month

    * creating 4-5 week codes within each month - 4 weeks in a month with some having 5 weeks = dropped extra 5th week later

    forval i =2/5{
    g week`i' = round(week1 + `i' - 1, 1)
    }

    This way is wrong because if the start if the first Monday occurred on for ex. March 3, I would still have that the full week occurred March 1-7, throwing off all the weeks that follow.

    Is there any way to turn months into the correct weeks?

    Please help!

    Thank you!


  • #2
    Working with weeks is painful, no matter how you slice it. Your post leaves many questions unanswered. It is clear you do not want the first week of a month to start on day 1 of that month, because you say "I would still have that the full week occurred March 1-7, throwing off all the weeks that follow." OK, but then, when do you want it to start? What is a "correct week?"

    Let's look at this month, November 2023. November 1st was a Wednesday. If you don't want the first week to be November 1-7, then when should it be? Perhaps you want the week to start always on a Monday. The first Monday of November 2023 is November 6. So I could give you code that would begin Week 1 of November on November 6. But then what happens to November 1 through 4? Are they part of the last week in October? Or how else might you handle them.

    Code:
    gen week1_start = firstweekdayofmonth(month, year, 1) // CHANGE 1 TO 0 TO GET SUNDAY INSTEAD.
    forvalues i = 2/5 {
        gen week`i'_start = week1_start + (`i'-1)*7
    }
    format week*_start %td
    replace week5_start = . if mofd(week5_start) > mofd(week1_start)
    This code will calculate 4 or 5 week#_start dates. The week1_start is always the date of the first Monday of the month, and each of the other weeks begins 7 days after the preceding. If the 5th week's start date would fall in the next month, it is set to missing value.

    Note: Requires version 18.

    Hope this helps.

    Added: I do not get what you are doing with that -replace ... 2182...- 2182 is the Stata internal format date representation for 22dec1965. I guess you want to start numbering your weeks from there? But 22dec1965 is actually a Wednesday?!? Can you clarify what's going on with that?
    Last edited by Clyde Schechter; 10 Nov 2023, 15:17.

    Comment


    • #3
      Thank you very much. Sorry if it was unclear. Yes, Monday is the first day I want to tag in each month. I see that what you did is EXACTLY what I need... But I dont have STATA 18. I have STATA 17, is there any work around?

      Comment


      • #4
        Also to clarify the last part, I use week1 = (week1-2182)/7 to get to a week that later represents a week code that matches another data set. I thought it was stata's standard, but it must be internal.

        Comment


        • #5
          Yes, the following will run in version 17:
          Code:
          gen week1_start = firstdayofmonth(mdy(month, 1, year))
          replace week1_start = cond(dow(week1_start) > 1, week1_start+8-dow(week1_start), week1_start)
          replace week1_start = week1_start + 1 if dow(week1_start) == 0
          forvalues i = 2/5 {
              gen week`i'_start = week1_start + (`i'-1)*7
          }
          format week*_start %td
          replace week5_start = . if mofd(week5_start) > mofd(week1_start)

          Comment


          • #6
            Thank you so much again.

            This code works, and seems to do what I need it. But could you please explain what the "Week1_start + 1 is doing, why "1" and why is dow = 0.

            Comment


            • #7
              The -dow()- function takes a date as an argument and returns the day of the week coded as 0 = Sunday, 1 = Monday, ... 6 = Saturday.

              Suppose the first day of the month is a Sunday. Then the first Monday of the month will be the day after. We will know the first day of the month is a Sunday if and only if dow(week1_start) == 0. So in that situation, adding 1 to the value of week1_start brings it to that first Monday.

              Comment


              • #8
                Thank You!

                Comment

                Working...
                X