Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating quarterly data

    Dear all,

    I was wondering if anyone could help me with coding the following problem? I have a state by year dataset example given below. I would like to compute quarterly data where quarter 1 starts 1st April. so quarter 1 is April, May, June, quarter 2 is July, August, September
    This variable would look like: YEARquarter, So, if it were based on calendar year: I could do: gen quarter=qofd(date) and quarter starts in Jan but I want it to start in April for every state.

    Thank you for your help.

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str1 State int date
    "A" 18780
    "A" 18547
    "B" 19335
    "C" 18780
    "C" 19335
    "D" 18547
    "D" 18547
    "D" 18547
    "E" 18780
    "E" 18780
    "E" 18780
    "E" 18780
    "F" 19335
    "F" 19335
    "F" 19335
    "F" 18547
    "G" 18547
    "G" 18547
    "i" 19335
    "i" 18547
    "i" 18780
    "H" 18780
    "H" 19335
    "H" 18547
    "H" 18547
    "j" 18547
    "k" 18780
    "l" 18780
    "m" 18547
    "n" 19335
    "o" 19335
    end
    format %tddd-Mon-YY date
    Last edited by Tariku Getaneh; 14 Mar 2024, 07:29.

  • #2
    Does this help? You can assign the string values as value labels using labmask from the Stata Journal.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str1 State int date
    "A" 18780
    "A" 18547
    "B" 19335
    "C" 18780
    "C" 19335
    "D" 18547
    "D" 18547
    "D" 18547
    "E" 18780
    "E" 18780
    "E" 18780
    "E" 18780
    "F" 19335
    "F" 19335
    "F" 19335
    "F" 18547
    "G" 18547
    "G" 18547
    "i" 19335
    "i" 18547
    "i" 18780
    "H" 18780
    "H" 19335
    "H" 18547
    "H" 18547
    "j" 18547
    "k" 18780
    "l" 18780
    "m" 18547
    "n" 19335
    "o" 19335
    end
    format %tddd-Mon-YY date
    
    gen qdate = qofd(date)
    gen q = quarter(date)
    gen y = year(date)
    
    gen label = cond(q == 1, strofreal(y - 1) + "q4", strofreal(y) + "q" +  strofreal(q - 1))
    
    tabdisp(date), c(qdate y q label)
    
    
    ----------------------------------------------------------
         date |      qdate           y           q       label
    ----------+-----------------------------------------------
    12-Oct-10 |        203        2010           4      2010q3
     2-Jun-11 |        205        2011           2      2011q1
     8-Dec-12 |        211        2012           4      2012q3
    ----------------------------------------------------------

    Comment


    • #3
      Thank you Nick, can that be done without user written command?

      Comment


      • #4
        Naturally. Just loop over the distinct values. Code not tested at all.

        Code:
        levelsof qdate , local(dates) 
        
        foreach d of local dates { 
              local q = quarter(`d')
              local y = year(`d')
              local label = cond(`q' == 1, strofreal(`y' - 1) + "q4", strofreal(`y') + "q" + strofreal(`q' - 1))
              label def qdate `d' "`label'", add
        }
        
        label val qdate qdate

        Comment


        • #5
          Thank you Nick, It turns out that qdate vales take on 1960q2 for all observations in my dataset. Is there any explanation why the values are always 1960q1?
          Last edited by Tariku Getaneh; 14 Mar 2024, 09:06.

          Comment


          • #6
            I can’t explain that without details. #2 indicates otherwise.

            Comment


            • #7
              Sorry; the question in #6 does make perfect sense.

              The code in #4 starts with a variable that contains quarterly dates. But as the help explains, the functions quarter() and year() require daily date input.

              So in terms of your data example this is the complete story with no use of community-contributed commands.

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input str1 State int date
              "A" 18780
              "A" 18547
              "B" 19335
              "C" 18780
              "C" 19335
              "D" 18547
              "D" 18547
              "D" 18547
              "E" 18780
              "E" 18780
              "E" 18780
              "E" 18780
              "F" 19335
              "F" 19335
              "F" 19335
              "F" 18547
              "G" 18547
              "G" 18547
              "i" 19335
              "i" 18547
              "i" 18780
              "H" 18780
              "H" 19335
              "H" 18547
              "H" 18547
              "j" 18547
              "k" 18780
              "l" 18780
              "m" 18547
              "n" 19335
              "o" 19335
              end
              format %tddd-Mon-YY date
              
              gen qdate = qofd(date)
              
              levelsof qdate , local(dates)
              
              foreach d of local dates {
                    local q = quarter(dofq(`d'))
                    local y = year(dofq(`d'))
                    local label = cond(`q' == 1, strofreal(`y' - 1) + "q4", strofreal(`y') + "q" + strofreal(`q' - 1))
                    label def qdate `d' "`label'", add
              }
              
              label val qdate qdate
              There are other ways to do it, but one solution is usually enough.

              Comment


              • #8
                Thanks very much Nick. This has worked well. I would truly appreciate it if you could explain this line of code, but no problem if time doesn't permit:
                local label = cond(`q' == 1, strofreal(`y' - 1) + "q4", strofreal(`y') + "q" + strofreal(`q' - 1))

                Comment


                • #9
                  If the quarter is 1, return as a string (with "q" as punctuation) the previous year and quarter 4

                  otherwise, return as a string (ditto) this year and as quarter the quarter minus 1

                  so that for Stata's 2022q1 you see 2021q4 and for Stata's 2022q2 you see 2022q1.

                  Comment


                  • #10
                    Many thanks Nick. Can one create a year variable by extracting the year component from the quarterly data? gen y=substr(quarter, 1, 4) doesn't work at all.

                    Comment


                    • #11
                      No; that is doubly confused. The quarter variable is numeric and so taking a substring is not possible. In any case quarter just holds numeric values 1 2 3 4 or possibly missing, so there is no year information to extract.

                      if you have a Stata quarterly date you can extract the year using year(dofq()). There are other ways to do it too.

                      Comment


                      • #12
                        Here's another way to do it, which follows from quarterly date 0 being the first quarter of 1960.

                        Code:
                        clear 
                        set obs 16 
                        gen qdate_num = cond(_n <= 8, yq(1958, 4) + _n, yq(2020, 4) + _n)
                        clonevar qdate_for = qdate_num
                        format qdate_for %tq 
                        
                        gen year = 1960 + floor(qdate_num / 4)
                        
                        list, sepby(year)
                        
                             +----------------------------+
                             | qdate_~m   qdate_~r   year |
                             |----------------------------|
                          1. |       -4     1959q1   1959 |
                          2. |       -3     1959q2   1959 |
                          3. |       -2     1959q3   1959 |
                          4. |       -1     1959q4   1959 |
                             |----------------------------|
                          5. |        0     1960q1   1960 |
                          6. |        1     1960q2   1960 |
                          7. |        2     1960q3   1960 |
                          8. |        3     1960q4   1960 |
                             |----------------------------|
                          9. |      252     2023q1   2023 |
                         10. |      253     2023q2   2023 |
                         11. |      254     2023q3   2023 |
                         12. |      255     2023q4   2023 |
                             |----------------------------|
                         13. |      256     2024q1   2024 |
                         14. |      257     2024q2   2024 |
                         15. |      258     2024q3   2024 |
                         16. |      259     2024q4   2024 |
                             +----------------------------+

                        Comment


                        • #13
                          Many thanks Nick. I learned from this forum that gen year=year(dofq(quarter)) would give us the year from the quarterly date. But in my case - non- calendar year quarter- it leads to erroneous year. I wanted to extract the exact year from the quarters you have created at #7. I have used your approach in #12, but it returns a calendar year, but not the year associated with quarters you computed at #7.

                          Comment


                          • #14
                            Not so. I don't change the quarterly date at all in my code. What I do is create value labels for it showing a non-calendar year. So, the calendar year can still be pulled out of the quarterly date.

                            I recommend that variable names like quarter be used only in the sense of Stata's quarter() function which yields 1 2 3 4 or missing. What names you use for quarterly dates are up to you, but my personal convention is to use names like qdate.

                            Comment

                            Working...
                            X