Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Rolling cumulative sum from using _n within panels

    I would like to generate a new variable which computes a rolling cumulative sum of observations of x from 3 days before up until 1 days before the observation in consideration within panels.
    For example if I have:

    ID Date x to_create
    1 1 1 .
    1 2 2 .
    1 3 3 .
    1 4 2 6
    1 5 1 7
    1 6 2 6
    2 1 2 .
    2 2 3 .
    2 3 4 .
    2 4 3 9
    2 5 2 10
    2 6 3 9

    to_create is the variable I would like to obtain.

    Going through statalist I found the following commands:

    bysort ID (Date): gen to_create = sum(x[_n-3] - x[_n-1])

    However using this command, stata simply subtracts x[_n-1] from x[_n-3]

    I have tried numerous other commands but keep on getting only missing values or all values 0. Is there any other way to compute to_create?
    As I am new to this forum I hope to have expressed my problem clearly. T
    Thanks in advance!

  • #2
    Please use dataex in the future to present example data (see FAQ Advice # 12). You are better off using Stata's time series operators for this.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(id date x to_create)
    1 1 1  .
    1 2 2  .
    1 3 3  .
    1 4 2  6
    1 5 1  7
    1 6 2  6
    2 1 2  .
    2 2 3  .
    2 3 4  .
    2 4 3  9
    2 5 2 10
    2 6 3  9
    end
    
    xtset id date
    gen wanted = l1.x +l2.x +l3.x
    Res.:

    Code:
    . l, sepby(id)
    
         +-----------------------------------+
         | id   date   x   to_cre~e   wanted |
         |-----------------------------------|
      1. |  1      1   1          .        . |
      2. |  1      2   2          .        . |
      3. |  1      3   3          .        . |
      4. |  1      4   2          6        6 |
      5. |  1      5   1          7        7 |
      6. |  1      6   2          6        6 |
         |-----------------------------------|
      7. |  2      1   2          .        . |
      8. |  2      2   3          .        . |
      9. |  2      3   4          .        . |
     10. |  2      4   3          9        9 |
     11. |  2      5   2         10       10 |
     12. |  2      6   3          9        9 |
         +-----------------------------------+

    Comment


    • #3
      Thank you for the quick response, I will use dataex next time! However, for my actual dataset the cumulative sum I need to make ranges from _n-125 to _n-10.
      I am therefore looking for a way to include a range into a sum function. Is this also a possibility?

      Comment


      • #4
        Please ssc install rangestat, and help rangestat.
        Ho-Chuan (River) Huang
        Stata 19.0, MP(4)

        Comment


        • #5
          As advised in #4, install rangestat from SSC by Robert Picard, Nick Cox and Roberto Ferrer.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input byte(id date x to_create)
          1 1 1  .
          1 2 2  .
          1 3 3  .
          1 4 2  6
          1 5 1  7
          1 6 2  6
          2 1 2  .
          2 2 3  .
          2 3 4  .
          2 4 3  9
          2 5 2 10
          2 6 3  9
          end
          
          *ssc install rangestat
          rangestat (sum) x, interval(date -3 -1) by(id)
          *Lags consider only full windows (defined by lower limit of interval)
          replace sum_x=. if date<=3
          Res.:

          Code:
          . l, sepby(id)
          
               +----------------------------------+
               | id   date   x   to_cre~e   x_sum |
               |----------------------------------|
            1. |  1      1   1          .       . |
            2. |  1      2   2          .       . |
            3. |  1      3   3          .       . |
            4. |  1      4   2          6       6 |
            5. |  1      5   1          7       7 |
            6. |  1      6   2          6       6 |
               |----------------------------------|
            7. |  2      1   2          .       . |
            8. |  2      2   3          .       . |
            9. |  2      3   4          .       . |
           10. |  2      4   3          9       9 |
           11. |  2      5   2         10      10 |
           12. |  2      6   3          9       9 |
               +----------------------------------+
          In your case, therefore,

          Code:
          rangestat (sum) x, interval(date -125 -10) by(id)
          replace sum_x=. if date<=125
          Last edited by Andrew Musau; 10 Apr 2020, 21:49.

          Comment


          • #6
            Thanks a lot!!

            Comment


            • #7
              Excellent advice from River Huang and Andrew Musau but note that counting in observation numbers and counting in time units will give different results if there are gaps in the data.

              Comment


              • #8
                Thank you!

                Comment

                Working...
                X