Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • loop for 24 prior months grouping variable

    I have event date data where there are multiple eventdates.

    I'd like to create a grouping variable based on the past 24 months starting in 2020.

    Given their are multiple events on the same date I can't tsset the data without first collapsing it.

    I posted this question for years and figured out an easy fix. I don't want to have to write 48 lines of code and generate 48 groups. I imagine there is a loop I could write.

    ro_n = unique identifier

    eventdate year ro_n weight
    18jun2017 2017 2 6
    18jun2017 2017 2 5
    29jan2020 2020 2 3
    29jan2020 2020 2 1
    29jan2020 2020 2 14
    29jan2020 2020 2 6
    29jan2020 2020 2 2
    29jan2020 2020 2 6
    29jan2020 2020 2 9
    29jan2020 2020 2 15
    29jan2020 2020 2 4
    26nov2022 2022 3 15
    26nov2022 2022 3 10
    26nov2022 2022 3 6
    26nov2022 2022 3 14
    26nov2022 2022 3 7
    29oct2023 2023 4 2
    29oct2023 2023 4 13

  • #2
    You posted a previous version of this at https://www.statalist.org/forums/for...evious-2-years Please don't ask similar questions in different threads without cross-referencing. We give advice to this effect in the FAQ.

    I looked at that and had no idea of what you want precisely. Unfortunately my reaction is still the same to this version.

    Where does 48 come from?

    You have daily dates and yearly dates but how do the other variables enter the problem?

    Do you want to bin and then collapse or do you want a rolling calculation?

    There is no point to guessing at code when the problem is so unclear.
    Last edited by Nick Cox; 16 Mar 2024, 08:17.

    Comment


    • #3
      Should have been clearer. Sorry about the lack of cross-reference. I want to calculate a rolling calculation that will tally up the counts of events or sum of weights by prior 24 months. For the group by year this is what I did.

      foreach var of varlist gyear_2023 gyear_2022 gyear_2021 gyear_2020 {
      bysort ro_n,: egen w`var'=total(weight) if `var'==1 & weight~=.
      bysort ro_n,: egen c`var'=count(ro_n) if `var'==1
      }
      Last edited by John MacDonald; 16 Mar 2024, 08:24.

      Comment


      • #4
        Thanks for the extra detail. A guess is worth trying now. If this isn't what you want, or close enough to adapt, then please supply a data example as here together with the values of the new variables as you want them.

        rangestat is from SSC and must be installed before you can use it.

        Code:
        clear 
        input str9 eventdate year ro_n weight
        18jun2017 2017 2 6
        18jun2017 2017 2 5
        29jan2020 2020 2 3
        29jan2020 2020 2 1
        29jan2020 2020 2 14
        29jan2020 2020 2 6
        29jan2020 2020 2 2
        29jan2020 2020 2 6
        29jan2020 2020 2 9
        29jan2020 2020 2 15
        29jan2020 2020 2 4
        26nov2022 2022 3 15
        26nov2022 2022 3 10
        26nov2022 2022 3 6
        26nov2022 2022 3 14
        26nov2022 2022 3 7
        29oct2023 2023 4 2
        29oct2023 2023 4 13
        end 
        
        gen work = daily(eventdate, "DMY")
        drop eventdate 
        rename work eventdate 
        
        rangestat (count) weight (sum) weight, int(eventdate -730 0) by(ro_n)
        
        list, sepby(ro_n)
        
             +-------------------------------------------------------+
             | year   ro_n   weight   eventd~e   weight~t   weight~m |
             |-------------------------------------------------------|
          1. | 2017      2        6      20988          2         11 |
          2. | 2017      2        5      20988          2         11 |
          3. | 2020      2        3      21943          9         60 |
          4. | 2020      2        1      21943          9         60 |
          5. | 2020      2       14      21943          9         60 |
          6. | 2020      2        6      21943          9         60 |
          7. | 2020      2        2      21943          9         60 |
          8. | 2020      2        6      21943          9         60 |
          9. | 2020      2        9      21943          9         60 |
         10. | 2020      2       15      21943          9         60 |
         11. | 2020      2        4      21943          9         60 |
             |-------------------------------------------------------|
         12. | 2022      3       15      22975          5         52 |
         13. | 2022      3       10      22975          5         52 |
         14. | 2022      3        6      22975          5         52 |
         15. | 2022      3       14      22975          5         52 |
         16. | 2022      3        7      22975          5         52 |
             |-------------------------------------------------------|
         17. | 2023      4        2      23312          2         15 |
         18. | 2023      4       13      23312          2         15 |
             +-------------------------------------------------------+
        See FAQ Advice #12 for why dataex is especially needed for data including dates; string dates need surgery before people can use them in code.

        Comment


        • #5
          Excellent. This is way more efficient then the loop I just wrote!

          Comment

          Working...
          X