  • Calculating number of events within time period

    Hi,

    My dataset is arranged by event-month, i.e., each row represents one month. It contains the following variables:
    1. event: a count of events in that month (min. 0, i.e., no event in that month; max. 17, i.e., many events in the same month).
    2. timebetween: a running count of the number of months elapsed since the last event occurred, i.e., the time between two non-zero values of event. Example: if the first event(s) occurred in January and the next event(s) occurred in June, then the time between these events is counted as 5 months.
    3. year: the year in which the event(s) occurred.
    4. month: the month in which the event(s) occurred.
    year month event timebetween
    1930 1 0 10
    1930 2 0 11
    1930 3 0 12
    1930 4 0 13
    1930 5 0 14
    1930 6 0 15
    1930 7 0 16
    1930 8 1 0
    1930 9 0 1
    1930 10 0 2
    1930 11 0 3
    1930 12 0 4
    1931 1 1 0
    1931 2 1 0
    1931 3 4 0
    1931 4 0 1
    1931 5 2 0
    1931 6 0 1
    1931 7 0 2
    1931 8 0 3
    1931 9 0 4
    1931 10 0 5
    1931 11 0 6
    1931 12 0 7
    1932 1 0 8
    1932 2 2 0
    1932 3 0 1
    1932 4 1 0
    1932 5 1 0
    1932 6 0 1


    I want to calculate the total number of events in, say, the last six months from each event. For example, for the observation with row values 1931 - 3 - 4 - 0, the number of events in the last six months is two (1 + 1).

    Code that automates this calculation would help me a lot. Thanks in advance.

  • #2
    To do this, you need a Stata internal format monthly date variable that reflects both the month and the year. With year and month as separate variables this would be, at best, complicated, if possible at all.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year byte(month event timebetween)
    1930  1 0 10
    1930  2 0 11
    1930  3 0 12
    1930  4 0 13
    1930  5 0 14
    1930  6 0 15
    1930  7 0 16
    1930  8 1  0
    1930  9 0  1
    1930 10 0  2
    1930 11 0  3
    1930 12 0  4
    1931  1 1  0
    1931  2 1  0
    1931  3 4  0
    1931  4 0  1
    1931  5 2  0
    1931  6 0  1
    1931  7 0  2
    1931  8 0  3
    1931  9 0  4
    1931 10 0  5
    1931 11 0  6
    1931 12 0  7
    1932  1 0  8
    1932  2 2  0
    1932  3 0  1
    1932  4 1  0
    1932  5 1  0
    1932  6 0  1
    end
    
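    * Build a Stata monthly date from the separate year and month variables
    gen mdate = ym(year, month)
    format mdate %tm
    
    * Sum event over the window from 6 months before to 1 month before each observation
    rangestat (sum) wanted = event, interval(mdate -6 -1)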
    -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer, and is available from SSC.

    The expression "in last six months" is ambiguous, and the example you give does not disambiguate it. In the code above, I have interpreted it to mean an interval that begins 6 months earlier and ends one month earlier than the month in the observation itself. If you meant 5 months earlier through the current month, or something else altogether, you will have to change the numbers in the -interval()- option accordingly.
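To make the two readings concrete, here is a minimal sketch on a subset of the example data (1931m1 through 1931m7). The variable names prev6 and incl6 are made up for illustration, and the code assumes -rangestat- has already been installed from SSC.

```stata
* Sketch only: compares two possible readings of "last six months"
* Assumes -rangestat- is installed (ssc install rangestat)
clear
input int year byte(month event)
1931 1 1
1931 2 1
1931 3 4
1931 4 0
1931 5 2
1931 6 0
1931 7 0
end

gen mdate = ym(year, month)
format mdate %tm

* Reading 1: the six months before the current month, current month excluded
rangestat (sum) prev6 = event, interval(mdate -6 -1)

* Reading 2: five months earlier through the current month, current month included
rangestat (sum) incl6 = event, interval(mdate -5 0)

list year month event prev6 incl6, noobs
```

For the 1931m3 observation, the first reading gives 1 + 1 = 2 and the second gives 1 + 1 + 4 = 6, so only the first matches the example in #1.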

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



    • #3
      Thank you Clyde. That worked like a charm. The way I was thinking about interval is exactly how you interpreted it.

      By the way, an alternative method a friend suggested is:
      Code:
       gen previousevents = event[_n-1] + event[_n-2] + event[_n-3] + event[_n-4] + event[_n-5] + event[_n-6]
      Thanks again.
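As a rough check, the subscript version can be compared with the -rangestat- result on a slice of the example data (1930m8 through 1931m4). This is only a sketch: it assumes -rangestat- is installed from SSC and that the data are sorted by date with no missing months. One thing it makes visible is that the subscript version is missing for the first six rows, because out-of-range subscripts such as event[0] evaluate to missing.

```stata
* Sketch only: subscript approach vs. -rangestat- on a slice of the example data
* Assumes -rangestat- is installed (ssc install rangestat)
clear
input int year byte(month event)
1930  8 1
1930  9 0
1930 10 0
1930 11 0
1930 12 0
1931  1 1
1931  2 1
1931  3 4
1931  4 0
end

gen mdate = ym(year, month)
format mdate %tm
sort mdate

* Subscript version: needs six earlier rows, otherwise the sum is missing
gen previousevents = event[_n-1] + event[_n-2] + event[_n-3] + event[_n-4] + event[_n-5] + event[_n-6]

* Date-based version from #2
rangestat (sum) wanted = event, interval(mdate -6 -1)

* Wherever the subscript version is defined, the two should agree
assert previousevents == wanted if !missing(previousevents)

list year month event previousevents wanted, noobs
```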



      • #4
        #3 works if and only if:

        1. The data are in the correct sort order.
        2. There are no gaps.
        3. You don't have panel data.
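When those conditions do not hold, the date-based -rangestat- call from #2 can still be used. The sketch below runs it on a small invented panel. The identifier id and all the data values are hypothetical, one month is deliberately left out for id 1, and -rangestat- is assumed to be installed from SSC.

```stata
* Sketch only: hypothetical two-panel data; "id" is an invented variable name
* Assumes -rangestat- is installed (ssc install rangestat)
clear
input byte id int year byte(month event)
1 1931 1 1
1 1931 2 1
1 1931 3 4
1 1931 5 2
2 1931 1 0
2 1931 2 3
2 1931 3 0
end

gen mdate = ym(year, month)
format mdate %tm

* by() keeps each window inside its own panel; interval() works on the
* date values, so the absent month (1931m4 for id 1) does not shift the window
rangestat (sum) wanted = event, interval(mdate -6 -1) by(id)

list id year month event wanted, sepby(id) noobs
```

Here the 1931m5 observation for id 1 gets 1 + 1 + 4 = 6 even though 1931m4 is absent, and the counts for id 2 do not pick up any events from id 1.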
