  • Dropping observations after first occurrence of an event

    I need help with the following problem:
    I want to drop, within each id, all observations that come after the first time an event occurs, regardless of whether the event occurs again.

    The following table illustrates the problem:
    id year event
    1 2007 0
    1 2008 0
    1 2009 1
    1 2010 1
    1 2011 1
    1 2012 1
    2 2007 0
    2 2008 0
    2 2009 1
    2 2010 0
    2 2011 0
    2 2012 1
    2 2013 0
    3 2007 0
    3 2008 1
    3 2009 0
    3 2010 0
    3 2011 0
    3 2012 0
    3 2013 1
    What I want to do is to drop the following observations: for id 1, drop observations for years 2010, 2011 and 2012, as they come after the first event occurrence, which takes place in 2009. For the same reason, for id 2, drop observations for years 2010, 2011, 2012 and 2013, and for id 3, drop observations for years 2009, 2010, 2011, 2012 and 2013.

    What I tried was to compute a running sum of the events and drop observations where the sum is greater than 1:

    by id (year), sort: gen byte sum = sum(event)
    drop if sum>1

    This works for id 1, but not for ids 2 and 3. For those ids the event does not occur again in the year right after its first occurrence, so the running sum stays at 1 for several later observations and they are not dropped.

    I can't seem to find a way to solve this. Any help would be much much appreciated!

  • #2
    Code:
    clear 
    input id    year    event
    1    2007    0
    1    2008    0
    1    2009    1
    1    2010    1
    1    2011    1
    1    2012    1
    2    2007    0
    2    2008    0
    2    2009    1
    2    2010    0
    2    2011    0
    2    2012    1
    2    2013    0
    3    2007    0
    3    2008    1
    3    2009    0
    3    2010    0
    3    2011    0
    3    2012    0
    3    2013    1
    end 
    
    * Within each id (sorted by year), keep observations with no event in any earlier year
    by id (year), sort: keep if sum(event) <= 1 & sum(event[_n-1]) == 0 
    
    list, sepby(id) 
    
         +-------------------+ 
         | id   year   event |
         |-------------------|
      1. |  1   2007       0 |
      2. |  1   2008       0 |
      3. |  1   2009       1 |
         |-------------------|
      4. |  2   2007       0 |
      5. |  2   2008       0 |
      6. |  2   2009       1 |
         |-------------------|
      7. |  3   2007       0 |
      8. |  3   2008       1 |
         +-------------------+
    Another way to do it:


    Code:
     
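    * first = year of the first event within each id (missing if the event never occurs)
    egen first = min(cond(event == 1, year, .)), by(id)
    * year <= . is true in Stata, so ids with no event are kept in full
    keep if year <= first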
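
    If you would rather stay close to your running-sum attempt, here is a minimal sketch of a third variant. It assumes event is coded 0/1 with no missing values and that id and year uniquely identify observations; nevents is just a helper name made up for this example.

    ```stata
    * Running count of events within each id, in year order
    bysort id (year): gen int nevents = sum(event)

    * Keep the years before any event (count still 0), plus the
    * first event year itself (count is exactly 1 and event == 1)
    keep if nevents == 0 | (nevents == 1 & event == 1)

    drop nevents
    ```

    The extra condition event == 1 is what separates the first event year from the later years in which the running sum is still 1, which is where the drop if sum>1 approach went wrong.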



    • #3
      Thank you very much! This solved it!
