
  • how to keep consecutive observations

    Hi there, I am new to Stata and hope to get some advice from you. My dataset looks like the one below. ID 1 has three consecutive observations with epi=1, and so does ID 2. ID 3 has four consecutive observations with epi=1. ID 4 has four observations with epi=1 in total, but they are interrupted by 0s, i.e. the four 1s do not appear consecutively. I want to keep IDs 1, 2 and 3 because each has at least three consecutive observations with epi=1. Could you please help me with this? What Stata code would do it? NB: I need to keep all observations for the retained IDs. For example, ID 1 has two observations with epi=0 and three with epi=1, and I need to keep all five observations for ID 1, not only the three with epi=1.

    ID epi
    1 0
    1 0
    1 1
    1 1
    1 1
    2 0
    2 0
    2 0
    2 0
    2 0
    2 0
    2 0
    2 1
    2 1
    2 1
    3 0
    3 0
    3 1
    3 1
    3 1
    3 1
    4 0
    4 1
    4 1
    4 0
    4 0
    4 0
    4 1
    4 1

    The cleaned dataset I expect looks like this:

    ID epi
    1 0
    1 0
    1 1
    1 1
    1 1
    2 0
    2 0
    2 0
    2 0
    2 0
    2 0
    2 0
    2 1
    2 1
    2 1
    3 0
    3 0
    3 1
    3 1
    3 1
    3 1

    Thank you very much!



  • #2
    The following code will give the results you indicated for your example data:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id epi)
    1 0
    1 0
    1 1
    1 1
    1 1
    2 0
    2 0
    2 0
    2 0
    2 0
    2 0
    2 0
    2 1
    2 1
    2 1
    3 0
    3 0
    3 1
    3 1
    3 1
    3 1
    4 0
    4 1
    4 1
    4 0
    4 0
    4 0
    4 1
    4 1
    end
    
    //    MARK PRE-EXISTING SORT ORDER
    gen `c(obs_t)' obs_no = _n
    
    //    IDENTIFY RUNS OF EPI
    by id (obs_no), sort: gen run_num = sum(epi != epi[_n-1])
    
    //    MARK LENGTH OF EACH RUN
    by id run_num (obs_no), sort: gen run_length = _N
    
    //    IDENTIFY ID'S TO BE RETAINED
    by id (run_num obs_no): egen keeper = max(run_length >= 3 & epi == 1)
    keep if keeper
    It may or may not work properly in your full data set, because your problem is incompletely specified. You do not say what to do with an id whose observations include both a run of 3 or more consecutive observations with epi = 1 and other, shorter runs of epi = 1. The above code will retain such ids: if the id has any run of 3 or more observations with epi = 1, the id is kept regardless of what else it may have.

    If that's not what you want, please post back showing an example where the code does not perform as needed, and explain the general rule of how to handle the kind of situation I mentioned in the preceding paragraph.
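
    To illustrate what such a rule might look like: suppose an id should be kept only when every one of its runs of epi = 1 has at least 3 observations. That rule is purely an assumption for illustration and is not stated in the question. A minimal sketch, assuming the variables obs_no, run_num and run_length created by the code above already exist, could replace the last two commands. The variable name shortest_run is made up for this example.

    ```stata
    //    HYPOTHETICAL STRICTER RULE: EVERY RUN OF EPI == 1 MUST HAVE LENGTH >= 3
    //    shortest_run IS AN ILLUSTRATIVE NAME; IT IS MISSING WHEN AN ID HAS NO EPI == 1
    by id (run_num obs_no), sort: egen shortest_run = min(cond(epi == 1, run_length, .))

    //    MISSING COUNTS AS LARGER THAN ANY NUMBER IN STATA, SO EXCLUDE IT EXPLICITLY
    keep if shortest_run >= 3 & !missing(shortest_run)
    ```

    In the example data this retains the same ids as before (1, 2 and 3), because the only id with short runs of epi = 1 is id 4. The two rules would differ only for an id that has both a run of 3 or more and a shorter run.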

    In the future, when showing data examples, please use the -dataex- command to do so, as I have done here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Finally, it is the norm in this community that we use our real given names and surnames as our usernames, to promote collegiality and professionalism. The Forum software will not permit you to edit your username, but you can click on Contact Us (lower right hand corner of this page) and message the system administrator requesting he make the change for you. Thank you.



    • #3
      Dear ProfSchechter,
      I really appreciate your detailed reply. That works! Thank you for your kind reminder. I have already contacted the admin staff to change my username to my real name. Apologies for the inconvenience.
      Qian
