Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Managing overlapping dates

    Dear all,

    I am working a study on sick leave in relation to an interested outcome. However, I am struggling in calculation of sick leaves due to overlapping dates. Some individuals have overlapping sick leaves for different diagnosis (received from different registers). For example, id=1 and id= 2 have overlapping leaves for different diagnosis. I would like to calculate total sick leaves regardless of diagnosis (total_days) and overlapping days (overlap). I tried manually for total_days and overlap in the example dataset but it will not be ok for the large dataset with >100,000 observations. So I would like to request your help. The example data set are as follow;

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte id int(from to) byte diagnosis int(total_days overlap)
    1 14976 15705 1 6691 2915
    1 16437 17531 1 6691 2915
    1 17898 18627 1 6691 2915
    1 19359 21184 1 6691 2915
    1 21550 22400 1 6691 2915
    1 15341 17531 2 6691 2915
    1 18993 20088 2 6691 2915
    1 20820 21914 2 6691 2915
    2 14976 18262 1 6327 1459
    2 20089 21914 1 6327 1459
    2 17532 18992 2 6327 1459
    2 21185 22400 2 6327 1459
    3 14976 16436 1 5841    0
    3 18263 21914 1 5841    0
    3 17167 17897 2 5841    0
    end
    format %tdnn/dd/CCYY from
    format %tdnn/dd/CCYY to



  • #2
    See https://www.stata-journal.com/articl...article=dm0068

    dm0068 is revealed as an otherwise unpredictable search term for discussions here.

    Comment


    • #3
      Thanks Nick. After reading your article (dm0068) and Clyde's suggestions from http://www.statalist.org/forums/foru...es-of-all-rows , the following codes help to solve the issue. I am looking forward the comments and other suggested solutions.

      Code:
      gen long obs=_n
      expand 2
      by obs, sort: gen time=cond(_n==1, from, to)
      by obs, sort: gen fromto=cond(_n==1, 1, -1)
      by id (time),sort: gen present=sum(fromto)
      list, separator(0)
      
      by id (time),sort: gen spell=sum((present>0) & (present[_n-1]==0|_n==1))
      
      egen flag=tag(id spell)
      list, separator(0)
      *calculating the total leaves
      by id spell (time), sort: gen duration=time[_N]-time[1]
      by id, sort:egen total_days2 = total(cond(flag,duration,.))
      list, separator(0)
      
      *calculating the overlapping leaves
      by id spell (time),sort: egen overlap_in_spell=total(cond(present>1,time[_n+1]-time,.))
      by id, sort: egen overlap2=total(cond(flag, overlap_in_spell,.))
      order overlap_in_spell, after(duration)
      list, separator(0)

      Comment

      Working...
      X