  • Incidence across calendar years

    I have a dataset in which I want to calculate incidence by age group across calendar years.
    The commands I used were:

    stset censor, failure(event==1) exit(end) enter(start) id(id) scale(365.25) origin(dob)
    stsplit agegroup, at(0(10)80)

    preserve
    keep if agegroup==0
    stset censor, failure(event==1) exit(end) enter(start) id(id) scale(365.25)
    stsplit calyear, at(40(1)60)
    gen calendar_yr= calyear+1960
    strate calendar_yr, per(100000)
    restore


    preserve
    keep if agegroup==10
    stset censor, failure(event==1) exit(end) enter(start) id(id) scale(365.25)
    stsplit calyear, at(40(1)60)
    gen calendar_yr= calyear+1960
    strate calendar_yr, per(100000)
    restore


    I repeated this for the rest of the age groups.

    This code inflated the person-years and gave me the wrong incidence.

    How do I resolve this? I have included an example dataset below.


    id  event  censor      start       end         dob
    12  1      12/12/2011  01/01/2010  01/01/2020  07/08/1993
    13  1      12/12/2014  07/02/2011  02/02/2019  01/03/2005
    14  0      12/04/2011  06/03/2010  03/03/2018  02/07/2002
    15  1      12/08/2012  05/04/2012  06/04/2017  03/03/1990
    16  1      12/12/2011  04/05/2013  07/05/2016  04/03/2003
    17  0      12/09/2013  03/06/2014  08/06/2016  05/04/1996
    18  1      12/12/2011  01/07/2015  09/07/2019  06/05/2002
    19  0      12/12/2011  08/01/2010  11/08/2020  07/06/2001

    Thank you very much

  • #2
    If you present the example data so it can be easily read into Stata (e.g., using -dataex-) then I would be willing to have a look at it.
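
    As a minimal sketch, assuming the six variables are named as in the listing in the first post and that the first eight records are representative, the example could be produced like this:

    ```stata
    * dataex is included in Stata 15.1 and later; on older versions: ssc install dataex
    dataex id event censor start end dob in 1/8
    ```

    Pasting the output of dataex between code delimiters lets others recreate the data exactly, including storage types and date formats.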

    Also, can you explain the content of the variables censor and end?

    Is it possible you've misunderstood the syntax of stset? Have a look at the example with the diet data in the help (Example: Single-record-per-subject with censoring). If you gave me these data without any explanation then, based on the variable names, I would use

    Code:
    stset end, failure(event==1) enter(start) id(id) scale(365.25) origin(dob)

    Consider your first stset:

    Code:
    stset censor, failure(event==1) exit(end) enter(start) id(id) scale(365.25) origin(dob)
    Observations contribute person-time between start and censor. For some observations in the example data, start is after censor so those observations will not contribute to the analysis.

    Since end > censor for all observations in the example data, the exit(end) option will have no impact.
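
    Both points can be checked directly. This is only a sketch, and it assumes start, censor and end are held as numeric Stata daily dates rather than strings:

    ```stata
    * Records whose entry date falls after the stset time variable
    count if start > censor
    list id start censor end if start > censor, noobs

    * Records where exit(end) would actually cut follow-up short
    count if end < censor
    ```

    If the first count is above zero, those records are flagged by stset and contribute no person-time. If the second count is zero, exit(end) changes nothing.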

    Also, you should be able to do this (splitting on multiple timescales) without subsetting on age group (i.e., there's no need to preserve/restore).
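
    A rough sketch of what that could look like follows. It assumes that end is the date follow-up stops, that all dates are numeric Stata daily dates, and that calendar years 2000 to 2020 are of interest (inferred from the at(40(1)60) and +1960 in the original code). The name _year is just a placeholder.

    ```stata
    * Analysis time is attained age in years (origin is date of birth)
    stset end, failure(event==1) enter(start) id(id) scale(365.25) origin(dob)

    * First timescale: attained age in 10-year bands
    stsplit agegroup, at(0(10)80)

    * Second timescale: calendar time, measured in years since 1 January 1900
    stsplit _year, after(time = mdy(1,1,1900)) at(100(1)120)
    generate calendar_yr = 1900 + _year

    * Rates per 100,000 person-years for each age group and calendar year
    strate agegroup calendar_yr, per(100000)
    ```

    Because the data are stset once and never re-stset without origin(dob), the age and calendar-year splits share the same person-time. Note that with scale(365.25) the calendar-year cut points fall only approximately on 1 January.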

    You're much more likely to get a response if you provide example data that can be easily read into Stata.
