Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • annual incidence rates using survival data

    Hi stata forum,

    I have a wide dataset. For each individual (all of whom have end-stage renal disease), I have hospitalization data available (event = yes/no, date of event).

    I am interested in the rate of hospitalizations for amputations in this population over time (2000-2014), i.e. I want to compare incidence rates across years, adjusted/standardized for age.

    I want to set the data as survival data.


    stset dox, fail(amputation==1) origin(born) entry(first_se) scale(365.25) id(id)


    where ‘dox’ (date of exit) is either end of follow-up time (31dec2014), date of first amputation, or date of death, whichever occurred first.

    And ‘entry’ is the date they first entered my registry database (of people with known end-stage renal disease).

    I am not sure what is the best approach moving forward to get incidence rates, per year, standardized by age to a reference population (using direct standardization method).

    I want my end result to be incidence rates with 95%CI, reported per year. I don’t want to report Hazard ratios, or incidence rate ratios, or coefficients.

    I am a bit lost as to which stata command to use as stplit, poisson, sptime all seem to not be able to do exactly what I want.

    Any advice would be much appreciated.
    Jess

  • #2
    The command you want is -dstdize-, but it does not use data that has been -stset-, and you may have to do some data management to get your data into the form you need for this command. See -help dstdie- and look at the worked examples in the PDF documentation.

    Comment


    • #3
      Hi clyde,

      Thanks for your quick response. Is it possible to do something like this (instead of dstdize)?


      stset dox1, fail(amputation==1) origin(born) entry(first_se) scale(365.25) id(usrds_id)
      gen _y = _t - _t0
      stsplit _year, after(time=d(1/1/2000)) at(0(1)14) trim
      replace _year=2000 + _year
      poisson _d i._year _t, exp(_y)
      margins _year, predict(ir)

      Comment


      • #4
        Yes, that will give you regression-adjusted incidence rates, which I think are probably the most commonly used approach, and the approach I generally prefer and use in my own work.

        But you said in #1
        I am not sure what is the best approach moving forward to get incidence rates, per year, standardized by age to a reference population (using direct standardization method).
        and these results will not be that (unless the reference population age distribution is precisely the age distribution in your sample data).

        Comment


        • #5
          my apologies clyde. age-adjusted incidence rates is ok (rather than direct standardization).
          I am now having some difficulty getting the margins command to work.

          I have used the following commands.

          stset dox1, fail(lea1==1) origin(born) entry (entry) scale(365.25) id(usrds_id)
          stsplit _year, after(time=d(1/1/2000)) at(0(1)14) trim
          replace _year=2000 + _year
          gen _y=_t-_t0

          poisson lea1 i._year _t0, exp(_y)

          What I want:
          Incidence rates of amputations, per 1,000 person-years, per calendar year, stratified by diabetes status(dm), with 95%CIs
          Ideally, I want a table that has the following data in it:

            Diabetes No diabetes
          Year N (events) Person-years (PY) Rate (95%CI) (per 1,000 PY) N (events) Person-years (PY) Rate (95%CI) (per 1,000 PY)
          2000            
          2001            
          2002            

          How do i achieve this using the poisson and margin codes?

          Two additional questions:
          1. Do i estimate _y AFTER i split the data? or before?
          2. My dataset had approx. 2 million observations...which then increased to 10 million after i aplit by year. This then made themargin code very slow to run. Any solutions for this?


          Many thanks in advance
          Jess



          Comment


          • #6
            There is no single command that will create a table like that in one step. But

            Code:
            margins year, predict(n)
            will get you the N columns

            Code:
            margins year, predict(ir)
            will get you the incidence rates.

            Person-years will not come out of margins. You'll just have to add those up. Something like:

            Code:
            collapse (count) person_id, by(year)
            As for disaggregating by diabetes, the simplest way, if you are not going to be testing contrasts between diabetics and non-diabetics, is to just do two separate regressions on different samples. Alternatively, you can do a single regression on the whole sample, with a diabetes # year interaction term:

            Code:
            poisson lea1 i._year##i.diabetes _t0, exp(_y)
            margins year#diabetes, predict(n)
            margins year#diabetes, predict(ir)
            collapse (count) person_id by(diabetes year)
            Calculating _y after splitting the data makes sense.

            I don't know any reasonable way to speed up -margins-. It's a computationally intensive process. You can save a little time by, in effect, writing your own -margins- program, but that's a real pain and the time you spend coding it and debugging it will probably outweigh the savings in run time (which will be only modest anyway).

            Comment


            • #7
              Withdrawn to check calculation
              Last edited by Steve Samuels; 15 Mar 2018, 11:36.
              Steve Samuels
              Statistical Consulting
              [email protected]

              Stata 14.2

              Comment


              • #8
                To finish this topic, here is a way of calculating person-years, illustrated with the breast cancer data set of 686 women. Only a fraction of a year is contributed if a woman exits during that year. Person-years are totaled for those women who did and did not receive hormone therapy. This version works for single entry data. I think that I've got it right this time.

                Code:
                sysuse brcancer, clear
                gen ty = rectime/365.25 /* convert time to year scale */
                rename censrec event
                
                stset ty, failure(event) id(id)
                
                /* stsplit to one year intervals */
                stsplit tyear , every(1)
                replace tyear=tyear+1      //start numbering tyear from 1
                label var tyear "Whole Year of followup after split"
                
                replace event=0 if event==.
                sort id ty
                
                /* calculate person years in each year of followup */
                
                gen pyears = 1 if  ty==tyear
                replace pyears = ty -(tyear-1) if tyear>ty
                
                table  hormon, c(sum pyears) row
                with result
                Code:
                -----------------------
                hormonal  |
                therapy   | sum(pyears)
                ----------+------------
                        0 |    1276.608
                        1 |     835.3703
                          |
                    Total |    2111.978
                -----------------------
                Last edited by Steve Samuels; 15 Mar 2018, 12:54.
                Steve Samuels
                Statistical Consulting
                [email protected]

                Stata 14.2

                Comment


                • #9
                  The Manual entry for stsplit shows simpler code for getting person years:
                  Code:
                  stsplit tyear, every(one)
                  gen pyears= _t-_t0
                  stptime also calculates person-time and incidence rates, and can estimate SMRs with a reference population and jackknifed standard errors.

                  Steve Samuels
                  Statistical Consulting
                  [email protected]

                  Stata 14.2

                  Comment

                  Working...
                  X