  • Generating growth rates over time by age with underlying interdependence between ages

    I have a dataset with the number of events (numerator) and population (denominator) by age and year. My aim is to estimate a growth rate over the years for each age, together with a measure of its uncertainty. The Stata code I'm using is:

    Code:
    g lnrate=ln(events/population)
    reg lnrate i.age i.age#c.year [aw=population],base
    This gives the following output (showing only ages 30-34 of the i.age#c.year terms, for brevity):

    Code:
    age      Coef.   Std. Err.       t    P>|t|   [95% Conf. Interval]
     30    -0.0005      0.0126   -0.04    0.969      -0.0252    0.0242
     31    -0.0006      0.0126   -0.05    0.959      -0.0255    0.0242
     32     0.0117      0.0126    0.93    0.353      -0.0130    0.0365
     33     0.0031      0.0126    0.24    0.807      -0.0217    0.0278
     34    -0.0113      0.0127   -0.89    0.373      -0.0363    0.0136
    The problem is that, whilst technically correct, this doesn't take account of the fact that in the underlying distribution the slope over years for age=n is likely to be strongly associated with the slope for age=n+1, and so on. As the regression doesn't know this, the standard errors are very wide and most t-values are below 2, even though the total growth in the rate over time is strongly significant.

    Can anyone advise how I can improve on this approach, please? I don't want to impose a functional form on the relationship between the rate and age, and using things like grouped ages or splines seems like only a partial solution.

  • #2
    You should probably use poisson with exposure(population) and robust or clustered standard errors.

    If events = 0 (or if it could be, theoretically), then your DV, ln(events/population), is undefined and you're losing an important part of your data.
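
    A minimal sketch of that suggestion, assuming the variable names from the first post (events, population, age, year). Using vce(robust) rather than clustering is an assumption made here for illustration, not something settled in this thread:

    Code:
    * Model the counts directly; exposure() enters ln(population) with its coefficient fixed at 1
    poisson events i.age i.age#c.year, exposure(population) vce(robust)
    * Each i.age#c.year coefficient is that age's yearly change in the log rate

    Cells with zero events stay in the estimation sample here, whereas ln(events/population) would be missing for them.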

    You could test the model down: I suspect many of the growth rates are identical, and all of those will be given a huge SE (low t).

    Or include year as a regressor; the age-by-year interaction coefficients are then direct tests of differences from the base group.
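
    As a sketch of this last option, again assuming the variable names from the first post and a Poisson specification (the same idea carries over to the original regress command). Age 31 in the lincom line is only an illustrative, non-base level:

    Code:
    * Main effect of year = growth rate in the base age group;
    * the i.age#c.year terms are then deviations from that base slope
    poisson events i.age c.year i.age#c.year, exposure(population) vce(robust)
    * Joint test that all ages share the same growth rate
    testparm i.age#c.year
    * Growth rate for one particular age, with its standard error
    lincom year + 31.age#c.year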


    • #3
      Thanks. Using poisson would be good practice, even though I don't have any cases where the rate is zero.

      What did you mean by "you could test the model down"?

      And good idea about including year, so that the interaction just captures the deviation.

      Thanks


      • #4
        When the coefficients aren't significantly different from each other, you can group them. That cuts down on the number of coefficients. It's a bit of a pre-test, so not ideal, but you do see it done.
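
        One way this could look in practice, assuming the variable names from the first post. The pair of ages tested and the grouping variable agegrp are hypothetical, chosen only to illustrate the mechanics:

        Code:
        * Separate slope for every age
        poisson events i.age i.age#c.year, exposure(population) vce(robust)
        * Wald test that two neighbouring ages share a slope
        test 30.age#c.year = 31.age#c.year
        * If not rejected, pool them: age-specific levels, one shared slope
        generate agegrp = age
        replace agegrp = 30 if age == 31
        poisson events i.age i.agegrp#c.year, exposure(population) vce(robust)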