I have a dataset with the number of events (numerator) and population (denominator) by age and year. My aim is to estimate a growth rate over the years for each age, together with a measure of its uncertainty. The Stata code I'm using is:
This gives the following output (showing just ages 30-34 of i.age#c.year for brevity):
The problem with this is that, whilst literally correct, it doesn't take account of the fact that in the underlying distribution the slopes over years for age=n and age=n+1, etc., are likely to be strongly associated. As the regression doesn't know this, the standard errors are very wide and most t values are below 2 in absolute value, even though the total growth in the rate over time is strongly significant.
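Written out, the regression in the code appears to fit the model below. This is a sketch of my reading of `i.age i.age#c.year` with `[aw=population]`; the symbols $E_{a,t}$ (events), $P_{a,t}$ (population), $\alpha_a$ and $\beta_a$ are notation introduced here for illustration and do not appear in the Stata output.

```latex
\ln\!\left(\frac{E_{a,t}}{P_{a,t}}\right) = \alpha_a + \beta_a\, t + \varepsilon_{a,t},
\qquad
(\hat{\alpha}, \hat{\beta}) = \arg\min_{\alpha,\beta}
\sum_{a,t} P_{a,t}\left[\ln\!\left(\frac{E_{a,t}}{P_{a,t}}\right) - \alpha_a - \beta_a\, t\right]^2
```

Here $a$ indexes age and $t$ is the year. Each $\beta_a$ is the annual change in the log rate for age $a$, so approximately the proportional growth per year. Because every age has its own intercept and slope, the point estimate of $\beta_a$ uses only that age's observations, and nothing in the model ties $\beta_a$ to $\beta_{a+1}$.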
Can anyone advise how I can improve on this approach please? I don't want to impose a functional form on the relationship between the rate and age, and using things like grouped ages or splines only seem like partial solutions.
Code:
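* log of the event rate
g lnrate=ln(events/population)
* age-specific intercepts and age-specific linear slopes in year, weighted by population
reg lnrate i.age i.age#c.year [aw=population],base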
Code:
 age     Coef.   Std. Err.       t     P>t    [95% Conf. Interval]
  30   -0.0005     0.0126    -0.04   0.969    -0.0252     0.0242
  31   -0.0006     0.0126    -0.05   0.959    -0.0255     0.0242
  32    0.0117     0.0126     0.93   0.353    -0.0130     0.0365
  33    0.0031     0.0126     0.24   0.807    -0.0217     0.0278
  34   -0.0113     0.0127    -0.89   0.373    -0.0363     0.0136