Hi all,
I am currently working on relative survival and using Paul Dickman methods. I ran stset and ran strs after that. Everything looks fine and I got the results. Now I am trying to dig in how I got to that results by doing step by step instead running "strs" i.e. work out all the varlist in strs such as n,_d, start, end , w , p , n_prime etc... However, I don't get the same as the results as I ran the strs. I compared what I got the the saved file when ran the strs, so far I got some varibles match to the saved file such as n,_d,start,end. I don't get the same value of p. I guess there must me something wrong with the way I calcuate the w (censored variable).
Here are all the code I produced the variables
local period "0/1"
local lastyearfollowup = 2011
local diagnosis_endyear = 2011
local diagnosis_startyear = 1982
stset dateofdeath, origin(dateofdiagnosis) enter(time mdy(1,1,2007)) exit(time mdy(12,31,2011)) fail(event==1) id(id) scale(365.25)
stsplit start,at("`period'") trim
keep if _st == 1
gen attage = int( age + start )
gen attyear = int( yeardiagnosis + start )
gen _year = attyear
gen _age = attage
sort _year sex _age
merge _year sex _age using QldLifeTable, nokeep nolabel keep(prob)
keep if _merge==3 /* Keep only if observations exists in both files */
drop _merge
gen end = .
forval starttime = `period' {
local endtime = `starttime' + `periodinterval'
di "start = `starttime' : end = `endtime'"
replace end = `endtime' if round(float(start),0.0001) == round(float(`starttime'),0.0001)
}
gen d = _d /* Indicator for dead in the interval */
/* Generate a censoring indicator (not the same as 'not dead') */
bysort id (start) : gen w = ( d[_N] == 0 & _n == _N & (_t-_t0) != ( end-start ) )
gen y = (_t - _t0) /* Person-time at risk in the interval */
gen d_star = -ln(prob)* y // Expected deaths (using exact times)
gen p_star = prob^(end-start) // interval-specific expected survival
bysort id (_t0) : replace p_star = prob^(end-_t0) if _n==_N //last row of each id
gen nu = ( d - d_star)/y // Estimated excess mortality rate, (d-d_star)/y
collapse (sum) d w y d_star (count) n = d (mean) p_star end, by(`bygroup' start)
gen n_prime = n - w/2 // number at risk
gen ns = n_prime - d // number surviving at end of interval
gen nu = ( d - d_star )/y // Estimated excess mortality rate, (d-d_star)/y
local w w
gen p = 1 - (d/n_prime)
/* Cumulative observed survival */
/* Updated v1.2.5 (16may2007) to correct bug when p=0 (thanks to John Condon and Arun Pokhrel) */
// p = 0 if everybody dies in the interval, therefore interval-specific survival = 0%, so ln(0) = ERROR
by sex : gen cp = exp( sum( ln(p) ) ) if p ~= 0
replace cp = 0 if p == 0
/* Cumulative expected survival */
by `bygroup' : gen cp_e2 = exp(sum(ln(p_star)))
/* Cumulative relative survival */
gen cr_e2=cp/cp_e2
gen RelSurv_`survivalperiod'_yr = cr_e2
Thanks for the help.
I am currently working on relative survival and using Paul Dickman methods. I ran stset and ran strs after that. Everything looks fine and I got the results. Now I am trying to dig in how I got to that results by doing step by step instead running "strs" i.e. work out all the varlist in strs such as n,_d, start, end , w , p , n_prime etc... However, I don't get the same as the results as I ran the strs. I compared what I got the the saved file when ran the strs, so far I got some varibles match to the saved file such as n,_d,start,end. I don't get the same value of p. I guess there must me something wrong with the way I calcuate the w (censored variable).
Here are all the code I produced the variables
local period "0/1"
local lastyearfollowup = 2011
local diagnosis_endyear = 2011
local diagnosis_startyear = 1982
stset dateofdeath, origin(dateofdiagnosis) enter(time mdy(1,1,2007)) exit(time mdy(12,31,2011)) fail(event==1) id(id) scale(365.25)
stsplit start,at("`period'") trim
keep if _st == 1
gen attage = int( age + start )
gen attyear = int( yeardiagnosis + start )
gen _year = attyear
gen _age = attage
sort _year sex _age
merge _year sex _age using QldLifeTable, nokeep nolabel keep(prob)
keep if _merge==3 /* Keep only if observations exists in both files */
drop _merge
gen end = .
forval starttime = `period' {
local endtime = `starttime' + `periodinterval'
di "start = `starttime' : end = `endtime'"
replace end = `endtime' if round(float(start),0.0001) == round(float(`starttime'),0.0001)
}
gen d = _d /* Indicator for dead in the interval */
/* Generate a censoring indicator (not the same as 'not dead') */
bysort id (start) : gen w = ( d[_N] == 0 & _n == _N & (_t-_t0) != ( end-start ) )
gen y = (_t - _t0) /* Person-time at risk in the interval */
gen d_star = -ln(prob)* y // Expected deaths (using exact times)
gen p_star = prob^(end-start) // interval-specific expected survival
bysort id (_t0) : replace p_star = prob^(end-_t0) if _n==_N //last row of each id
gen nu = ( d - d_star)/y // Estimated excess mortality rate, (d-d_star)/y
collapse (sum) d w y d_star (count) n = d (mean) p_star end, by(`bygroup' start)
gen n_prime = n - w/2 // number at risk
gen ns = n_prime - d // number surviving at end of interval
gen nu = ( d - d_star )/y // Estimated excess mortality rate, (d-d_star)/y
local w w
gen p = 1 - (d/n_prime)
/* Cumulative observed survival */
/* Updated v1.2.5 (16may2007) to correct bug when p=0 (thanks to John Condon and Arun Pokhrel) */
// p = 0 if everybody dies in the interval, therefore interval-specific survival = 0%, so ln(0) = ERROR
by sex : gen cp = exp( sum( ln(p) ) ) if p ~= 0
replace cp = 0 if p == 0
/* Cumulative expected survival */
by `bygroup' : gen cp_e2 = exp(sum(ln(p_star)))
/* Cumulative relative survival */
gen cr_e2=cp/cp_e2
gen RelSurv_`survivalperiod'_yr = cr_e2
This results obtained by running strs | |||||||||||||||||||||||||
start | end | n | d | d_star | y | p | se_p | lo_p | hi_p | p_star | r | se_r | lo_r | hi_r | cp | se_cp | lo_cp | hi_cp | cp_e2 | cr_e2 | se_cr_e2 | lo_cr_e2 | hi_cr_e2 | yeardiagnosis | nu |
0 | 1 | 18243 | 1521 | 177.5 | 8841.8 | 0.842 | 0.0037 | 0.8345 | 0.8491 | 0.98 | 0.8591 | 0.0038 | 0.8515 | 0.8664 | 0.842 | 0.0037 | 0.8345 | 0.8491 | 0.98 | 0.8591 | 0.0038 | 0.8515 | 0.8664 | 2006 | 0.151954 |
0 | 1 | 21375 | 4105 | 387.3 | 18806.9 | 0.8039 | 0.0027 | 0.7985 | 0.8092 | 0.9777 | 0.8223 | 0.0028 | 0.8167 | 0.8277 | 0.8039 | 0.0027 | 0.7985 | 0.8092 | 0.9777 | 0.8223 | 0.0028 | 0.8167 | 0.8277 | 2007 | 0.197677 |
0 | 1 | 22433 | 4210 | 410.9 | 19799.8 | 0.8085 | 0.0026 | 0.8032 | 0.8136 | 0.9775 | 0.8271 | 0.0027 | 0.8217 | 0.8323 | 0.8085 | 0.0026 | 0.8032 | 0.8136 | 0.9775 | 0.8271 | 0.0027 | 0.8217 | 0.8323 | 2008 | 0.191876 |
0 | 1 | 23036 | 4258 | 409.4 | 20368.4 | 0.8114 | 0.0026 | 0.8062 | 0.8164 | 0.9782 | 0.8295 | 0.0027 | 0.8242 | 0.8346 | 0.8114 | 0.0026 | 0.8062 | 0.8164 | 0.9782 | 0.8295 | 0.0027 | 0.8242 | 0.8346 | 2009 | 0.18895 |
0 | 1 | 23256 | 4311 | 411.3 | 20579.3 | 0.811 | 0.0026 | 0.8059 | 0.816 | 0.9782 | 0.8291 | 0.0026 | 0.8238 | 0.8342 | 0.811 | 0.0026 | 0.8059 | 0.816 | 0.9782 | 0.8291 | 0.0026 | 0.8238 | 0.8342 | 2010 | 0.189498 |
0 | 1 | 23551 | 2575 | 219.6 | 10623.3 | 0.7847 | 0.0037 | 0.7773 | 0.792 | 0.9781 | 0.8023 | 0.0038 | 0.7947 | 0.8097 | 0.7847 | 0.0037 | 0.7773 | 0.792 | 0.9781 | 0.8023 | 0.0038 | 0.7947 | 0.8097 | 2011 | 0.22172 |
This results obtained by work out each varlist (step by step) | |||||||||||||||||||||||||
start | d | w | y | d_star | n | p_star | end | n_prime | ns | nu | p | cp | cp_e2 | cr_e2 | RelSurv_1_yr | yeardiagnosis | |||||||||
0 | 1521 | 16722 | 8841.756 | 177.4596 | 18243 | 0.989358 | 1 | 9882 | 8361 | 0.151954 | 0.846084 | 0.846084 | 0.989358 | 0.855185 | 0.855185 | 2006 | |||||||||
0 | 4105 | 0 | 18806.93 | 387.3008 | 21375 | 0.977664 | 1 | 21375 | 17270 | 0.197677 | 0.807953 | 0.807953 | 0.977664 | 0.826412 | 0.826412 | 2007 | |||||||||
0 | 4210 | 0 | 19799.8 | 410.8893 | 22433 | 0.977459 | 1 | 22433 | 18223 | 0.191876 | 0.81233 | 0.81233 | 0.977459 | 0.831063 | 0.831063 | 2008 | |||||||||
0 | 4258 | 0 | 20368.43 | 409.3769 | 23036 | 0.978157 | 1 | 23036 | 18778 | 0.18895 | 0.815159 | 0.815159 | 0.978157 | 0.833362 | 0.833362 | 2009 | |||||||||
0 | 4311 | 27 | 20579.27 | 411.2643 | 23256 | 0.97819 | 1 | 23242.5 | 18931.5 | 0.189498 | 0.814521 | 0.814521 | 0.97819 | 0.832682 | 0.832682 | 2010 | |||||||||
0 | 2575 | 20976 | 10623.32 | 219.5979 | 23551 | 0.978126 | 1 | 13063 | 10488 | 0.22172 | 0.802878 | 0.802878 | 0.978126 | 0.820833 | 0.820833 | 2011 | |||||||||
Thanks for the help.
Comment