I am analyzing a large cancer registry (n=5000), focussing on cervical cancer. Median follow-up is about 9 years. I am using Stata 11.
I performed a multivariate Cox regression with overall survival as the endpoint. Variables include tumor size, age, ethnicity, histological grade, surgery type, and use of radiotherapy.
When testing the proportional-hazards assumption using the Therneau and Grambsch method (with estat phtest) and log-log plots (with stphplot), I was surprised to discover that almost all the variables violated the assumption.
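For reference, the checks described above can be sketched roughly as follows. This is only an illustration: the variable names (tumorsize, age, ethnicity, grade, surgery, radiotherapy) are hypothetical stand-ins for the registry's actual variables, and the data are assumed to have been declared with stset already.

```stata
* Sketch only: variable names are hypothetical; data assumed to be stset already
stcox tumorsize age i.ethnicity i.grade i.surgery radiotherapy

* Test based on scaled Schoenfeld residuals:
* "detail" reports one test per covariate plus the global test
estat phtest, detail

* Log-log survival plot for one categorical covariate;
* roughly parallel curves are consistent with proportional hazards
stphplot, by(radiotherapy)
```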
Questions:
Is this a truly surprising result, or a known problem with large data-sets and long follow-up?
Would it be legitimate to use another regression model, e.g. one based on the Weibull distribution, in this situation? Or should I make almost every variable time-dependent within the Cox model?
Any other solutions? I tried limiting follow-up to 4 years, after which a few more variables fulfilled the assumption.
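To make the two alternatives I am asking about concrete, here is a minimal sketch of how each might be specified. Again, the variable names are hypothetical, the data are assumed to be stset, and the choice of covariates in tvc() and of ln(_t) as the time function is purely illustrative, not something I have settled on.

```stata
* Sketch only: variable names are hypothetical; data assumed to be stset already

* Option 1: parametric Weibull model
* (streg fits the Weibull in the proportional-hazards metric by default)
streg tumorsize age i.ethnicity i.grade i.surgery radiotherapy, distribution(weibull)

* Option 2: Cox model in which selected covariates interact with a function of time
* tvc() lists the covariates given time-varying coefficients;
* texp() gives the function of analysis time they are multiplied by
stcox tumorsize age i.ethnicity i.grade i.surgery radiotherapy, ///
    tvc(age radiotherapy) texp(ln(_t))
```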
Thank you!
Yaacov Lawrence MRCP
Dep. Radiation Oncology
Sheba Medical Center