Hi,
I'm looking for help analyzing data from a case-cohort study. For reference, a case-cohort study is neither a case-control study nor a standard cohort study; it is a study design of its own, which I won't describe at length here. A good article about it is Barlow (http://www.ncbi.nlm.nih.gov/pubmed/10580779), although the methods there don't quite apply to my data, as described below.
Another statistical package has a few tools for this design, but Stata, which I prefer, has fewer.
In more detail: I have age-stratified case-cohort data that include about 1,000 sub-cohort members and 500 cases. The sub-cohort is an age-stratified sample of the larger cohort of about 11,000 people. This falls under the "confounder stratified" design described in Langholz and Jiao, Computational Statistics and Data Analysis, 51:3737-3748 (2007).
[http://www.sciencedirect.com/science...7947306005068]
(The stratified design makes the Barlow article not quite fit.)
I know that there are user-made packages called STCASCOH and STSELPRE for case-cohort data.
So I was wondering:
1. Is there anything more recent for analysis of case-cohort data?
2. STCASCOH and STSELPRE seem a bit limited: as far as I can tell they do not handle, e.g., robust vce or stratified data. Is that true?
3. On the other hand, simply using STCOX stratified on age group, with robust vce and an offset, gets me very close to the example results in Langholz and Jiao (Table 2(B), using the very helpful model data supplied in that paper's supplementary material). But I am worried that, as an epidemiologist and not a statistician, I am missing something and will be doing bad things when I use this on my real dataset. The model dataset is small and the differences between some of the example results are also small, so it is hard to say whether I am close only by luck. STSELPRE, however, does not reproduce the published results with any set of options I have found.
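To make question 3 concrete, here is a minimal sketch of the kind of call I mean. Every variable name in it (id, t0, t1, fail, agegrp, logw, x1, x2) is a hypothetical placeholder, not a name from my data or from the L&J files, and logw simply stands in for whatever offset the method requires. It shows only the shape of the command, not a validated analysis.

```stata
* Sketch only; every variable name here is a hypothetical placeholder.
*   id      person identifier
*   t0, t1  start and end of each at-risk interval
*   fail    1 if the interval ends in the event, 0 otherwise
*   agegrp  age stratum used to sample the sub-cohort
*   logw    offset term (how to define it correctly is part of my question)
*   x1, x2  covariates

* Declare the survival-time data with delayed entry and a person identifier.
stset t1, failure(fail) enter(time t0) id(id)

* Cox model stratified on age group, with an offset and a robust
* variance clustered on person.
stcox x1 x2, strata(agegrp) offset(logw) vce(cluster id)
```

Whether the offset and the variance produced by a call like this actually correspond to the estimator in L&J 2007 is the part I cannot verify myself.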
Does anyone know what would be a correct way to analyze these data with Stata? Can you help me justify the use of STCOX with the options correctly specified based on the method and theory laid out by Langholz & Jiao 2007?
I am happy to share more details of how close I get to L&J 2007, if someone has an idea of what is going on.
Thanks,
Scott