Hi Statalist,
I am exploring survival analysis in a data set of dairy cows. As the most basic objective, we want to look at the hazards associated with parity (number of calving events), under the assumption that the baseline hazard differs at each parity. Later we will investigate time-varying coefficients, e.g. milk production/health events, at each parity level, so we will need parity as a covariate in whichever model is appropriate.
I have a data set with multiple records per cow, one record for each parity. Start time is at calving, and failure is either death, being sold, or calving again (though our main interest is in sold/died). To aid interpretation, I would prefer days post calving (days in milk, DIM) as the time variable, which would reset at each calving event, as opposed to total time. An example is below; the current data set contains around 35,000 records from about 20,000 animals. Not all cows' records will start at parity == 1, and there will be delayed entry for some cows (this will be ignored for now).
A previous paper on the topic took a naïve approach: it simply right-censored at each calving event and (possibly; this was not stated) controlled for correlation with frailty or random effects. Is this approach legitimate?
If this is legitimate, I would be able to write:
However, this ignores the fact that calving events are ordered. Would a multi-state setup, like the one below, be better suited? This would, I believe, provide hazards for each calving event/parity and the hazards of removal at each parity. Here MI would be the ‘non-fatal’ calving events and D would be death or sale. I am not sure how I would parameterize this in Stata, and the provided R code did not include the fatal ‘D’ outcome.
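To make the question concrete, here is a rough sketch of how I imagine the two kinds of transition could be fitted as cause-specific hazards on the DIM (clock-reset) time scale. This is only my assumption about one possible parameterization, not something taken from the papers: it uses the variables id, parity, dim and outcome from my example data, treats each record as its own risk period, and assumes proportional hazards across parities rather than a separate baseline for each.

```stata
* Sketch only: cause-specific hazards with DIM as the time scale.
* Assumes variables id, parity, dim and outcome as in the example data.

* Transition to D (died = 1 or sold = 2); the next calving (3) is treated as censoring
stset dim, failure(outcome == 1 2)
stcox i.parity, efron vce(cluster id) nolog

* Transition to the next parity (calving = 3); death and sale are treated as censoring
stset dim, failure(outcome == 3)
stcox i.parity, efron vce(cluster id) nolog
```

The vce(cluster id) option is there only to allow for repeated records on the same cow; whether that is enough to respect the ordering of calvings is exactly what I am unsure about.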

I have tried the conditional risk set model from section 3.2.4 here: https://www.stata.com/support/faqs/s...ure-time-data/ but stratifying on parity obviously prevents any interpretation of the parity effect!
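One possible way around this, which I have not validated and offer only as a sketch, would be a stratified parametric model instead of a stratified Cox model. As I understand the documentation, the strata() option of streg gives each stratum its own intercept and ancillary (shape) parameter and reports them, so the parity-specific baselines would be estimated rather than absorbed. The Weibull distribution and the clustering on id are my assumptions here.

```stata
* Sketch only: Weibull model with a parity-specific intercept and shape parameter.
* Assumes variables id, parity, dim and outcome as in the example data.

* Failure = died (1) or sold (2); calving (3) and censored (0) records are censored
stset dim, failure(outcome == 1 2)

* strata(parity) lets both the scale and the shape of the baseline differ by parity;
* strata() cannot be combined with frailty(), so clustered standard errors are used
streg, distribution(weibull) strata(parity) vce(cluster id) nolog
```

If this is reasonable, covariates such as milk production could later be added before the comma, but I would welcome comments on whether the approach is sound.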
Any thoughts would be very much appreciated.
Thank you
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float id byte(dairy_id parity) float(time0 dim outcome)
 1 21  6 0 445 0
 2 21  4 0 422 2
 3 21  4 0 373 3
 3 21  5 0  56 1
 4 28  4 0 540 3
 4 28  5 0 164 0
 5 28  5 0 384 2
 6 28  5 0 383 3
 6 28  6 0 165 0
 7 28  4 0 513 3
 7 28  5 0 176 0
 8 28  5 0 324 3
 8 28  6 0 185 0
 9 28  5 0 421 2
10 28  4 0 614 2
11 29  9 0 382 3
11 29 10 0 266 2
12 29  9 0 232 2
13 17  7 0 371 3
13 17  8 0 343 0
end
label values outcome outcome_lbl
label def outcome_lbl 0 "Censored", modify
label def outcome_lbl 1 "Died", modify
label def outcome_lbl 2 "Sold", modify
label def outcome_lbl 3 "Calving", modify
Roxström, A., Ducrocq, V. & Strandberg, E. Survival analysis of longevity in dairy cattle on a lactation basis. Genet Sel Evol 35, 305 (2003). https://doi.org/10.1186/1297-9686-35-3-305
Code:
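* Declare survival data: time = DIM; failure = died (1) or sold (2); calving (3) is censored
stset dim, failure(outcome == 1 2)
* Weibull survival model with parity as a covariate and random intercepts for dairy_id and for id nested within dairy_id
mestreg i.parity || dairy_id: || id:, distribution(weibull) nolog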
Ozga, A.K., Kieser, M. & Rauch, G. A systematic comparison of recurrent event models for application to composite endpoints. BMC Med Res Methodol 18, 2 (2018). https://doi.org/10.1186/s12874-017-0462-x
Code:
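* Conditional risk set attempt: strata(parity) gives each parity its own baseline hazard, so the i.parity coefficients cannot be estimated
stcox i.parity, efron nolog strata(parity) cluster(id)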