Hi Clyde Schechter. I was going to use 'seq' as my analysis time variable; however, after -stset-ing, Stata reported 732 missing values for seq.
The problem appears to occur whenever 'in_relationship' == 0, which in turn sets both 'spell_num' and 'seq' to missing (.). This happens when at least one member of a couple selects a category other than 'de facto' or 'married' as their marital status. The biggest contributor, though, is when BOTH partners do not answer the marital status question, which appears to account for 602 of the 732 missing values.
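For reference, a minimal sketch of how these counts could be checked. It assumes that 'mrcurr1' and 'mrcurr2' hold the two partners' marital status answers (as in the -dataex- excerpt further down); 'both_unanswered' is a hypothetical helper variable, not something in my dataset.

```stata
* how many observations have no analysis time?
count if missing(seq)

* check that every missing seq coincides with in_relationship == 0
tabulate in_relationship if missing(seq), missing

* split the missing-seq observations by whether BOTH partners skipped
* the marital status question (assumption: mrcurr1/mrcurr2 are the two answers)
generate byte both_unanswered = missing(mrcurr1) & missing(mrcurr2)
tabulate both_unanswered if missing(seq)
```

If the description above is right, the first count should be 732 and the last table should show 602 observations with both_unanswered == 1.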
Two key points:
- Couple (50 51) answered de facto, but in wave 10 id 50 reports "divorced". This triggers the code to mark the 'end' of a 'spell' even though the relationship did not actually end. That matters because I use 'end' as the 'failure' variable ('seq' itself continues to count the sequence correctly).
- Couple (12 24) didn't answer the marital status question. Given that this group accounts for the largest share of the issue, do you see a problem if I remove them from my analysis?
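To make the first point concrete, here is a rough sketch of how couples like (50 51) could be listed: waves where 'in_relationship' drops to 0 but the same couple is back in a relationship at their next observed wave. The variable name 'gap' is hypothetical, and the check only looks one observation ahead, so it would not catch longer interruptions.

```stata
* flag a wave where the couple is coded as not in a relationship
* but is in a relationship again at the next observed wave
bysort id p_id (wave): generate byte gap = ///
    (in_relationship == 0) & (in_relationship[_n+1] == 1) if _n < _N

* inspect the flagged waves together with the reported marital statuses
list id p_id wave in_relationship mrcurr1 mrcurr2 if gap == 1, sepby(id p_id)
```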
Given these issues, do you think 'seq' is suitable as the analysis time variable (stset seq, ...), or would 'wave' or 'couple wave' (where couple is group(id p_id)) be better?
Your thoughts are appreciated.
Code:
                id:  couple
     failure event:  end == 1
obs. time interval:  (seq[_n-1], seq]
 enter on or after:  begin==1
 exit on or before:  time .

------------------------------------------------------------------------------
     82,438  total observations
        732  event time missing (seq>=.)                        PROBABLE ERROR
      8,249  observations end on or before enter()
------------------------------------------------------------------------------
     73,457  observations remaining, representing
      8,233  subjects
        619  failures in multiple-failure-per-subject data
     73,597  total analysis time at risk and under observation
                                                at risk from t =         0
                                     earliest observed entry t =         1
                                          last observed exit t =        18
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(id p_id) byte(wave in_relationship) int spell_num byte seq float length byte(begin end mrcurr1 mrcurr2)
17 18  5 1 1  5  6 0 0 . 1
17 18  6 1 1  6  6 0 1 . 1
17 18  7 0 .  .  6 1 0 . 3
12 24  1 1 1  1  2 1 0 1 1
12 24  4 1 1  2  2 0 1 1 1
12 24  6 0 .  .  2 1 0 . .
12 24  7 0 .  .  2 0 0 . .
12 24  8 0 .  .  2 0 0 . .
12 24  9 0 .  .  2 0 0 . .
12 24 10 0 .  .  2 0 0 . .
12 24 11 0 .  .  2 0 0 . .
12 24 12 0 .  .  2 0 0 . .
50 51  6 1 1  6 11 0 0 2 2
50 51  7 1 1  7 11 0 0 . 2
50 51  8 1 1  8 11 0 1 . 2
50 51 10 0 .  . 11 1 0 4 2
50 51 11 1 3 10 11 1 0 2 2
50 51 12 1 3 11 11 0 0 2 4
50 51 13 1 3 12 11 0 0 2 3
end
Code:
bys couple (wave): gen byte cwave = _n