dataset:
Question:
As you can see, there are gaps between the years for every id; the only consecutive years are 2004 and 2005 for id = 1.
For each id, I have therefore told Stata (in the section marked "to CLARIFY HERE" in the code below) to take the last value of the variable -state- and insert it at the last date for that id, replacing the missing values where next = .
However, is this the right way to go about it? I am calculating the transition probabilities for a Markov model, so this would mean Stata calculates a transition probability for id = 1 of moving from state = 3 to state = 3 at the last date (2005), which doesn't make sense because the individual simply remained in the same state, in this case dead. How could I address this?
I have already posted my code for calculating the transition probabilities in this thread:
https://www.statalist.org/forums/for...n-var-f-status
Code:
clear
input float id str7 event float(treatment dead revised year op)
 1 "op"      1 1 1 2001 1
 1 "revised" 1 1 1 2004 1
 1 "death"   1 1 1 2005 1
 2 "op"      0 0 1 2001 1
 2 "revised" 0 0 1 2007 1
19 "op"      0 1 0 2008 1
19 "death"   0 1 0 2016 1
45 "op"      0 0 1 2005 1
45 "revised" 0 0 1 2008 1
46 "op"      1 1 0 2007 1
46 "death"   1 1 0 2020 1
54 "op"      1 0 0 2001 1
76 "op"      1 1 0 2009 1
76 "death"   1 1 0 2015 1
89 "op"      1 1 0 2006 1
89 "death"   1 1 0 2010 1
end
label values treatment q1
label def q1 0 "control", modify
label def q1 1 "treatment", modify
label values dead q2
label def q2 0 "alive", modify
label def q2 1 "dead", modify
label values revised q3
label def q3 0 "success", modify
label def q3 1 "revised", modify

// encode numbers the events alphabetically: death = 1, op = 2, revised = 3
encode event, gen(status)

// recode so that state 1 = op, 2 = revised, 3 = death
gen state = .
replace state = 1 if status == 2
replace state = 2 if status == 3
replace state = 3 if status == 1

keep if treatment == 1
drop event dead revised op
Code:
// start transition probabilities
sort id year

// create a dataset of probabilities using the example data
// declare the data as panel data
xtset id year, yearly

// gaps in the years
gen long obs_no = _n
by id (obs_no), sort: gen next = state[_n+1]

******* to CLARIFY HERE ****
// replace the missing next with the max state
// (note: this re-sorts the data by id and state)
bysort id (state): gen max2 = state[_N]
replace next = max2 if next == .

// drop variables
drop max2 obs_no
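
To make the question concrete, here is a minimal sketch of how the two treatments of each id's last record could be compared side by side. It is not the code from the linked thread. It assumes -id-, -year- and -state- exist as in the example data, and the variable names next_raw and next_filled are hypothetical, introduced only for illustration. The fill here carries each id's last state forward; in the example data this gives the same result as using the maximum state, because states only increase over time.

```stata
* Sketch only: assumes -id-, -year- and -state- exist as in the example data.
* -next_raw- and -next_filled- are hypothetical names used for illustration.
sort id year

* next state within id; missing on each id's last record
by id: gen next_raw = state[_n+1]

* same, but the last record is filled with the id's own last state
by id: gen next_filled = cond(missing(next_raw), state, next_raw)

* row percentages are the empirical transition proportions
tabulate state next_raw, row      // last records drop out (next is missing)
tabulate state next_filled, row   // last records count as self-transitions
```

The first table leaves each id's last record out, because two-way tabulate excludes missing values by default; the second counts it as a transition from a state to itself. Note that in the example data id 54 has only the op record, so the fill turns it into a 1 to 1 row as well, not only the 3 to 3 rows for individuals who died.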