I was first using a test dataset; I am now using a dummy dataset that closely represents my research data.
My question: why doesn't, for example, observation 4 (id = 2) take the value nextyr = 3?
The same goes for id = 19 (observations 6 and 7): I would have expected nextyr for observation 6 to be 1.
Why is this not happening?
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float id str7 event float(treatment dead revised year op) long status float nextyr
 1 "op"      1 1 1 2001 1 2 .
 1 "revised" 1 1 1 2004 1 3 1
 1 "death"   1 1 1 2005 1 1 .
 2 "op"      0 0 1 2001 1 2 .
 2 "revised" 0 0 1 2007 1 3 .
19 "op"      0 1 0 2008 1 2 .
19 "death"   0 1 0 2016 1 1 .
45 "op"      0 0 1 2005 1 2 .
45 "revised" 0 0 1 2008 1 3 .
46 "op"      1 1 0 2007 1 2 .
46 "death"   1 1 0 2020 1 1 .
54 "op"      1 0 0 2001 1 2 .
76 "op"      1 1 0 2009 1 2 .
76 "death"   1 1 0 2015 1 1 .
89 "op"      1 1 0 2006 1 2 .
89 "death"   1 1 0 2010 1 1 .
end
format %ty year
label values treatment q1
label def q1 0 "control", modify
label def q1 1 "treatment", modify
label values dead q2
label def q2 0 "alive", modify
label def q2 1 "dead", modify
label values revised q3
label def q3 0 "success", modify
label def q3 1 "revised", modify
label values status status
label def status 1 "death", modify
label def status 2 "op", modify
label def status 3 "revised", modify
Code:
//// start transition probabilities
// create a dataset of transition probabilities from the example data
// declare the data to be panel data (panel variable id, time variable year, yearly)
xtset id year, yearly
// intended to take the value of status in the following row; as the -dataex- output above shows, this only works for observation 2 (id = 1)
generate nextyr=f.status
// Just FYI: this is my plan for the rest of the data
drop if missing(nextyr)
generate f = 1
// count the transitions from status to nextyr within each year
collapse (sum) f, by(year status nextyr)
// total number of transitions out of each status within each year
bysort year status: egen all = total(f)
// divide the transition count (f) by the total count from the same starting state (all);
// this gives the proportion of transitions to each possible next state, conditional on the current state and year
generate p = f/all
// review intermediate output
// display p with 3 decimal places in a field 9 characters wide
format %9.3f p
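
One possible explanation for the pattern, offered as a sketch rather than a definitive answer: after xtset id year, the F. operator looks up status at year + 1 for the same id, not the status in the next row. In the example data only id = 1 has two consecutive years (2004 and 2005), which would be why only observation 2 is filled in. If the intent is instead "status at this id's next recorded observation, however many years later", a row-based lead could look like the code below. The variable name nextobs is hypothetical, and the sketch assumes one row per id per year.

Code:
// row-based lead: status at the same id's next recorded observation,
// regardless of how many calendar years separate the two rows
bysort id (year): generate nextobs = status[_n+1]
// reuse the value label "status" defined in the -dataex- example
label values nextobs status
// compare the calendar-based lead (nextyr) with the row-based lead (nextobs)
list id year status nextyr nextobs, sepby(id)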