Dear Statalisters,
I am a running a hierarchical age-period-cohort model (see Yang & Land 2006) in Stata 15.1. I wish to graph the predicted probabilities of my (binary) outcome of interest for each cohort and period. My understanding is that this can only be achieved at present by manually calculating predicted probabilities from the random effects. Below, I detail how I have done this and ask Statalist whether my approach seems to be correct. If correct, I am minded to write a short program to automate the procedure.
The full dataset I use combines several cross-sectional surveys for a given country conducted at different yearly intervals and has > 7,000 observations. Below, I post a snippet data example (h/t the dataex command). Here, age records the age of the respondent at the time of the survey, period is the year in which the survey was conducted, and cohort is calculated as period of survey minus age, and then recoded into five year brackets.
A summary of the data is pasted below:
The model I use is a cross-classified multilevel model wherein age is entered into the fixed part alongside two cross-classified random components for period and cohort. I thus run the following command to estimate:
The standard in the literature when running these models is to then provide a graph of the predicted probabilities for the random components (i.e., period and cohort) in order better to visualize so-called "cohort and period effects". One example of this can be found in Yang (2008, 213) where the author plots, in the case of cohort effects, "the predicted probabilities of [the binary outcome] at the mean age and averaged over all periods". In order to do this, then, I run the following series of post-estimation commands:
We can then go about plotting predicted cohort effects by:
If we then wanted to add a horizontal line at the predicted probabilities of the outcome averaged over (i.e., at the means of) age and period, then we could do:
This gives us a better idea of the substantive size of cohort effects relative to the mean of the outcome averaged over age and period. Running these analyses on the full dataset I end up with the following graph:

An identical procedure could then be used for the period effects by changing the relevant values of interest. I'd be very grateful for any comments, criticisms, or objections to the procedure I outline above. Thank you!
I am a running a hierarchical age-period-cohort model (see Yang & Land 2006) in Stata 15.1. I wish to graph the predicted probabilities of my (binary) outcome of interest for each cohort and period. My understanding is that this can only be achieved at present by manually calculating predicted probabilities from the random effects. Below, I detail how I have done this and ask Statalist whether my approach seems to be correct. If correct, I am minded to write a short program to automate the procedure.
The full dataset I use combines several cross-sectional surveys for a given country conducted at different yearly intervals and has > 7,000 observations. Below, I post a snippet data example (h/t the dataex command). Here, age records the age of the respondent at the time of the survey, period is the year in which the survey was conducted, and cohort is calculated as period of survey minus age, and then recoded into five year brackets.
Code:
clear input long(outcome age) float(period cohort) 0 34 1981 1945 0 55 1981 1925 0 21 1981 1960 0 46 1981 1935 0 50 1981 1930 0 37 1981 1940 1 24 1981 1955 0 36 1981 1945 1 34 1981 1945 0 57 1981 1920 end label values outcome protbin label def protbin 0 "No", modify label def protbin 1 "Yes", modify label values age X003 label values period S020 label def S020 1981 "1981", modify
Code:
sum age period cohort Variable | Obs Mean Std. Dev. Min Max -------------+--------------------------------------------------------- age | 7,297 46.31013 19.07403 15 103 period | 7,297 1997.132 9.552057 1981 2009 cohort | 7,297 1948.848 20.06976 1890 1990
Code:
xtmelogit outcome age || _all:R.period || cohort:
Code:
// First get the predictions for the fixed part (equivalent to the intercept + βage*(mean(age))) predict out, xb //Then predict the random effects predict p coh, reffects //Then get the standard errors of the random parts predict pses cohses, reses //Then get the mean of the period effects (since we want to average over all periods) sum p local mp = r(mean) //Then calculate predicted probabilities for cohort with age and period held at means gen mu = 1 /(1+exp(-1*(out +`mp'+ coh))) //Then calculate the 95% confidence intervals for the cohort effects from the standard errors gen cohci1 = -1.96*cohses gen cohci2 = 1.96*cohses //Then calculate predicted probabilities for upper and lower bounds of the confidence intervals gen muci1 = 1 /(1+exp(-1*(prot + `mp'+ coh + cohci1))) gen muci2 = 1 /(1+exp(-1*(prot + `mp'+ coh + cohci2)))
Code:
preserve collapse mu muci1 muci2, by(cohort) graph twoway rline muci1 muci2 cohort, lstyle(ci) lwidth(.2) lcolor("black") lpattern(dash) || scatter mu cohort, mstyle(p2) msize(.5) aspectratio(1) legend(off) restore
Code:
preserve set scheme lean2 //Get mean predicted values of fixed part and period effects for yline sum out local mout = r(mean) sum p local mp = r(mean) //Calculate predicted probabilities local mprobout = 1 /(1+exp(-1*(`mout'+ `mp'))) set scheme lean2 collapse mu muci1 muci2, by(cohort) graph twoway rline muci1 muci2 cohort, lstyle(ci) lwidth(.2) lcolor("black") /// lpattern(dash) || scatter mu cohort, mstyle(p2) msize(.5) aspectratio(1) legend(off) /// yline(`mprobout', lpattern(dash)) restore
An identical procedure could then be used for the period effects by changing the relevant values of interest. I'd be very grateful for any comments, criticisms, or objections to the procedure I outline above. Thank you!
Comment