Hello,
I am currently working with register data from the Nordics to estimate first union (cohabitation or marriage) formation from age 18 onward. Since I cannot post the data or code, I have tried to replicate my question as closely as possible with the toy data set below.
The actual dataset has around 470,000 observations and spans 32 years, starting from age 18 in 1987. It is in long format with two time-varying variables: educational attainment (4 categories), which changes as people move through the educational system, and urban/rural residence (binary), which changes as people migrate internally. It also includes two time-invariant variables: immigration status (5 categories, for example, 1st generation) and sending region for immigrants (8 categories, for example, North America).
The actual model specification is union=i.immigration_status##i.education urban i.sending_region, tvc(immigration_status) texp(ln(_t)). However, I have manually generated the time interaction to replace the tvc() option (per the example below) so that I can use postestimation commands.
My first question: when one uses margins, at() (as in the example below), is Stata reporting a hazard ratio or a hazard rate for each margin? I have read most of the entries I could find on stcox and margins, plus several textbooks, and the accounts seem to conflict.
Similarly, if I use margins, dydx(var1 var2) (see below), which I would interpret as a derivative in an OLS context, should the output after a Cox model be read as the derivative of the hazard rate or of the hazard ratio?
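To make the distinction concrete, here are the two quantities I am contrasting, written in standard Cox notation. The symbols $h_0(t)$, $x$, and $\beta$ are my own shorthand and do not come from the output. The last two lines show what the dydx() results would be if the label "Predicted hazard ratio" in the margins header is accurate, which is the assumption I would like confirmed.

```latex
% Hazard rate: depends on the baseline hazard h_0(t)
h(t \mid x) = h_0(t)\,\exp(x\beta)

% Hazard ratio relative to the baseline (all covariates at zero):
% h_0(t) cancels, so no baseline hazard is needed
\frac{h(t \mid x)}{h_0(t)} = \exp(x\beta)

% If margins is working with exp(x\beta), then for a continuous covariate x_j
\frac{\partial \exp(x\beta)}{\partial x_j} = \beta_j \exp(x\beta)

% and for level k of a factor variable (discrete change from the base level)
\exp\!\big(x_{(k)}\beta\big) - \exp\!\big(x_{(\text{base})}\beta\big)
```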
I understand that marginsplot labels these as hazard ratios (and as "effects on predicted hazard ratios" for the dydx option), but since I have seen conflicting reports, it would be great if someone could confirm this. Thank you so much for all of your help! Also, I had some trouble with the formatting, so I apologize if it is not coming out correctly.
Code:
stset month, failure(union)
gen monthtvc=month*(ln(_t))
stcox i.immigrant_origin##i.education i.urban monthtvc, nohr
Code:
margins, at(immigrant_origin=(1 2 3 4 5) education=(0 1 2 3))
marginsplot

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .0338928   .0036831     9.20   0.000     .0266741    .0411114
          2  |   .0433713   .0189182     2.29   0.022     .0062924    .0804503
          3  |    .042719   .0186365     2.29   0.022     .0061921    .0792458
          4  |   .0482648   .0220906     2.18   0.029     .0049679    .0915616
          5  |   .0588055   .0291649     2.02   0.044     .0016434    .1159677
          6  |    .050708   .0222374     2.28   0.023     .0071234    .0942926
          7  |   .0481593   .0219243     2.20   0.028     .0051884    .0911301
          8  |   .0544829   .0274042     1.99   0.047     .0007717    .1081942
          9  |    .036445   .0188413     1.93   0.053    -.0004833    .0733733
         10  |    .050337   .0219795     2.29   0.022      .007258    .0934159
         11  |   .0434282   .0193442     2.25   0.025     .0055142    .0813422
         12  |   .0646858   .0305404     2.12   0.034     .0048277     .124544
         13  |   .0305171   .0332355     0.92   0.359    -.0346233    .0956574
         14  |    .044052   .0219942     2.00   0.045      .000944    .0871599
         15  |   .0195036   .0161583     1.21   0.227    -.0121661    .0511733
         16  |   .0199751   .0217104     0.92   0.358    -.0225764    .0625266
         17  |   .0387859   .0254598     1.52   0.128    -.0111143    .0886861
         18  |   .0597272   .0282598     2.11   0.035      .004339    .1151155
         19  |   .0334007   .0197175     1.69   0.090    -.0052448    .0720463
         20  |   .0699757   .0433642     1.61   0.107    -.0150165     .154968
------------------------------------------------------------------------------

Code:
margins, dydx(immigrant_origin education)

Average marginal effects                                 Number of obs = 4,139
Model VCE: OIM

Expression: Predicted hazard ratio, predict()
dy/dx wrt:  2.immigrant_origin 3.immigrant_origin 4.immigrant_origin
            5.immigrant_origin 1.education 2.education 3.education

----------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-----------------+----------------------------------------------------------------
immigrant_origin |
          aliba  |    .006578   .0070079     0.94   0.348    -.0071572    .0203132
           alur  |   .0064049   .0067681     0.95   0.344    -.0068602    .0196701
         aringa  |  -.0175955   .0119503    -1.47   0.141    -.0410176    .0058265
         baamba  |    .005765   .0114968     0.50   0.616    -.0167684    .0282984
                 |
       education |
        primary  |   .0075674   .0122454     0.62   0.537    -.0164331    .0315678
      secondary  |   .0034561   .0107987     0.32   0.749     -.017709    .0246212
         higher  |   .0132735   .0156465     0.85   0.396     -.017393      .04394
----------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.