I have encountered some inconsistencies between the p-values and confidence intervals reported by lincom, test, and margins that I do not understand. I would be grateful for advice on how my results should be reported.
I have a dataset with 2 variables:
- alt_hr (relates to a biomarker: 0 = normal, 1 = abnormal)
- alt_decile_simple (genetic biomarker: 1 = low; 2 = intermediate; 3 = high)
The model is fit using stcrreg; the outcome is time to a disease event.
This is the output I get when I fit the model:
. stcrreg alt_decile_simple##alt_hr, compet(fail==2)
Competing-risks regression No. of obs = 171,134
No. of subjects = 171,134
Failure event : fail == 1 No. failed = 1,044
Competing event: fail == 2 No. competing = 13,370
No. censored = 156,720
Wald chi2(5) = 864.71
Log pseudolikelihood = -12145.91 Prob > chi2 = 0.0000
(Std. Err. adjusted for 171,134 clusters in eid)
------------------------------------------------------------------------------------------
| Robust
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
alt_decile_simple |
2 | -.0873788 .1286863 -0.68 0.497 -.3395994 .1648418
3 | .0074116 .1757182 0.04 0.966 -.3369897 .351813
|
1.alt_hr | 2.22432 .1955888 11.37 0.000 1.840973 2.607667
|
alt_decile_simple#alt_hr |
2 1 | -.4906818 .2087833 -2.35 0.019 -.8998895 -.0814741
3 1 | -.2740031 .2585625 -1.06 0.289 -.7807763 .2327702
------------------------------------------------------------------------------------------
Here is the same model with the coeflegend option:
. stcrreg, coeflegend
Competing-risks regression No. of obs = 171,134
No. of subjects = 171,134
Failure event : fail == 1 No. failed = 1,044
Competing event: fail == 2 No. competing = 13,370
No. censored = 156,720
Wald chi2(5) = 864.71
Log pseudolikelihood = -12145.91 Prob > chi2 = 0.0000
(Std. Err. adjusted for 171,134 clusters in eid)
------------------------------------------------------------------------------------------
_t | Coef. Legend
-------------------------+----------------------------------------------------------------
alt_decile_simple |
2 | -.0873788 _b[2.alt_decile_simple]
3 | .0074116 _b[3.alt_decile_simple]
|
1.alt_hr | 2.22432 _b[1.alt_hr]
|
alt_decile_simple#alt_hr |
2 1 | -.4906818 _b[2.alt_decile_simple#1.alt_hr]
3 1 | -.2740031 _b[3.alt_decile_simple#1.alt_hr]
------------------------------------------------------------------------------------------
Based on the model parameters above, I want to assess how the effect of an abnormal biomarker (i.e. alt_hr = 1) varies according to alt_decile_simple.
I use lincom to estimate the subdistribution hazard ratio (sdHR) for people with alt_hr = 1 and alt_decile_simple = 1, relative to alt_hr = 0 and alt_decile_simple = 1:
. lincom _b[1.alt_hr], hr
( 1) [eq1]1.alt_hr = 0
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 9.247195 1.808648 11.37 0.000 6.302669 13.56736
------------------------------------------------------------------------------
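(For reference, I believe this hazard ratio is just the exponentiated coefficient from the table above, which can be checked with something like:
. display exp(_b[1.alt_hr])
i.e. exp(2.22432), which is approximately 9.25.)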
I then use lincom to estimate the sdHR for people with alt_hr = 1 and alt_decile_simple = 2 (again relative to alt_hr = 0 and alt_decile_simple = 1):
. lincom (_b[1.alt_hr]+_b[2.alt_decile_simple#1.alt_hr]+_b[2.alt_decile_simple]), hr
( 1) [eq1]2.alt_decile_simple + [eq1]1.alt_hr + [eq1]2.alt_decile_simple#1.alt_hr = 0
------------------------------------------------------------------------------
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 5.18754 .6907153 12.36 0.000 3.995995 6.734386
------------------------------------------------------------------------------
The point estimates above suggest the association between alt_hr and the outcome is weaker in the alt_decile_simple = 2 group than in the alt_decile_simple = 1 group. However, I want to determine whether this difference is statistically significant. I note that the 95% confidence intervals for the two estimates overlap (6.30 to 13.57 and 4.00 to 6.73), so I was expecting a p-value > 0.05.
However, when I test for equality of these two linear combinations of model parameters using test, I am confused as to why the p-value = 0.0004 (see below):
. test _b[1.alt_hr]=(_b[1.alt_hr]+_b[2.alt_decile_simple#1.alt_hr]+_b[2.alt_decile_simple])
( 1) - [eq1]2.alt_decile_simple - [eq1]2.alt_decile_simple#1.alt_hr = 0
chi2( 1) = 12.36
Prob > chi2 = 0.0004
Is it because this command is not doing what I think it is doing, or is it something broader about how the standard errors are calculated?
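(For what it is worth, I believe the same comparison could also be written directly as a single lincom, e.g. something like:
. lincom _b[2.alt_decile_simple] + _b[2.alt_decile_simple#1.alt_hr], hr
which, if I understand correctly, estimates the ratio of the two sdHRs above and should reproduce the p-value from test.)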
If I use the margins command, the point estimates are the same, but the confidence intervals are narrower and appear more consistent with the p-value reported by test:
. margins alt_decile_simple, at(alt_hr=1)
Adjusted predictions Number of obs = 171,134
Model VCE : Robust
Expression : Predicted relative subhazard, predict()
at : alt_hr = 1
-----------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
------------------+----------------------------------------------------------------
alt_decile_simple |
1 | 9.247195 1.808648 5.11 0.000 5.70231 12.79208
2 | 5.18754 .6907153 7.51 0.000 3.833763 6.541317
3 | 7.083221 1.157404 6.12 0.000 4.81475 9.351692
-----------------------------------------------------------------------------------
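(If it is relevant, I believe the pairwise differences between these adjusted predictions can also be requested directly, e.g. with something like:
. margins alt_decile_simple, at(alt_hr=1) pwcompare(effects)
though I have not shown that output here.)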
Can you explain this inconsistency? Should I report the confidence intervals generated by margins or those generated by lincom? Can I expect the p-values reported by test to be consistent with the confidence intervals/standard errors reported by margins? Thank you for any help you can give.