  • Inconsistent p-values/confidence intervals reported by lincom

    I have encountered some inconsistencies between p-values and confidence intervals reported by lincom, which I do not understand. I would be grateful for advice on how my results should be reported.

    I have a dataset with two variables:
    1. alt_hr (a biomarker: 0 = normal, 1 = abnormal)
    2. alt_decile_simple (a genetic biomarker: 1 = low; 2 = intermediate; 3 = high)
    I fit a model including both variables plus their interaction.

    The model is fit using stcrreg. The outcome is time to a disease event.

    This is the output I get when I fit the model:

    stcrreg alt_decile_simple##alt_hr, compet(fail==2)

    Competing-risks regression                        No. of obs      =    171,134
                                                      No. of subjects =    171,134
    Failure event:   fail == 1                        No. failed      =      1,044
    Competing event: fail == 2                        No. competing   =     13,370
                                                      No. censored    =    156,720

                                                      Wald chi2(5)    =     864.71
    Log pseudolikelihood = -12145.91                  Prob > chi2     =     0.0000

                                 (Std. Err. adjusted for 171,134 clusters in eid)
    ------------------------------------------------------------------------------------------
                             |               Robust
                          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
           alt_decile_simple |
                          2  |  -.0873788   .1286863    -0.68   0.497    -.3395994    .1648418
                          3  |   .0074116   .1757182     0.04   0.966    -.3369897     .351813
                             |
                    1.alt_hr |    2.22432   .1955888    11.37   0.000     1.840973    2.607667
                             |
    alt_decile_simple#alt_hr |
                        2 1  |  -.4906818   .2087833    -2.35   0.019    -.8998895    -.0814741
                        3 1  |  -.2740031   .2585625    -1.06   0.289    -.7807763     .2327702
    ------------------------------------------------------------------------------------------

    Here is the same model with the coeflegend option:


    . stcrreg, coeflegend

    Competing-risks regression                        No. of obs      =    171,134
                                                      No. of subjects =    171,134
    Failure event:   fail == 1                        No. failed      =      1,044
    Competing event: fail == 2                        No. competing   =     13,370
                                                      No. censored    =    156,720

                                                      Wald chi2(5)    =     864.71
    Log pseudolikelihood = -12145.91                  Prob > chi2     =     0.0000

                                 (Std. Err. adjusted for 171,134 clusters in eid)
    ------------------------------------------------------------------------------------------
                          _t |      Coef.   Legend
    -------------------------+----------------------------------------------------------------
           alt_decile_simple |
                          2  |  -.0873788   _b[2.alt_decile_simple]
                          3  |   .0074116   _b[3.alt_decile_simple]
                             |
                    1.alt_hr |    2.22432   _b[1.alt_hr]
                             |
    alt_decile_simple#alt_hr |
                        2 1  |  -.4906818   _b[2.alt_decile_simple#1.alt_hr]
                        3 1  |  -.2740031   _b[3.alt_decile_simple#1.alt_hr]
    ------------------------------------------------------------------------------------------


    I want to assess how the effect of an abnormal biomarker (i.e. alt_hr = 1) varies according to alt_decile_simple, based on the model parameters above.
    I use lincom to estimate the sdHR for people with alt_hr = 1 and alt_decile_simple = 1 (relative to alt_hr = 0 and alt_decile_simple = 1):


    . lincom _b[1.alt_hr], hr

     ( 1)  [eq1]1.alt_hr = 0

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |   9.247195   1.808648    11.37   0.000     6.302669    13.56736
    ------------------------------------------------------------------------------


    I then use lincom to estimate the sdHR for people with alt_hr = 1 and alt_decile_simple = 2 (again, relative to alt_hr = 0 and alt_decile_simple = 1):

    . lincom (_b[1.alt_hr]+_b[2.alt_decile_simple#1.alt_hr]+_b[2.alt_decile_simple]), hr

     ( 1)  [eq1]2.alt_decile_simple + [eq1]1.alt_hr + [eq1]2.alt_decile_simple#1.alt_hr = 0

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |    5.18754   .6907153    12.36   0.000     3.995995    6.734386
    ------------------------------------------------------------------------------

    The point estimates above suggest the association between alt_hr and the outcome is weaker in the alt_decile_simple = 2 group than in the alt_decile_simple = 1 group. However, I want to be able to determine whether the difference is statistically significant. I note that the 95% confidence intervals for these two estimates overlap (i.e. 6.30 to 13.57 and 4.00 to 6.73), so I am expecting a p-value > 0.05.

    Thus, when I test for equality of these two linear combinations of model parameters using test, I am confused about why it reports a p-value of 0.0004 (see below).

    . test _b[1.alt_hr]=(_b[1.alt_hr]+_b[2.alt_decile_simple#1.alt_hr]+_b[2.alt_decile_simple])

     ( 1)  - [eq1]2.alt_decile_simple - [eq1]2.alt_decile_simple#1.alt_hr = 0

               chi2(  1) =   12.36
             Prob > chi2 =    0.0004

    Is it because this command is not doing what I think it is doing, or is it something broader about how the standard errors are calculated?

    If I use the margins command, the point estimates are the same, but the confidence intervals are narrower and appear more consistent with the p-value reported by test:

    . margins alt_decile_simple, at(alt_hr=1)

    Adjusted predictions                              Number of obs   =    171,134
    Model VCE    : Robust

    Expression   : Predicted relative subhazard, predict()
    at           : alt_hr          =           1

    -----------------------------------------------------------------------------------
                      |            Delta-method
                      |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
    alt_decile_simple |
                    1 |   9.247195   1.808648     5.11   0.000      5.70231    12.79208
                    2 |    5.18754   .6907153     7.51   0.000     3.833763    6.541317
                    3 |   7.083221   1.157404     6.12   0.000      4.81475    9.351692
    -----------------------------------------------------------------------------------


    Can you explain this inconsistency? Should I report the confidence intervals generated by margins or those generated by lincom? Can I expect the p-values reported by test to be consistent with the confidence intervals/standard errors reported by margins? Thank you for any help you can give.

  • #2
    I haven't had a reply to my question. Is there anything I can do to make it clearer? This is my first post, so please go easy on me.



    • #3
      You aren't doing anything wrong. What is wrong is "I note that the 95% confidence intervals for these two estimates overlap (i.e. 6.30 to 13.57 and 4.00 to 6.73), so I am expecting a p-value > 0.05." That expectation is unjustified. It is perfectly possible for two 95% confidence intervals to overlap and yet for the p-value for the difference to be < 0.05. Take a look at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/. This is misunderstanding #21 in that article.
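
      To make that concrete, here is a small numeric sketch in Stata (made-up numbers on the log scale, not taken from your model, and ignoring the covariance between the two estimates) of two 95% confidence intervals that overlap even though the test of their difference is clearly significant:

      * Toy illustration with made-up numbers (not from the model above):
      * two estimates whose 95% CIs overlap, yet whose difference has |z| > 1.96.
      scalar b1  = 2.22        // first estimate (log scale)
      scalar se1 = 0.20
      scalar b2  = 1.65        // second estimate (log scale)
      scalar se2 = 0.13
      display "CI 1: " (b1 - 1.96*se1) " to " (b1 + 1.96*se1)    // about 1.83 to 2.61
      display "CI 2: " (b2 - 1.96*se2) " to " (b2 + 1.96*se2)    // about 1.40 to 1.90, overlapping CI 1
      * SE of the difference, assuming independence here; in a fitted model
      * the covariance between the coefficients also enters.
      scalar sed = sqrt(se1^2 + se2^2)
      display "z for the difference: " (b1 - b2)/sed             // about 2.39, so p < 0.05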

      With regard to the difference in standard errors from -margins- and from -stcrreg-, the two programs calculate standard errors differently. In -stcrreg- they come from the maximum likelihood estimator of the coefficients; this is also true of -lincom-. In -margins- they are calculated by the "delta method," which is an approximation based on the linear term of a Taylor series. I would rely on the former rather than the latter. Although, frankly, I don't see those confidence intervals as being that much different.
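
      As an aside, if you want a confidence interval that sits on the same footing as that -test- result, one option (just a sketch, using the coefficient names from your -coeflegend- output above) is to ask -lincom- for the difference between the two linear combinations directly; with the hr option it is reported as a ratio of sub-hazard ratios:

      * Difference between the two linear combinations compared by -test- above,
      * expressed as a ratio of sub-hazard ratios (decile 2 vs decile 1, at alt_hr = 1).
      lincom _b[2.alt_decile_simple] + _b[2.alt_decile_simple#1.alt_hr], hr

      The z statistic from this -lincom- is the square root of the chi2(1) from -test- (sqrt(12.36) is about 3.52), so its p-value will match the 0.0004 you saw, and the interval it reports is for that ratio rather than for either sub-hazard ratio on its own.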

      I don't think you did anything major wrong in your post. Some people might have been turned off by the poorly formatted Stata output, which was misaligned and hard to read. The solution to that is to wrap it in code delimiters. If you are not familiar with code delimiters and how to use them here, please read the Forum FAQ, with special attention to #12. I actually saw this one the day it was first posted, but passed it over just because of its length; I had an especially busy day and less time to devote to Statalist. After that, it simply lost visibility with time. Bumping it as you did today, and asking what could be done to improve it, was the right thing to do in this case.



      • #4
        Dear Clyde, thanks for your excellent response. I agree my premise was wrong, and reading myth #21 has cleared up my misunderstanding. Thanks for your help and advice. Next time I will wrap code/output in code delimiters as you suggest.
