Missing upper bound of 95% confidence interval on stci median (Kaplan-Meier estimator) despite sufficient data

Andrew Arthur

Join Date: Feb 2022

Posts: 7
#1

15 Feb 2022, 20:12

When estimating the median time to failure with a Kaplan-Meier estimator (in failure format), I get a 95% confidence interval that is missing its upper bound.

To obtain this median, I am using the basic survival code:

Code:

stset time, id(participantid) failure(fail == 1) origin(min) scale(30.437)
stci

For the following reasons, I cannot determine why the upper limit of the 95% confidence interval would be missing:
The analysis is limited to participants who experience the failure; there is no censoring and median failure must be reached.

The time at risk in the analysis extends far beyond the median survival estimate, which should leave room for an upper confidence interval bound.

In the stci manual (https://www.stata.com/manuals/ststci.pdf), I am unable to find specifics of what methods are used to calculate the 95% confidence interval for the median, though it appears to be based on a non-parametric formula.

Would anyone be able to explain what might be causing the missing upper bound?
Tags: confidence interval, kaplan-meier, median, stci, survival
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2389
#2

15 Feb 2022, 21:30

You are correct that this is a non-parametric method; it is based on the Kaplan-Meier (product-limit) estimates. It is explained in the PDF manual linked from -help stci-, in the section called "Methods and Formulas".

Let's create fake data using the same tiny sample size, n=6. The distribution of failure times here is immaterial for our purposes.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id t fail)
1  2 1
2  4 1
3  5 1
4  8 1
5  8 1
6 10 1
end

Now let's stset the data; note that every observation is a failure. I will also list the Kaplan-Meier survival estimates, along with the default -stci- output, which estimates the median survival time.

Code:

. stset t, fail(fail) id(id)

. sts list

        Failure _d: fail
  Analysis time _t: t
       ID variable: id

Kaplan–Meier survivor function

             At           Net    Survivor      Std.
  Time     risk   Fail   lost    function     error     [95% conf. int.]
------------------------------------------------------------------------
     2        6      1      0      0.8333    0.1521     0.2731    0.9747
     4        5      1      0      0.6667    0.1925     0.1946    0.9044
     5        4      1      0      0.5000    0.2041     0.1109    0.8037
     8        3      2      0      0.1667    0.1521     0.0077    0.5168
    10        1      1      0      0.0000         .          .         .
------------------------------------------------------------------------
Note: Net lost equals the number lost minus the number who entered.

. stci

        Failure _d: fail
  Analysis time _t: t
       ID variable: id

             | Number of
             |  subjects         50%      Std. err.     [95% conf. interval]
-------------+-------------------------------------------------------------
       Total |         6           5      1.632993             2          .

The percentiles of survival time are a function of the Kaplan-Meier survival estimates. The standard errors listed with the KM estimates are based on Greenwood's formula, but these are not used to compute the confidence intervals for the KM survival estimates because of numerical issues (namely, they can produce invalid limits outside the range [0, 1], and they are inefficient compared to other methods). Instead, those confidence intervals are derived from the asymptotic variance of ln(-ln(S(t))) and then transformed back to the survival scale. The logic is similar for the standard errors and confidence intervals for estimates of the pth percentile of survival time.
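
To make the ln(-ln(S(t))) step concrete, here is a minimal sketch that rebuilds the confidence limits by hand for the fake data in this post and compares them with what Stata stores. It assumes the back-transformation S(t)^exp(±z*se), where se is the standard error of ln(-ln(S(t))) returned by -sts generate- as se(lls); the variable names (s, se_lls, lb_hand, and so on) are only illustrative.

```stata
* Sketch: rebuild the -sts list- confidence limits by hand.
* All variable names below are illustrative, not part of any Stata default.
clear
input byte(id t fail)
1  2 1
2  4 1
3  5 1
4  8 1
5  8 1
6 10 1
end
stset t, fail(fail) id(id)

sts generate s      = s          // Kaplan-Meier survivor function
sts generate se_lls = se(lls)    // std. error of ln(-ln(S(t)))
sts generate lb     = lb(s)      // lower limit as computed by Stata
sts generate ub     = ub(s)      // upper limit as computed by Stata

* Assumed back-transformation of a symmetric interval for ln(-ln(S(t))).
scalar z = invnormal(0.975)
generate double lb_hand = s^exp( scalar(z)*se_lls)
generate double ub_hand = s^exp(-scalar(z)*se_lls)

* Where S(t) = 0 (here t = 10) the limits are missing in both versions.
sort _t
list _t s lb lb_hand ub ub_hand, noobs sep(0)
```

If the assumption holds, the hand-computed columns should agree with lb and ub up to rounding, and every limit stays inside [0, 1], which is the point of working on the transformed scale.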

From the manual

For a given confidence level, the upper confidence limit for the pth percentile is defined as the first time at which the upper confidence limit for S(t) (based on a ln(-lnS(t)) transformation) is less than or equal to 1-p/100, and, similarly, the lower confidence limit is defined as the first time at which the lower confidence limit of S(t) is less than or equal to 1 - p/100.

By this rule, for the median (50th percentile) of the failure time the threshold is 0.5 (= 1-50/100). The first failure time at which the lower confidence limit of the KM survival estimate is less than or equal to 0.5 is t=2, so t=2 is the lower bound of the confidence interval for the median survival time. However, there is no observed failure time at which the upper confidence limit of the KM estimate is less than or equal to 0.5: it comes very close at t=8 (0.5168), and at t=10, where the survivor estimate is 0, no limit is reported. The result is that the upper bound is missing.
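
The rule quoted from the manual can also be checked directly. The following is a rough sketch, not what -stci- runs internally: it searches the stored limits of S(t) for the first time at which each falls to 1-p/100 or below. The local macro names p and target are my own.

```stata
* Sketch: apply the manual's rule for the percentile limits by hand.
clear
input byte(id t fail)
1  2 1
2  4 1
3  5 1
4  8 1
5  8 1
6 10 1
end
stset t, fail(fail) id(id)

sts generate lb = lb(s)          // lower confidence limit of S(t)
sts generate ub = ub(s)          // upper confidence limit of S(t)

local p = 50                     // percentile of interest
local target = 1 - `p'/100       // threshold on the survival scale

* Lower limit: first time at which the lower limit of S(t) is <= target.
* Missing values are never <= target in Stata, so they drop out here.
summarize _t if lb <= `target', meanonly
display "lower limit for the `p'th percentile: " r(min)

* Upper limit: first time at which the upper limit of S(t) is <= target.
* If no time qualifies, r(N) is 0 and the limit is displayed as missing.
summarize _t if ub <= `target', meanonly
display "times qualifying for the upper limit: " r(N)
display "upper limit for the `p'th percentile: " r(min)
```

With p set to 50, no time qualifies for the upper limit, which is why -stci- reports it as missing. Changing the local to 48 raises the threshold to 0.52, and the upper limit of S(t) at t=8 (0.5168) then qualifies.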

If we slightly change our request and ask for the 48th percentile of survival time and its confidence interval, we now see that both the lower and upper bounds are present. (This is only to demonstrate what is happening; I am not advocating that you do this in your analysis to "fudge" a confidence interval.)

Code:

. stci, p(48)

        Failure _d: fail
  Analysis time _t: t
       ID variable: id

             | Number of
             |  subjects         48%      Std. err.     [95% conf. interval]
-------------+-------------------------------------------------------------
       Total |         6           5      1.632993             2          8

Last edited by Leonardo Guizzetti; 15 Feb 2022, 21:34.
Andrew Arthur

Join Date: Feb 2022

Posts: 7
#3

30 May 2022, 15:51

Thank you for your detailed explanation of this concept, Dr. Guizzetti; this was immensely helpful and clarifying.

Sincerely,
Andrew