Number of participants after Cox regression with listwise deletion of missing data.

Sigrid Vikjord

Join Date: Mar 2018
Posts: 69

26 Nov 2018, 05:07

Dear Statalist,

I am using multivariable Cox regression. My main exposure (TS_WHO_gold) is a categorical variable with three levels (normal, osteopenia, osteoporosis). Several of the covariates in the model have missing data, so the affected observations are dropped by listwise deletion. My question is: how can I find the count in each exposure category for the sample actually used by the Cox regression? I tried a plain -tab- restricted to non-missing covariates, as shown below, but its total (2,686) doesn't match the number of observations (2,346) in the output from -stcox- (see below).

Code:

. tab TS_WHO_gold if BMIcat_gold!=. & SmoStatPackYrs_gold_missing!=. & SES_gold!=. & COP
> Dcat_gold!=. & physact_gold!=. & alcohol_gold!=. & PartAg_gold!=. & CVD_gold!=. & canc
> er_gold!=. & chrondisADL_gold!=. & diabetes_gold!=. & musc_skel_gold!=.

     WHO BMD |
  categories |
for the HUNT |
 COPD cohort |
       using |
 forearm and |
   total hip |
    DXA meas |      Freq.     Percent        Cum.
-------------+-----------------------------------
      normal |      1,737       64.67       64.67
  osteopenia |        565       21.03       85.70
osteoporosis |        384       14.30      100.00
-------------+-----------------------------------
       Total |      2,686      100.00

Code:

. stcox i.TS_WHO_gold i.Sex ib2.BMIcat_gold i.SmoStatPackYrs_gold_missing i.SES_gold i.C
> OPDcat_gold i.physact_gold i.alcohol_gold c.PartAg_gold i.CVD_gold i.cancer_gold i.chr
> ondisADL_gold i.diabetes_gold i.musc_skel_gold if (goldcopd_HUNT==1 | goldcopd_HUNT==2
> ) & (TS_HUNT!=.) & (PartAg_gold>=40.0 & PartAg_gold<=85.0)

         failure _d:  RegisStat == 5
   analysis time _t:  (enddate-origin)/365.25
             origin:  time PartDat_gold
                 id:  PID_107945

Iteration 0:   log likelihood = -8426.7323
Iteration 1:   log likelihood = -7838.0546
Iteration 2:   log likelihood =  -7797.353
Iteration 3:   log likelihood = -7797.1758
Iteration 4:   log likelihood = -7797.1757
Refining estimates:
Iteration 0:   log likelihood = -7797.1757

Cox regression -- Breslow method for ties

No. of subjects =        2,346                  Number of obs    =       2,346
No. of failures =        1,187
Time at risk    =  27880.69268
                                                LR chi2(29)      =     1259.11
Log likelihood  =   -7797.1757                  Prob > chi2      =      0.0000

Best regards,
Sigrid

Clyde Schechter

Join Date: Apr 2014

Posts: 29794
#2

26 Nov 2018, 10:27

After running the Cox regression you can do this:

Code:

tab TS_WHO_gold if e(sample)
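
The total from -tab- in #1 likely differs because that command did not apply the same -if- restrictions as -stcox- (goldcopd_HUNT, TS_HUNT, and the age range), did not check Sex for missing values, and could not account for observations excluded by -stset-. e(sample) marks exactly the observations the model used, so none of that has to be re-created by hand. A minimal sketch, assuming it is run right after -stcox- and that in_cox is a hypothetical variable name not already in the data:

Code:

* Save the estimation sample before another estimation command overwrites e(sample)
generate byte in_cox = e(sample)
* Exposure categories among the observations used by -stcox-
tabulate TS_WHO_gold if in_cox
* This count should equal "Number of obs" in the -stcox- output
count if in_cox

Storing the flag in a variable is optional, but it keeps the sample available after later models are fitted.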
Sigrid Vikjord

Join Date: Mar 2018

Posts: 69
#3

26 Nov 2018, 12:19

Great, thank you!
Christos Chalitsios

Join Date: Jun 2024

Posts: 3
#4

27 Jun 2024, 14:53

Following up on the above question: how can I see the number of events (failures) by tertile of the exposure after running the Cox model, given that the covariates have some missing data? Much appreciated!
Clyde Schechter

Join Date: Apr 2014

Posts: 29794
#5

27 Jun 2024, 15:17

Code:

xtile tercile = exposure_variable if e(sample), nq(3)
tabstat _d if e(sample), by(tercile) statistics(sum)

That said, where are you going with this? The number of failures in each exposure tertile strikes me as a statistic that is probably meaningless and readily subject to misleading interpretations. If you say what you're trying to figure out about your analysis, someone may be able to offer a better way to go about it.
Christos Chalitsios

Join Date: Jun 2024

Posts: 3
#6

28 Jun 2024, 00:57

Thank you. I am using some blood biomarkers (e.g., LDL-c) as exposures, and I created tertiles to examine whether a specific category affects the outcome (a disease) more than the others, that is, to check for non-linearity.
The command that I found suitable regarding my above question is:

Code:

tab exposure_var_tertile _d if e(sample)
Clyde Schechter

Join Date: Apr 2014

Posts: 29794
#7

28 Jun 2024, 10:17

My concern in #5 was not about the tertiles of the exposure variable but about getting counts of failure events. How are you going to use those results? The number of failures in a group is not, by itself, useful information. It is only interpretable in the context of both the time at risk associated with those failures and the number of censored observations. Comparing the number of failures across tertiles without appropriately accounting for these can lead to very misleading conclusions. Proceed with extreme caution.
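
One way to put the failure counts in that context is to report them alongside person-time within the estimation sample. A minimal sketch, assuming the data are still -stset-, that -stcox- was the last estimation command, and that exposure_var_tertile is the tertile variable from #6; the per(1000) scaling is an arbitrary choice for illustration:

Code:

* Person-time, number of failures, and failure rate (with CI) in each tertile
stptime if e(sample), by(exposure_var_tertile) per(1000)
* Time at risk, incidence rate, and number of subjects in each tertile
stsum if e(sample), by(exposure_var_tertile)

These are still crude, unadjusted summaries, so they describe the sample and do not replace the hazard ratios from the Cox model.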