Difference in N of obs. in Cox model

Hayoung Choi

Join Date: Nov 2021
Posts: 14

12 Jan 2022, 02:02

Hi,

I have a problem regarding the difference in the number of observations between the Cox model and the frequency analysis by year.

My data set covers the 2009 to 2019 waves (years).

Year	Freq.	Percent	Cum.

2009	104	11.30	11.30
2010	95	10.33	21.63
2011	95	10.33	31.96
2012	91	9.89	41.85
2013	86	9.35	51.20
2014	81	8.80	60.00
2015	77	8.37	68.37
2016	76	8.26	76.63
2017	74	8.04	84.67
2018	72	7.83	92.50
2019	69	7.50	100.00

Total	920	100.00

As the table above shows, the largest number of observations in any single year is 104 (in 2009).

However, when I run the Cox model, the number of observations (296) is larger than that.
Can somebody explain why this could happen?

The dependent variable is exit from medical benefit (coded as medical_final).

medical_final
	Freq.	Percent	Cum.

0	675	73.37	73.37
1	245	26.63	100.00

Total	920	100.00

. stset tf, failure(medical_final=1)
failure event: medical_final == 1
obs. time interval: (0, tf]
exit on or before: failure

920 total observations
0 exclusions

920 observations remaining, representing
245 failures in single-record/single-failure data
5,151 total analysis time at risk and under observation
at risk from t	=	0
earliest observed entry t	=	0
last observed exit t	=	11
.

. stcox i.sex age i.married householdmembers yearsofshooling i.chronic_disease i.have_disability alcoholqualtity depression i.jobtype i.jobstable i.fulltimejob totalcostofliving, nohr

failure _d: medical_final == 1
analysis time _t: tf

Iteration 0: log likelihood = -331.49314
Iteration 1: log likelihood = -314.58647
Iteration 2: log likelihood = -312.79623
Iteration 3: log likelihood = -312.78731
Iteration 4: log likelihood = -312.78731
Refining estimates:
Iteration 0: log likelihood = -312.78731

Cox regression -- Breslow method for ties

No. of subjects = 296 Number of obs = 296
No. of failures = 77
Time at risk = 1473
LR chi2(17) = 37.41
Log likelihood = -312.78731 Prob > chi2 = 0.0030

_t	Coef.	Std. Err.	z	P>z	[95% Conf.	Interval]

sex
women	-.5941747	.4756698	-1.25	0.212	-1.52647	.3381209
age	-.0787584	.0431815	-1.82	0.068	-.1633926	.0058758
1.married	-.3554892	.5308085	-0.67	0.503	-1.395855	.6848763
householdmembers	.1617833	.1848749	0.88	0.382	-.2005648	.5241314
yearsofshooling	.0038488	.0603173	0.06	0.949	-.114371	.1220686

chronic_disease
1	-.2000307	1.044555	-0.19	0.848	-2.247322	1.84726
2	-.6337595	1.04773	-0.60	0.545	-2.687273	1.419754
3	-.3988995	.2983077	-1.34	0.181	-.9835718	.1857728

1.have_disability	.2235872	.3358214	0.67	0.506	-.4346106	.8817849
alcoholqualtity	.0866133	.0932219	0.93	0.353	-.0960983	.2693248
depression	-.1622675	.2658865	-0.61	0.542	-.6833954	.3588604

jobtype
2	-1.194072	.5009966	-2.38	0.017	-2.176007	-.2121367
3	-1.248776	.5771651	-2.16	0.030	-2.379999	-.1175533
4	-2.392911	.667759	-3.58	0.000	-3.701694	-1.084127

1.jobstable	-.5512276	.3898746	-1.41	0.157	-1.315368	.2129126
1.fulltimejob	.1312842	.2975419	0.44	0.659	-.4518872	.7144557
totalcostofliving	-.0011913	.0012785	-0.93	0.351	-.0036972	.0013145

Last edited by Hayoung Choi; 12 Jan 2022, 02:05.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17606
#2

12 Jan 2022, 02:46

Hayoung:
you started from 920 observations.
However -stcox- No. of subjects = 296 includes those who fail (77) and those who don't (296-77).
It seems to be one observation per subject.
The difference between 920 and 296 might be due to missing values in any of the covariates, which Stata handles via listwise deletion (i.e., excluding the corresponding observations from -stcox-).

Kind regards,
Carlo
(StataNow 18.5)
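
One way to check the listwise-deletion explanation is sketched below. It is not taken from the thread: it assumes the data from post #1 are in memory and already -stset-, it reuses the covariate names exactly as typed in the -stcox- call, and `year` is only a guess at the name of the wave variable. `n_missing` and `in_model` are hypothetical helper variables created for the check.

```stata
* Sketch only. Assumes the data from post #1 are in memory and already -stset-.
* Covariate names are copied from the -stcox- call in post #1; "year" is an
* assumed name for the wave variable and may differ in the real data.

local covars sex age married householdmembers yearsofshooling ///
    chronic_disease have_disability alcoholqualtity depression ///
    jobtype jobstable fulltimejob totalcostofliving

* 1. Missing-value counts for each covariate
misstable summarize `covars'

* 2. Number of covariates that are missing in each observation,
*    then the number of observations with no missing covariate at all
egen byte n_missing = rowmiss(`covars')
count if n_missing == 0

* 3. Refit the model quietly and flag the observations -stcox- actually used
quietly stcox i.sex age i.married householdmembers yearsofshooling ///
    i.chronic_disease i.have_disability alcoholqualtity depression ///
    i.jobtype i.jobstable i.fulltimejob totalcostofliving, nohr
generate byte in_model = e(sample)

* 4. Cross-tabulate used versus dropped observations by year
tabulate year in_model
```

If missing covariate values are the whole story, the count in step 2 should match the "Number of obs" that -stcox- reports, and the table in step 4 shows how the retained observations are spread across the years.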
Hayoung Choi

Join Date: Nov 2021

Posts: 14
#3

12 Jan 2022, 06:21

Oh, I see. Thank you for your kind response!