Data structure dilemma

Rose Matthews

Join Date: Aug 2023

Posts: 153
#1

30 Jan 2024, 22:54

I came across a paper in which the authors compare three clinical score categories (<70, 70-79, 80-89) with a single reference category (90-100), reporting a hazard ratio from a Cox regression for each of the three categories relative to 90-100.

As for the original data structure, I imagine the authors had an ordinal variable, e.g. -score-, coded as:

1 - <70

2 - 70-79

3 - 80-89

4 - 90-100

As for organising the data for the Cox analysis, would one create a new variable, e.g. -newscore-, where newscore = 0 for 90-100 (the reference category) and, when looking at scores <70, newscore = 1 for <70?

Does one then set the remaining observations (70-79, 80-89) to missing?

And if so, does one drop those observations? That would reduce the sample size, which is what I'm worried about.

In case I'm not being clear, I've written the code below. I wonder whether there is another way that avoids the extra step of creating -newscore- and having to drop observations each time.

Code:

gen newscore = .
replace newscore = 1 if score < 70
replace newscore = 0 if score >= 90 & !missing(score)
drop if newscore == .
stcox newscore covariates
// This generates the hazard ratio for those <70 compared to the reference category >=90

// Load the dataset again and create the HR for 70-79
gen newscore = .
replace newscore = 1 if score >= 70 & score <= 79
replace newscore = 0 if score >= 90 & !missing(score)
drop if newscore == .
stcox newscore covariates
Clyde Schechter

Join Date: Apr 2014

Posts: 29799
#2

30 Jan 2024, 23:00

Well, I can't say what the authors of that paper did. But the way I would approach this, and I think it is the best way, is:

Code:

gen score_group = 0 if score < 70
replace score_group = 1 if inrange(score, 70, 79)
replace score_group = 2 if inrange(score, 80, 89)
replace score_group = 3 if inrange(score, 90, .)
stcox ib3.score_group covariates

The output will include a line for each of score groups 0 through 2, and those hazard ratios will be relative to score group 3.
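
As an illustrative sketch only (not part of the reply above): if the data already hold the ordinal coding imagined in #1 (1 = <70, 2 = 70-79, 3 = 80-89, 4 = 90-100), the same factor-variable notation should work directly on that variable, with ib4. making the 90-100 category the base. The label name score_lbl is hypothetical, covariates is a placeholder as above, and the data are assumed to be -stset- already.

Code:

// Assumption: -score- is already coded 1-4 as in #1, and the data are stset
label define score_lbl 1 "<70" 2 "70-79" 3 "80-89" 4 "90-100"   // hypothetical label name
label values score score_lbl
stcox ib4.score covariates   // covariates is a placeholder for the adjustment variables

With value labels attached, the output rows should show the category names, and no observations need to be dropped to obtain the three hazard ratios.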
Rose Matthews

Join Date: Aug 2023

Posts: 153
#3

30 Jan 2024, 23:08

Thank you. I didn’t know about -ib3-
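
For reference, a brief sketch of related ways to choose the base level with Stata's factor-variable operators, assuming the score_group variable (coded 0 through 3) from #2 and the same placeholder covariates:

Code:

stcox ib3.score_group covariates        // base = level 3, as in #2
stcox ib(last).score_group covariates   // base = highest level, which is also 3 here
stcox i.score_group covariates          // by default, base = lowest level, which is 0 here

The first two should give the same output for this coding; the third would instead report hazard ratios relative to the <70 group.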