

  • Data structure dilemma

    I came across a paper where the authors compare three clinical score categories (<70, 70-79, 80-89) with a single reference category (90-100), reporting a hazard ratio from Cox regression for each of the three categories relative to 90-100.

    As for the original data structure, I imagine the authors had an ordinal variable, e.g. -score-, coded as:

    1 - <70

    2 - 70-79

    3 - 80-89

    4 - 90-100

    As for organising the data for the Cox analysis, would one create a new variable, say -newscore-, where

    newscore = 0 for 90-100 (the reference category) and, e.g. when looking at scores <70, newscore = 1 for <70?

    Does one then set the remaining observations (70-79, 80-89) to missing?

    And if so, does one drop those observations? That reduces the sample size, which is what I’m worried about.

    In case I’m not being clear, I’ve written the code below. I wonder whether there is another way that avoids the extra steps of creating -newscore- and dropping observations each time.

    gen newscore = .
    replace newscore = 1 if score < 70
    replace newscore = 0 if score >= 90 & !missing(score)
    drop if missing(newscore)
    stcox newscore covariates
    // This generates the hazard ratio for those <70 compared with the reference category >=90
    // Load the dataset again and create the HR for 70-79
    gen newscore = .
    replace newscore = 1 if score >= 70 & score <= 79
    replace newscore = 0 if score >= 90 & !missing(score)
    drop if missing(newscore)
    stcox newscore covariates

  • #2
    Well, I can't say what the authors of that paper did. But the way I would approach this, and I think it is the best way, is:

    gen score_group = 0 if score < 70
    replace score_group = 1 if inrange(score, 70, 79)
    replace score_group = 2 if inrange(score, 80, 89)
    replace score_group = 3 if inrange(score, 90, .)
    stcox ib3.score_group covariates
    The output will include a line for each of score groups 0 through 2, and those hazard ratios will be relative to score group 3.
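
    To see the same idea on data anyone can load, here is a minimal sketch, assuming the cancer dataset that ships with Stata (variables studytime, died, drug and age). The three-level variable drug only stands in for score_group, and age stands in for the covariates; neither comes from the original question. It fits one model on all observations, so nothing has to be dropped, and it also shows ib(last)., which I believe gives the same fit here without hard-coding the highest level.

    ```stata
    * Illustration only: the shipped cancer dataset stands in for the real data
    sysuse cancer, clear
    stset studytime, failure(died)

    * drug has levels 1, 2 and 3; ib3. makes level 3 the reference category,
    * so the output reports hazard ratios for levels 1 and 2 relative to 3
    stcox ib3.drug age

    * ib(last). uses the highest level as the base without naming it
    stcox ib(last).drug age
    ```

    Because the reference category is chosen in the factor-variable prefix rather than by recoding, changing the comparison group only means changing the number after ib.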


    • #3
      Thank you. I didn’t know about -ib3-

