  • Using weights in survival analysis correctly

    I am setting up weights for a survival analysis. The analysis is akin to a regression discontinuity design (RDD) estimation: I use a triangular kernel to compute weights around a cutoff date, and I split the sample by whether employment started before or after this date. I am wondering whether my current approach calculates and uses the weights correctly or whether I am making a mistake somewhere. If I am, I would be grateful for pointers, since I'm not well versed in survival analysis!

    In particular, I am interested in when individuals leave a firm in the years 6-8 of employment.

    I first create a normalized time variable and generate a censoring indicator so that individuals who survive past 8 years are treated as censored at that point:

    Code:
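    * months elapsed beyond 6 years (72 months) of tenure
    gen normalized_time = duration-6*12
    
    assert duration >= 6 * 12 // holds: I already excluded shorter durations when I created the dataset
    * 1 = left the firm before 8 years, 0 = censored at 8 years
    gen failure_censored = (duration < 8*12)
    * cap analysis time at 24 months, i.e. at 8 years of tenure
    replace normalized_time = min(normalized_time, 24)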
    I then calculate the weights:

    Code:
    local bwidth = 24
    local cutoff = 0
    
    tempvar h x_l u K_u w bandwidth
    
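    * x is assumed to be the running variable; it is defined elsewhere, not in this snippet
    gen byte `bandwidth' = 1 if inrange(x-`cutoff',-`bwidth',+`bwidth')   // in-bandwidth flag (not used below)
    
    gen float `h'   = `bwidth'                        // bandwidth
    gen float `x_l' = `cutoff'                        // kernel center (the cutoff)
    gen float `u'   = (x-`x_l')/`h'                   // scaled distance from the cutoff
    gen float `K_u' = (1-abs(`u'))                    // triangular kernel
    gen float `w'   = 0             if abs(`u')> 1    // zero weight outside the bandwidth
    replace   `w'   = 1/`h' * `K_u' if abs(`u')<=1    // kernel weight inside the bandwidth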
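
    Written out, the weight computed above is the standard triangular kernel scaled by the bandwidth. The symbols c (the cutoff, 0 here) and h (the bandwidth, 24 here) are shorthand introduced only for this formula:

    $$ u_i = \frac{x_i - c}{h}, \qquad K(u_i) = \max\bigl(0,\; 1 - |u_i|\bigr), \qquad w_i = \frac{1}{h}\,K(u_i) $$

    With h = 24, the largest possible weight is w_i = 1/24, or about 0.042, attained when x_i = c.
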
    Then I declare the data to be survival-time data with stset and estimate a Cox model with some covariates (e.g. female). I checked that the covariates are non-missing for all observations. In a linear probability model (LPM) I would have used aweights, but as far as I understand the documentation, stset does not support aweights.

    Code:
    stset normalized_time [iw=`w'], failure(failure_censored)
    
    stcox female after
    
    stcurve, cumhaz at(after=(0 1))
    The weights w I calculated range from 0 to a maximum of about 0.04 (4%). I have approximately 10k observations in total. However, the weights sum to only 254.03, which is the figure Stata reports as the number of observations in the Cox model output, and this makes me wonder whether I implemented the weights correctly.
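
    As a rough plausibility check on that total (a sketch only, not a verdict on whether iweights are the right choice here): if all of the roughly 10,000 observations fell inside the bandwidth and x were spread evenly across it, the average triangular weight would be about 1/(2*24) = 1/48, so the weights would sum to roughly 208, the same order of magnitude as 254.03. The snippet below assumes the weight has been copied into a permanent variable, given the hypothetical name w, and builds a version rescaled to mean one, whose sum equals the number of observations with a positive weight.

    Code:
    * Sketch only: w is a hypothetical permanent copy of the kernel weight
    quietly summarize w if w > 0
    
    * Rescale so that the weights average one among observations with w > 0
    generate double w_norm = w / r(mean) if w > 0
    
    display "sum of raw weights = " r(sum)
    display "obs with w > 0     = " r(N)
    display "mean raw weight    = " r(mean)
    
    * The mean of w_norm should be 1, so its sum equals the count above
    summarize w_norm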

    I was also wondering whether I can have the stcurve command create 95% C.I.s for the curves, or if there is another way to easily compute 95% C.I.s for the predictions.

    Thank you all!