
  • STCOX after STSET with pweight is reporting 'no. of subjects' as the 'sum of wgt.'

    Hello. Hopefully someone can shed some light on this. I have seen it mentioned on the board, but with no explanation of why it happens.

    I have a dataset with one row per subject and am running STCOX models using entropy balancing weights generated by EBALANCE.

    The results make conceptual sense given the unweighted results and my exploration of the data. However, in the STCOX output the 'No. of subjects' is reported as the 'sum of wgt'.

    I have no ID variable in my STSET command, so I cannot figure out why this is happening.

    Unfortunately I cannot share the data.

    Any thoughts would be greatly appreciated.

    /*** OUTPUT BELOW ***/

    . stset sre_daysto_365 [pweight = ewgt_tx2_tx1] if tx == 1 | tx == 2, failure(sre_bin_365 == 1)

    Survival-time data settings

    Failure event: sre_bin_365==1
    Observed time interval: (0, sre_daysto_365]
    Exit on or before: failure
    Weight: [pweight=ewgt_tx2_tx1]
    Keep observations
    if exp: tx == 1 | tx == 2

    --------------------------------------------------------------------------
    455,422 total observations
    193,539 ignored at outset because of if exp
    --------------------------------------------------------------------------
    261,883 observations remaining, representing
    14,777 failures in single-record/single-failure data
    90410908 total analysis time at risk and under observation
    At risk from t = 0
    Earliest observed entry t = 0
    Last observed exit t = 365

    ...

    . stcox i.tx age ageatseparation ///
    > bin_sex_1 bin_sex_2 /// ref. bin_sex_1_0 Male
    > bin_raceeth_1 bin_raceeth_2 bin_raceeth_3 bin_raceeth_4 bin_raceeth_5 bin_raceeth_6 bin_raceeth_7 /// ref. bin_raceeth_0 White
    > bin_maritalstat_1 bin_maritalstat_2 bin_maritalstat_3 /// ref. bin_maritalstat_0 Married
    > bin_branchofser_1 bin_branchofser_2 bin_branchofser_3 bin_branchofser_4 /// ref. bin_branchofser_0 Army
    > op_mh_n_pri ip_mh_n_pri sre_pre_n homeless psychoses comorbidity_s i.vcl_yr i.vcl_mo i.sta3n

    Failure _d: sre_bin_365==1
    Analysis time _t: sre_daysto_365
    Weight: [pweight=ewgt_tx2_tx1]

    (sum of wgt is 16,866)


    Cox regression with Breslow method for ties

    No. of subjects = 16,866 Number of obs = 261,883
    No. of failures = 1,444
    Time at risk = 5,721,158.9
    Wald chi2(168) = 1595.86
    Log pseudolikelihood = -13516.335 Prob > chi2 = 0.0000



  • #2
    Actually, I found a good write-up on this, and what I am seeing is expected: in a weighted model, the 'No. of subjects' is reported as the sum of the weights. I suspect these weights are not appropriate for this model, so I am going to try alternative methods (IPTW or others).


    • #3
      It would be helpful for others if you could share a link to the write-up you referenced. Conceptually, "sum of weights" is the more appropriate general term, because once observations are weighted they no longer each represent one individual. (Note: this doesn't apply to frequency weights, which represent multiple copies of individual observations.)
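
      One way to check this against the output in the first post is to add up the weights by hand. The lines below are only a sketch and assume they are run right after the stset command shown there. _st, _d, _t and _t0 are the variables stset creates, ewgt_tx2_tx1 is the weight variable from the post, and w_fail and w_time are hypothetical names for scratch variables.

      ```stata
      * Sketch only: run after the stset command from the first post.

      * Unweighted number of records kept by stset
      count if _st == 1

      * Sum of the weights over the same records
      summarize ewgt_tx2_tx1 if _st == 1, meanonly
      display "sum of weights    = " %14.1f r(sum)

      * Weighted failures and weighted time at risk
      * (w_fail and w_time are hypothetical scratch variables)
      generate double w_fail = ewgt_tx2_tx1 * _d
      generate double w_time = ewgt_tx2_tx1 * (_t - _t0)

      summarize w_fail if _st == 1, meanonly
      display "weighted failures = " %14.1f r(sum)

      summarize w_time if _st == 1, meanonly
      display "weighted time     = " %14.1f r(sum)
      ```

      If the stcox header is indeed reporting weighted totals, the three displayed sums should be close to the 'No. of subjects', 'No. of failures' and 'Time at risk' lines in the output, while the count should match 'Number of obs'. This assumes stcox did not drop further records (for example, for missing covariates). If it did, replacing `_st == 1` with `e(sample)` after running stcox would restrict the sums to the estimation sample.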
