
  • Adjusting for censoring in survival analysis

    Hi

    I'm doing a survival analysis of interfirm relationships and am having trouble understanding how Stata deals with censoring. I understand the concept of censoring, and my data have both left and right censoring. My data start in 2010 and end in 2017, covering 7 years. Some relationships last until the last date of the observation window, and it is not clear whether they continue after that date. Thus, they are right censored. I also have relationships present from the first date, meaning that they started before or exactly at the first date of my observation window. Thus, these are left-censored observations.

    I have read a paper by Ongena and Smith (2001), [Ongena S., David Smith, 2001, The duration of bank relationships, Journal of Financial Economics 61:449-475], which discusses adjustments for right and left censoring in non-parametric, semi-parametric and parametric models. Can anybody explain whether Stata makes these adjustments automatically, or whether the built-in commands have options for them? I want to do non-parametric, semi-parametric and parametric estimations in Stata.

    Thanks in advance.
    Last edited by Parviz Alizada; 25 Oct 2017, 07:51.

  • #2
    Welcome to the Stata Forum / Statalist,

    The options - start - and - end - for the - stset - command may be helpful to you.
    Best regards,

    Marcos



    • #3
      Parviz:
      in echoing Marcos' welcome to the Stata forum, I would also recommend that you provide the full reference for the article you mentioned (http://www.sciencedirect.com/science...04405X01000691).
      In my opinion, this seemingly pedantic FAQ requirement rests on three considerations:
      - people who are willing to help you may find it annoying or impossible to locate the article you mentioned;
      - what we mention in our queries may well be useful to other listers/guests for different purposes: hence, a full reference can really save time;
      - this is a multidisciplinary forum (and this is a great advantage): hence, it is pretty unlikely that all the listers know all the literature of all the research fields we come from.
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Thanks for the warm welcome, Marcos and Carlo.

        Marcos, thanks for the suggestion. I'd be grateful if you could explain how the start and end options handle censoring. Any references would be appreciated.

        Carlo, thanks for mentioning it. I have included the full reference.



        • #5
          A note on terminology: relationships that started before the observation window are left truncated, not left censored. For these you need to know how long the relationship had lasted when you started observation. Consider two relationships that started in 2000 and lasted 6 and 12 years respectively. The first one never makes it to your observation window. For the second you are studying the length conditional on it lasting 10 or more years. This is what the enter option of stset handles.

          Right censoring means that you don't know how long a relationship lasted, but you know it is more than a given value. A relationship starting in 2015 and still going on in 2017 will last at least two years. This is what the failure indicator of stset takes care of. Left censoring would mean that you don't know the exact survival time, but you know that it is less than a given value. stset does not handle it, but that is not your case. (The new stintreg command in Stata 15 handles left, right and interval censoring in parametric models.)
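
          For reference, here is a minimal sketch of the data layout stintreg works with, using made-up numbers. Each observation carries a lower and an upper bound on the failure time; the variable names lo and hi are hypothetical, and the exponential distribution is chosen only to keep the toy model simple. It assumes Stata 15 or later.

          ```stata
          * Hypothetical toy data: bounds on each failure time, in years
          clear
          input lo hi
          . 3
          2 5
          4 4
          6 .
          1 2
          3 7
          . 2
          5 .
          end

          * lo missing -> left-censored (failed some time before hi)
          * hi missing -> right-censored (still going at lo)
          * lo == hi   -> failure time observed exactly
          * otherwise  -> interval-censored between lo and hi
          stintreg, interval(lo hi) distribution(exponential)
          ```

          Note that stintreg takes the two bounds directly in interval(), so no stset step is involved.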

          Hopefully you have month and year when relationships start and end, so you can get a better handle on duration. Just create a variable with the duration at the start of observation (dur0), another with the duration at the end of observation (dur), and an indicator of whether or not the relationship was over at the end of observation (ended). Then

          Code:
          stset dur, failure(ended) enter(dur0)
          You can then fit parametric or non-parametric survival models that take into account left-truncation and right-censoring.
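
          The steps above can be sketched end to end on simulated data. This is only an illustration under stated assumptions: the start months, the exponential durations, the covariate x, and the window of 2010m1 to 2017m12 are all made up, and start_m and end_m are hypothetical names for the monthly start and end dates.

          ```stata
          * Simulated, hypothetical relationships (durations in months)
          clear
          set seed 12345
          set obs 500
          gen start_m  = tm(2000m1) + floor(runiform()*210)
          gen x        = runiform() < 0.5
          gen true_dur = ceil(-60*ln(runiform())*exp(-0.5*x))
          gen end_m    = start_m + true_dur
          format start_m end_m %tm

          * Assumed observation window
          local w0 = tm(2010m1)
          local w1 = tm(2017m12)

          * Left truncation: relationships that ended by the time the window
          * opened are never observed (in real data they are simply absent)
          drop if end_m <= `w0'

          * Indicator: the relationship was over by the end of observation
          gen ended = end_m <= `w1'
          * Duration at the start of observation (0 if it began inside the window)
          gen dur0 = max(0, `w0' - start_m)
          * Duration at the end of observation or at the actual end, if earlier
          gen dur = min(end_m, `w1') - start_m

          stset dur, failure(ended) enter(time dur0)

          * Non-parametric (Kaplan-Meier), semi-parametric (Cox), parametric (Weibull)
          sts list
          stcox x
          streg x, distribution(weibull)
          ```

          After stset, the variables _t0, _t and _d hold the entry time, exit time and failure indicator that all three commands use, so it is worth checking them against dur0, dur and ended before fitting anything.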



          • #6
            Originally posted by German Rodriguez
            Thanks for your reply, German.

            So, does this mean that Stata takes care of right censoring automatically when I specify the failure variable?

            I have modified the way I told Stata that the data was survival data. Now my stset command looks like this:

            Code:
            stset end, id(firm_cust_n) time0(start) origin(time start) scale(30) failure(died) exit(time .)
            - end: end date of the event
            - start: beginning date of the event
            - died: failure occurred (1 and 0)
            - exit(time .): this specification makes sure that multiple failure events of an individual are considered. Otherwise Stata stops including the observations after the first failure.



            • #7
              My advice was for single failure events. If an individual has multiple relationships, I would treat each one as a separate survival time and introduce a shared frailty component to allow for correlation. If what you have in mind is a relationship stopping and then resuming, I would treat the second instance as a separate survival time, and would probably keep the duration clock running, i.e. one episode starts at 0 and ends at t1, then the second episode starts (resumes) at t2 > t1 and may end at t3 > t2.
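
              A minimal sketch of that layout, assuming one row per episode and a clock in months since the relationship first began. The variable names (firm, rel, t0, t1, ended) and the numbers are hypothetical.

              ```stata
              * Hypothetical toy data: one row per episode
              clear
              input firm rel t0 t1 ended
              1 1  0 14 1
              1 1 20 31 0
              1 2  0  9 1
              2 3  0 40 0
              2 4  0 22 1
              2 4 30 36 1
              end

              * Each episode is its own survival time: it enters at t0 and leaves at t1,
              * so a resumed episode keeps the clock running from where it restarts
              stset t1, failure(ended) enter(time t0)

              * A shared-frailty model could then allow for correlation within firm, e.g.
              *   stcox x, shared(firm)
              * where x stands in for covariates that this toy data does not include.
              ```

              Because id() is not specified, stset treats every row as a separate subject, which matches the "separate survival time" treatment described above.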

