  • Survival Analysis: how to define the censor variable?

    Hello everyone,
    I am not yet very familiar with survival analysis, but I want to investigate peace agreements and their durability.
    I have a dataset indicating how long each peace agreement lasted. I have a start date (the date the peace agreement was signed) and an end date (the date the peace agreement broke down).

    To declare the data as survival data, I have to define the failure event (-stset- command).
    My question now is: how do I define the censoring variable?

    Thank you,
    There Scho


  • #2
    Hopefully you have data on all peace agreements, not just the ones that broke up. The peace agreements that are still in place today, or whenever the last date of data collection was, will be the observations with censored survival times. Your data will look something like this.

    Code:
    id  startdat enddat fail
    1  23oct1993 07jun2020 0
    2  25dec1994 01jan1995 1
    3  19aug2002 28feb2010 1
    4  16jul2019 07jun2020 0
    Your -stset- will then be:

    Code:
    stset enddat, fail(fail) enter(startdat) scale(365.24)
    The scale() option is needed only if you want to rescale the time units from days to years.

    If you only have data on the peace agreements that broke up, then the code is the same (except you will only have two of the 4 observations above). In that case you can actually exclude the fail variable and the failure() option from -stset- (Stata will assume that all observations fail).
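
    For that failures-only case, a minimal sketch of the shortened command, assuming the same variable names as in the example above:

    ```stata
    * Sketch only: no failure() option, so every observation is treated as a failure
    stset enddat, enter(startdat) scale(365.24)
    ```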

    Be aware that if not all peace agreements break up and you only have data on the ones that do, you need to be careful. (Disclaimer: I'm a medical researcher and know nothing about peace agreements or their statistical analysis.) When you only have information on the observations that failed, your data are known as "right truncated" and a standard survival analysis can be biased. A classic example is the set of studies that claimed to show that right-handers live longer than left-handers. Those studies were based on death records (so there were data only on those who died), and a "standard" analysis gave very misleading results.




    • #3
      Thank you very much!
      What if there are missing values in my data for the end date or the start date? How should I handle them?



      • #4
        I have done it that way; however, I am not sure how to interpret the output of the -stset- command (attached document).
        What does "Probable Error" mean?
        Attached Files



        • #5
          Is it possible to show us an example of some of your data? It's hard to identify the exact problem without more info.

          Is it possible that you have 127 (or 128) observations of peace agreements that ended but you haven't specified the exit date for those that didn't end?

          Regarding missing data on start or end dates, the approach depends on the extent and reasons for the missingness. Any approach will require assumptions and the appropriateness of assumptions will depend on the reason for missingness.
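
          Before choosing an approach, it may help to see how much is missing and where. A rough sketch, assuming the variable names from the example in #2 (startdat, enddat, fail); your own names will differ:

          ```stata
          * Sketch only: variable names are taken from the illustrative data in #2
          * How many start dates are missing?
          count if missing(startdat)
          * Missing end dates among agreements recorded as having failed
          count if missing(enddat) & fail == 1
          * Missing end dates among agreements still in place (candidates for censoring)
          count if missing(enddat) & fail == 0
          ```

          Missing end dates among agreements that have not failed are not necessarily a data problem. They may simply be observations that still need a censoring date.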



          • #6
            The data is the UCDP Peace Agreement Dataset: https://ucdp.uu.se/downloads/index.html#peaceagreement

            Here is some of my data:
            duration is the original variable that gives the end date (it was a string variable, and I converted it into a date variable called enddate)
            pa_date is the original variable giving the date the peace agreement was signed (also a string variable, which I converted into a date variable called signdate)
            ended is my censoring variable

            UcdpAgr  UcdpCon        year  pa_date     ended  duration    enddate    signdate
            24       11345          2015  2015-01-21  1      2016-07-08  08jul2016  21jan2015
            28       11345          2015  2015-02-01  1      2016-07-08  08jul2016  01feb2015
            29       11345          2015  2015-08-17  1      2016-07-08  08jul2016  17aug2015
            31       11345          2012  2012-02-27  0                  .          27feb2012
            1480     11345          2014  2014-05-09  0                  .          09may2014
            1482     11345          2014  2014-05-09  0                  .          09may2014
            1484     11345          2014  2014-10-20  1      2016-07-08  08jul2016  20oct2014
            1625     11345          2018  2018-09-12  0                  .          12sep2018
            1444     11348          2012  2012-09-27  0                  .          27sep2012
            1456     11348          2012  2012-09-27  0                  .          27sep2012
            1457     11348          2012  2012-09-27  0                  .          27sep2012
            1531     11348          2013  2013-03-12  0                  .          12mar2013
            1536     13246          2014  2014-09-05  1      2014-11-03  03nov2014  05sep2014
            1537     13306          2015  2015-02-12  0                  .          12feb2015
            1058     209            1995  1995-10-13  0                  .          13oct1995
            1512     218            1999  1999-02-21  0                  .          21feb1999
            1513     218            1999  1999-02-21  0                  .          21feb1999
            1613     221            2011  2011-12-11  0                  .          11dec2011
            1485     221, 222, 264  2015  2015-10-15  0                  .          15oct2015
            1611     221, 264       2015  2015-02-12  0                  .          12feb2015
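
            The string-to-date conversion described above can be sketched roughly as follows. This assumes pa_date and duration are strings in YYYY-MM-DD form, as they appear in the listing. It is an illustration, not necessarily the exact commands that were run.

            ```stata
            * Sketch only: assumes pa_date and duration are strings such as "2015-01-21"
            generate signdate = date(pa_date, "YMD")
            generate enddate  = date(duration, "YMD")
            * Display the numeric dates in a readable daily format, e.g. 21jan2015
            format signdate enddate %td
            ```

            An empty duration string converts to a missing enddate, which is what the rows with ended equal to 0 show.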


            Thank You!



            • #7
              Here is the data again:
              Attached Files



              • #8
                You need an "end date" for the observations where the agreements are still in place (the censored observations). That is, the last date where it was known that the peace agreement was still in place. Use, for example, the closing date for the data collection.
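
                With the variable names from #6, that could look something like the sketch below. The closing date used here (31dec2018) is only a placeholder, not the actual closing date of the UCDP data collection. Replace it with the correct date for your version of the dataset.

                ```stata
                * Placeholder closing date: replace with the real end of data collection
                local closedate = td(31dec2018)
                * Give censored agreements (ended == 0) the closing date as their end date
                replace enddate = `closedate' if ended == 0 & missing(enddate)
                * ended == 1 marks a failure; ended == 0 marks a censored observation
                stset enddate, failure(ended) enter(time signdate) scale(365.24)
                ```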



                • #9
                  I have another question: I want to reshape my data into long format. I found the command 'tsfill' for time-varying variables. The data are the same as listed above. However, it does not work, because Stata wants me to declare the data with tsset, whereas I need to declare them with stset for my survival analysis.
                  Is there another way to get the data into long format?

                  Thank you!



                  • #10
                    I'm not familiar with tsfill, but one possibility is to set your data using tsset to reshape the data. You can then declare stset afterward. Have you tried using the reshape command? Type: 'help reshape' in Stata's command window. You should be able to shape from wide to long. You can then stset.
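
                    A different route, not the tsset/reshape approach described above, is -stsplit-, which splits records that have already been stset into several episodes per subject. A rough sketch, assuming UcdpAgr uniquely identifies each agreement and that enddate has already been filled in for the censored agreements:

                    ```stata
                    * Sketch only: assumes UcdpAgr is a unique agreement identifier
                    * origin() makes analysis time run from the signing date, in years
                    stset enddate, failure(ended) origin(time signdate) id(UcdpAgr) scale(365.24)
                    * Split each agreement into one record per year since signing
                    * yr is a hypothetical name for the new variable holding each interval's start
                    stsplit yr, every(1)
                    ```

                    Time-varying covariates could then be merged onto the split records. Whether this fits depends on what the long format is needed for.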



                    • #11
                      Thank you. Which variables have to follow tsset? The identifier variables?
