Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Working with Dates in Panel Data

    I'm trying to compute and incidence case based on test_1. So an incident case will be a positive test_1 and if and ID has multiple positive test_1, they only need to be counted as incident case if there's at least a 90 days period between the last positive. See the attached sample data.

    Thanks,

    David

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(ID date) str8(test_1 test_2)
    1000 22281 "negative" "negative"
    1000 22295 "positive" "positive"
    1000 22310 "negative" "positive"
    1000 22326 "negative" "positive"
    1000 22339 "negative" "positive"
    1000 22353 "negative" "positive"
    1000 22368 "positive" "positive"
    1000 22380 "negative" "negative"
    1000 22400 "negative" "negative"
    1000 22385 "negative" "negative"
    1000 22400 "positive" "positive"
    1000 22415 "positive" "positive"
    1001 22281 "negative" "positive"
    1001 22295 "negative" "positive"
    1001 22310 "negative" "negative"
    1001 22326 "negative" "negative"
    1001 22339 "negative" "negative"
    1001 22353 "negative" "negative"
    1001 22368 "negative" "negative"
    1004 22281 "negative" "negative"
    1004 22295 "negative" "negative"
    1004 22310 "negative" "negative"
    1004 22326 "negative" "negative"
    1004 22339 "negative" "negative"
    1004 22353 "negative" "negative"
    1004 22368 "negative" "negative"
    1004 22380 "negative" "negative"
    1004 22400 "negative" "negative"
    1004 22385 "negative" "negative"
    1004 22400 "negative" "negative"
    1005 22281 "positive" "positive"
    1005 22295 "negative" "positive"
    1005 22310 "negative" "positive"
    1005 22326 "negative" "positive"
    1005 22339 "negative" "positive"
    1005 22353 "negative" "positive"
    1005 22368 "negative" "positive"
    1005 22380 "negative" "positive"
    1010 22281 "positive" "negative"
    1010 22295 "positive" "positive"
    1010 22310 "positive" "positive"
    1010 22326 "negative" "positive"
    1010 22339 "negative" "positive"
    1010 22353 "negative" "positive"
    1010 22368 "negative" "positive"
    1010 22380 "negative" "positive"
    1010 22400 "negative" "positive"
    1010 22385 "negative" "positive"
    1010 22400 "positive" "positive"
    1010 22415 "negative" "positive"
    1010 22430 "negative" "positive"
    end
    format %tdnn/dd/CCYY date

  • #2
    It is not clear what you want your output to look like. Do you just want to tag observations for which test1=="positive" and the last test1=="positive" was within 90 days? If so:

    Code:
    bys ID (test_1 date): gen wanted= test_1=="positive" & test_1[_n-1]== "positive" & (date-date[_n-1]<=90)
    Res.:

    Code:
    
    . sort ID date
    
    . list, sepby(ID)
    
         +-------------------------------------------------+
         |   ID        date     test_1     test_2   wanted |
         |-------------------------------------------------|
      1. | 1000    1/1/2021   negative   negative        0 |
      2. | 1000   1/15/2021   positive   positive        0 |
      3. | 1000   1/30/2021   negative   positive        0 |
      4. | 1000   2/15/2021   negative   positive        0 |
      5. | 1000   2/28/2021   negative   positive        0 |
      6. | 1000   3/14/2021   negative   positive        0 |
      7. | 1000   3/29/2021   positive   positive        1 |
      8. | 1000   4/10/2021   negative   negative        0 |
      9. | 1000   4/15/2021   negative   negative        0 |
     10. | 1000   4/30/2021   negative   negative        0 |
     11. | 1000   4/30/2021   positive   positive        1 |
     12. | 1000   5/15/2021   positive   positive        1 |
         |-------------------------------------------------|
     13. | 1001    1/1/2021   negative   positive        0 |
     14. | 1001   1/15/2021   negative   positive        0 |
     15. | 1001   1/30/2021   negative   negative        0 |
     16. | 1001   2/15/2021   negative   negative        0 |
     17. | 1001   2/28/2021   negative   negative        0 |
     18. | 1001   3/14/2021   negative   negative        0 |
     19. | 1001   3/29/2021   negative   negative        0 |
         |-------------------------------------------------|
     20. | 1004    1/1/2021   negative   negative        0 |
     21. | 1004   1/15/2021   negative   negative        0 |
     22. | 1004   1/30/2021   negative   negative        0 |
     23. | 1004   2/15/2021   negative   negative        0 |
     24. | 1004   2/28/2021   negative   negative        0 |
     25. | 1004   3/14/2021   negative   negative        0 |
     26. | 1004   3/29/2021   negative   negative        0 |
     27. | 1004   4/10/2021   negative   negative        0 |
     28. | 1004   4/15/2021   negative   negative        0 |
     29. | 1004   4/30/2021   negative   negative        0 |
     30. | 1004   4/30/2021   negative   negative        0 |
         |-------------------------------------------------|
     31. | 1005    1/1/2021   positive   positive        0 |
     32. | 1005   1/15/2021   negative   positive        0 |
     33. | 1005   1/30/2021   negative   positive        0 |
     34. | 1005   2/15/2021   negative   positive        0 |
     35. | 1005   2/28/2021   negative   positive        0 |
     36. | 1005   3/14/2021   negative   positive        0 |
     37. | 1005   3/29/2021   negative   positive        0 |
     38. | 1005   4/10/2021   negative   positive        0 |
         |-------------------------------------------------|
     39. | 1010    1/1/2021   positive   negative        0 |
     40. | 1010   1/15/2021   positive   positive        1 |
     41. | 1010   1/30/2021   positive   positive        1 |
     42. | 1010   2/15/2021   negative   positive        0 |
     43. | 1010   2/28/2021   negative   positive        0 |
     44. | 1010   3/14/2021   negative   positive        0 |
     45. | 1010   3/29/2021   negative   positive        0 |
     46. | 1010   4/10/2021   negative   positive        0 |
     47. | 1010   4/15/2021   negative   positive        0 |
     48. | 1010   4/30/2021   negative   positive        0 |
     49. | 1010   4/30/2021   positive   positive        1 |
     50. | 1010   5/15/2021   negative   positive        0 |
     51. | 1010   5/30/2021   negative   positive        0 |
         +-------------------------------------------------+
    
    .
    Last edited by Andrew Musau; 10 Feb 2022, 13:56.

    Comment


    • #3
      Hey Andrew, So what I want is to create a variable for incident cases. So if an ID has only one positive test, they count as incident. But if they have multiple positives, they could only be counted as incident if the time between their previous positive and the current positive is >= 90 days. So every row for that variable should have a 1 for an incident case and a 0 if otherwise. So the code you provided doesn't see to pic incident cases. For example for ID 1000 he had in incident case 1/15/2021 and another on 4/30/2021. The positive tests on 3/29/2021 and 5/15/2021 should not count and incident cases.

      Thanks,

      David

      Comment


      • #4
        Code:
        by ID (date), sort: gen ref_date = cond(test_1 == "positive", date, .) if _n == 1
        by ID (date): replace ref_date = cond(test_1 == "positive" & ///
            (date > ref_date[_n-1] + 90 | missing(ref_date[_n-1])), date, ref_date[_n-1]) ///
            if _n > 1
        format %td ref_date
        
        gen byte incident_case = test_1 == "positive" & date == ref_date
        I'm a little confused by your post, but I think the above is what you want. Am I correct that the variable test_2 is irrelevant here. Also, there are a few observations that are complete duplicates, and some others that have the same id and date, but different results for test_1 or test_2. I'm not sure if those are data errors or represent something else. The main relevance of this here is that if you have a situation where the same id has two positive test_1's on the same date, they will both be considered incident cases (if they are the first positives or have had no positives within 90 days before) or both not. There is no clear way I can see to designate one vs the other as "the" incident case in this circumstance.

        Comment


        • #5
          Another way, but as Clyde remarks, you should not have duplicates for ID test_1 date for positive cases.

          Code:
          bys ID test_1 date: assert _N==1 if test_1== "positive"
          bys ID test_1 (date): gen wanted= date- date[1] if test_1=="positive"
          bys ID test_1 (date): replace wanted= . if inrange(wanted, 1, 90) | (wanted- wanted[_n-1]<90 & !missing(wanted) & !missing(wanted[_n-1]))
          replace wanted=!missing(wanted)
          Last edited by Andrew Musau; 10 Feb 2022, 16:17.

          Comment


          • #6
            Thanks Clyde. This works for me. This was a sample data I generated not the actual data I'm working with. Disregarding the duplicates and test_2 was irrelevant.

            Thanks,

            David

            Comment


            • #7
              Thanks Andrew and apologies for not being clear at first. Your code now works.

              David

              Comment

              Working...
              X