Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Finding combination treatments in panel data

    Hello,

    I have panel data where each row represents a different prescription for a patient. A simplified example is below (end_supply = start_supply + prescription coverage). I need to identify all combination treatments that occur during overlapping time periods, and eventually I would also need to identify treatment lines (where patients may switch to another combo to another). But my first priority would be to identify the combination treatments and I am currently stumped. Is there an easy way to do this? Thank you in advance!


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(patient_id pres_coverage prescript_date) str17 treatment_name float(start_supply end_supply)
     1   29.85 21422 "Retinoids"    21422 21522
     1       5 21422 "Acitretin"    21422 21522
     1      30 21429 "Acitretin"    21429 21529
     1     100 21462 "Retinoids"    21462 21562
     1      30 21476 "Acitretin"    21476 21576
     1  16.667 21489 "Retinoids"    21489 21589
     1      30 21489 "Acitretin"    21489 21589
     2     100 21344 "Retinoids"    21344 21444
     2     100 21344 "Urea"         21344 21444
     2     100 21400 "Retinoids"    21400 21500
     2     100 21400 "Urea"         21400 21500
     2       7 21449 "Antibiotics"  21449 21549
     2     100 21656 "Retinoids"    21656 21756
     2  28.039 21656 "Adalimumab"   21656 21756
     3     100 21103 "Retinoids"    21103 21203
     3     100 21114 "Retinoids"    21114 21214
     3     100 21141 "Retinoids"    21141 21241
     3     100 21159 "Retinoids"    21159 21259
     3      15 21169 "Acitretin"    21169 21269
     3     100 21169 "Retinoids"    21169 21269
     3     100 21203 "Retinoids"    21203 21303
     4     100 21180 "Retinoids"    21180 21280
     4      40 21180 "Retinoids"    21180 21280
     4       5 21180 "Antibiotics"  21180 21280
     4     100 21264 "Retinoids"    21264 21364
     4      40 21264 "Retinoids"    21264 21364
     4     100 21316 "Retinoids"    21316 21416
     4  53.333 21316 "Retinoids"    21316 21416
     5     100 20924 "Retinoids"    20924 21024
     5      12 20936 "Antibiotics"  20936 21036
     5      40 20984 "Retinoids"    20984 21084
     5  66.667 20984 "Retinoids"    20984 21084
     5  66.667 21015 "Retinoids"    21015 21115
     5  93.545 21033 "Retinoids"    21033 21133
     5     100 21048 "Retinoids"    21048 21148
     6      25 21568 "Retinoids"    21568 21668
     6      15 21637 "Acitretin"    21637 21737
     6      50 21652 "Retinoids"    21652 21752
     6      50 21652 "Retinoids"    21652 21752
     6      50 21673 "Retinoids"    21673 21773
     6      50 21701 "Retinoids"    21701 21801
     6      50 21754 "Retinoids"    21754 21854
     7      20 21343 "Ciclosporin"  21343 21443
     7     100 21379 "Retinoids"    21379 21479
     7      20 21379 "Ciclosporin"  21379 21479
     7      20 21439 "Ciclosporin"  21439 21539
     7      20 21491 "Ciclosporin"  21491 21591
     7     100 21491 "Retinoids"    21491 21591
     7     100 21631 "Retinoids"    21631 21731
     8     200 21436 "Retinoids"    21436 21536
     8     100 21436 "Pimecrolimus" 21436 21536
     8     100 21436 "Urea"         21436 21536
     8      50 21468 "Retinoids"    21468 21568
     8      25 21468 "Urea"         21468 21568
     8     200 21581 "Retinoids"    21581 21681
     8     200 21600 "Retinoids"    21600 21700
     9     100 21575 "Retinoids"    21575 21675
     9   6.667 21604 "Antibiotics"  21604 21704
     9  28.039 21926 "Adalimumab"   21926 22026
     9  84.116 21929 "Adalimumab"   21929 22029
     9  84.116 22090 "Adalimumab"   22090 22190
     9      15 22193 "Acitretin"    22193 22293
     9  84.116 22212 "Adalimumab"   22212 22312
    10  66.667 21447 "Retinoids"    21447 21547
    10  33.333 21447 "Urea"         21447 21547
    10  33.333 21523 "Urea"         21523 21623
    10  28.571 21526 "Adalimumab"   21526 21626
    10  85.714 21532 "Adalimumab"   21532 21632
    10  84.116 21624 "Adalimumab"   21624 21724
    10      50 21709 "Pimecrolimus" 21709 21809
    end
    format %td prescript_date
    format %td start_supply
    format %td end_supply


  • #2
    I don't understand. You say that end_supply = start_supply + pres_coverage, but that is clearly not true in the example data you show; it's not even close to true.

    Furthermore, I'm not sure I understand what you want the final result to look like. Do you want to re-organize the data so that each patient has one or more observations, and each observation consists of a period of time during which he or she has been prescribed an unchanging set of medicines, a new observation being created each time a medicine is either added or discontinued, along with the begin and end dates of those intervals? And if that's the idea, do you want a separate variable for each medicine (i.e. med1, med2, med3, ... however many are needed) or do you want a single variable that lists the medicines combined (e.g. a value like "Retinoids, Acitretin")?

    Also, in your example data, the variable prescript_date is always the same as start_supply. If this is true in your full data set, what is the point of having both variables? And if it is not true, what role does prescript_date play in this calculation, if any?
    Last edited by Clyde Schechter; 21 Feb 2022, 12:20.

    Comment


    • #3
      Hi Clyde,

      I must apologize - to keep things more anonymized, I tried to edit the dates from the original dates, but did so incorrectly. Please see the updated example dataset below with the end_supply dates that make sense. I also removed prescript_date as it is the same as start_supply and therefore both variables are not necessary. I also included less patients in the example but more observations per patient to be truer to the original dataset.

      For the final result, I believe it would be best to have the first option you described: to re-organize the data so that each patient has one or more observations, and each observation consists of a period of time during which he or she has been prescribed an unchanging set of medicines, a new observation being created each time a medicine is either added or discontinued, along with the begin and end dates of those intervals. A new variable for each medicine would be useful in the end for reporting purposes.

      Thank you for your patience! Any help would be greatly appreciated.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(patient_id pres_coverage) str17 treatment_name float(start_supply end_supply)
      1  29.85 "Retinoids"    21322  21351.85
      1      5 "Acitretin"    21322     21327
      1     30 "Acitretin"    21329     21359
      1    100 "Retinoids"    21362     21462
      1     30 "Acitretin"    21376     21406
      1 16.667 "Retinoids"    21389  21405.67
      1     30 "Acitretin"    21389     21419
      1 16.667 "Retinoids"    21416  21432.67
      1     30 "Acitretin"    21423     21453
      1     50 "Retinoids"    21444     21494
      1     50 "Retinoids"    21455     21505
      1    100 "Retinoids"    21468     21568
      1     50 "Retinoids"    21468     21518
      1    100 "Retinoids"    21487     21587
      1    100 "Retinoids"    21504     21604
      1    100 "Retinoids"    21525     21625
      1    100 "Retinoids"    21535     21635
      1    100 "Retinoids"    21574     21674
      1     15 "Acitretin"    21574     21589
      1    100 "Retinoids"    21601     21701
      2    100 "Urea"         21244     21344
      2    100 "Retinoids"    21244     21344
      2    100 "Urea"         21300     21400
      2    100 "Retinoids"    21300     21400
      2      7 "Antibiotics"  21349     21356
      2 28.039 "Adalimumab"   21556  21584.04
      2    100 "Retinoids"    21556     21656
      2 84.116 "Adalimumab"   21571 21655.115
      2 84.116 "Adalimumab"   21662 21746.115
      2    100 "Retinoids"    21662     21762
      2     20 "Urea"         21764     21784
      2     50 "Retinoids"    21764     21814
      2    250 "Retinoids"    21781     22031
      2     25 "Retinoids"    21781     21806
      2 84.116 "Adalimumab"   21802 21886.115
      2 84.116 "Adalimumab"   21882 21966.115
      2     15 "Pimecrolimus" 21882     21897
      3    100 "Retinoids"    21003     21103
      3    100 "Retinoids"    21014     21114
      3    100 "Retinoids"    21041     21141
      3    100 "Retinoids"    21059     21159
      3     15 "Acitretin"    21069     21084
      3    100 "Retinoids"    21069     21169
      3    100 "Retinoids"    21103     21203
      3     15 "Acitretin"    21118     21133
      3     10 "Antibiotics"  21118     21128
      3     15 "Acitretin"    21197     21212
      3 28.571 "Adalimumab"   21255  21283.57
      3     15 "Acitretin"    21292     21307
      3    200 "Retinoids"    21595     21795
      3    100 "Retinoids"    21612     21712
      4    100 "Retinoids"    21080     21180
      4     40 "Retinoids"    21080     21120
      4      5 "Antibiotics"  21080     21085
      4    100 "Retinoids"    21164     21264
      4     40 "Retinoids"    21164     21204
      4    100 "Retinoids"    21216     21316
      4 53.333 "Retinoids"    21216  21269.33
      4     50 "Retinoids"    21332     21382
      4    100 "Retinoids"    21333     21433
      4    100 "Retinoids"    21333     21433
      4    7.5 "Antibiotics"  21364   21371.5
      4 28.571 "Adalimumab"   21410  21438.57
      4    100 "Retinoids"    21424     21524
      4 28.571 "Adalimumab"   21424  21452.57
      4 28.571 "Adalimumab"   21454  21482.57
      4 85.714 "Adalimumab"   21475 21560.715
      4    100 "Retinoids"    21531     21631
      4     50 "Retinoids"    21532     21582
      4    100 "Retinoids"    21563     21663
      4 84.116 "Adalimumab"   21563 21647.115
      5    100 "Retinoids"    20824     20924
      5     12 "Antibiotics"  20836     20848
      5     40 "Retinoids"    20884     20924
      5 66.667 "Retinoids"    20884  20950.67
      5 66.667 "Retinoids"    20915  20981.67
      5 93.545 "Retinoids"    20933 21026.545
      5    100 "Retinoids"    20948     21048
      5    100 "Retinoids"    20956     21056
      5    100 "Retinoids"    21038     21138
      5    100 "Retinoids"    21048     21148
      5    100 "Retinoids"    21090     21190
      5    100 "Retinoids"    21118     21218
      5    100 "Retinoids"    21153     21253
      5 28.571 "Adalimumab"   21166  21194.57
      5 85.714 "Adalimumab"   21167 21252.715
      5 85.714 "Adalimumab"   21230 21315.715
      5    100 "Retinoids"    21256     21356
      5 85.714 "Adalimumab"   21321 21406.715
      5    100 "Retinoids"    21321     21421
      5 85.714 "Adalimumab"   21412 21497.715
      6     25 "Retinoids"    21468     21493
      6     15 "Acitretin"    21537     21552
      6     50 "Retinoids"    21552     21602
      6     50 "Retinoids"    21552     21602
      6     50 "Retinoids"    21573     21623
      6     50 "Retinoids"    21601     21651
      6     50 "Retinoids"    21654     21704
      6 28.039 "Adalimumab"   21714  21742.04
      6     20 "Urea"         21714     21734
      end
      format %td start_supply
      format %td end_supply

      Comment


      • #4
        So I believe this does it:
        Code:
        replace end_supply = round(end_supply)
        assert end_supply >= start_supply
        
        //  FIRST IDENTIFY CONTINUING SPELLS OF USE OF ANY DRUG
        gen long obs_no = _n
        reshape long @_supply, i(obs_no) j(event) string
        rename _supply date
        gsort patient_id treatment_name date -event
        
        by patient_id treatment_name: gen n_rx = sum((event == "start") - (event == "end"))
        by patient_id treatment_name: gen spell_num = sum(n_rx == 1 & inlist(n_rx[_n-1], 0, .))
        sort patient_id treatment_name spell_num, stable
        by patient_id treatment_name spell_num: keep if inlist(_n, 1, _N)
        by patient_id treatment_name spell_num: assert event[1] == "start" & event[_N] == "end"
        
        //  NOW AGGREGATE TO CONTINUOUS SPELLS OF CONSTANT REGIMEN
        drop spell_num n_rx
        // gsort patient_id date -event
        // by patient_id (date): assert event[1] == "start"
        // by patient_id (date): gen regimen = treatment_name if _n == 1
        // by patient_id (date): replace regimen = cond(event == "start", ///
        //     regimen[_n-1] + "," + treatment_name, ///
        //     subinstr(regimen[_n-1], treatment_name, "", .)) if _n == 3
            
        capture program drop one_patient
        program define one_patient
            gsort date -event
            gen regimen = treatment_name in 1
            replace regimen = cond(event == "start", ///
            regimen[_n-1] + " " + treatment_name, ///
            subinstr(regimen[_n-1], treatment_name, "", .)) in 2/L
            exit
        end
        
        runby one_patient, by(patient_id)
        replace regimen = trim(itrim(regimen))
        
        sort patient_id date, stable
        by patient_id date: keep if _n == _N
        by patient_id (date): gen to = date[_n+1] - 1, after(date)
        rename date from
        format to %td
        keep patient_id from to regimen
        split regimen, gen(med) parse(" ")
        drop regimen
        -runby- is written by Robert Picard and me, and is available from SSC.

        Note: This code relies crucially (without verification) on the assumption that treatment names never contain embedded blanks. If that is not true of the whole data set, or if you are not sure whether it is true or not, run -replace treatment_name = subinstr(treatment_name, " ", "_", .)- before using this code. If need be, you can reverse that on each of the newly created med* variables after the code above has run.

        Comment


        • #5
          Thank you so much Clyde! This solution worked really well!

          Comment

          Working...
          X