assigning a random switch date of switchers to non-switchers

Christine Baumgartner

Join Date: Jun 2024

Posts: 2
#1

16 Jun 2024, 16:55

Hi everyone,

I am working with data from a prospective cohort study of patients taking 3 drugs. Among these patients, I would like to compare the characteristics and outcomes of those who remained on 3 drugs during follow-up (non-switchers) with those who switched to a simpler 2-drug regimen during follow-up (switchers). The "index date" (or baseline date) of switchers will be defined as the date of switching (switch_date) to the 2-drug regimen. To define an index date for non-switchers, I would like to randomly assign a switch date of switchers to each non-switcher. There are more switchers than non-switchers in the dataset. Can anyone help me with how to do this? The problem is that the follow-up start dates and durations of the non-switchers vary, so randomly assigning a switch date of switchers to non-switchers results in some "index dates" that fall outside the actual follow-up time of the non-switchers.

This is how I tried:

First, I generated a dataset of non-switchers, including a unique id (variable id), the start of follow-up (i.e., the date when the patient started using 3 drugs, variable firstmoddate), and the end of follow-up (variable lastenddate).

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input double id float(firstmoddate lastenddate)
10184 19663 23139
10190 19663 23260
10358 19663 23152
10366 19663 23209
10405 19663 21171
10435 19663 23111
10468 19663 23160
10555 19663 23195
10556 19663 23230
10568 19663 20044
end
format %td firstmoddate
format %td lastenddate

I then assigned a random number (variable rand) to every id, and I saved this dataset as nonswitchers.dta:

Code:

set seed 20240516
* draw from 1..977 so that every value matches a row number in the switch-date dataset
generate rand = runiformint(1,977)
save "nonswitchers.dta", replace

Second, I extracted all switch dates of the switchers (n=977) to a separate dataset.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float switchdate
20346
21110
20835
21717
19722
19705
20922
21257
20516
23112
end
format %td switchdate

I then assigned a number from 1 to 977 (variable rand) to each switch date, and merged the dataset containing the switch dates with the dataset of the non-switchers on the variable rand:

Code:

gen rand=_n
merge 1:m rand using "nonswitchers.dta"
keep if _merge==3

The problem is that more than 30% of the switch dates that were randomly assigned to the non-switchers fall outside of the follow-up time (i.e., before firstmoddate or after lastenddate). Is there a way to randomly assign a switch date of switchers to non-switchers that is within the follow-up time of the non-switchers?
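The share of assigned dates that fall outside follow-up can be checked along the following lines. This is only a minimal sketch: the three id/date pairings are hypothetical and chosen for illustration, and the variable outside is a name introduced here.

```stata
* Hypothetical pairings of non-switchers with randomly assigned switch dates
clear
input double id float(firstmoddate lastenddate switchdate)
10184 19663 23139 20346
10405 19663 21171 21717
10568 19663 20044 23112
end
format %td firstmoddate lastenddate switchdate

* Flag assigned dates before firstmoddate or after lastenddate
gen byte outside = !inrange(switchdate, firstmoddate, lastenddate)

* Proportion of assignments outside the follow-up interval
tabulate outside
count if outside
```

In this toy data the last two assigned dates lie after lastenddate, so they are flagged.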

I use Stata version 16.1 on Windows.

Thanks!

Christine
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

16 Jun 2024, 18:05

The trick is to first match up each switch date with every non-switcher's follow-up interval that contains it, and then, for each non-switcher, select a switch date at random from among just those matches. For this, Robert Picard's -rangejoin- command is the perfect tool. It is available from SSC. To use it, you must also install -rangestat-, by Robert Picard, Nick Cox, and Roberto Ferrer, also available from SSC. Here's how it works in your example data:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input double id float(firstmoddate lastenddate)
10184 19663 23139
10190 19663 23260
10358 19663 23152
10366 19663 23209
10405 19663 21171
10435 19663 23111
10468 19663 23160
10555 19663 23195
10556 19663 23230
10568 19663 20044
end
format %td firstmoddate
format %td lastenddate
tempfile non_switchers
save `non_switchers'

* Example generated by -dataex-. For more info, type help dataex
clear
input float switchdate
20346
21110
20835
21717
19722
19705
20922
21257
20516
23112
end
format %td switchdate
tempfile switchdates
save `switchdates'

use `non_switchers', clear
* pair each non-switcher with every switch date inside [firstmoddate, lastenddate]
rangejoin switchdate firstmoddate lastenddate using `switchdates'
* keep one of the paired switch dates at random for each non-switcher
set seed 20240516
gen double shuffle = runiform()
by id (shuffle), sort: keep if _n == 1
rename switchdate index_date

Notes:
1. If there is a non-switcher whose follow-up interval does not contain any of the switch dates, this code will assign a missing value to that non-switcher's index date.
2. I saved your example data in tempfiles, but that is just for my convenience in this context. You can use your actual permanent data sets for this purpose.
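
The behavior described in note 1 can be checked after the assignment. The following is a minimal sketch, assuming -rangejoin- and -rangestat- are installed from SSC. It uses a shortened version of the example data plus one hypothetical non-switcher (id 99999, not in the original data) whose follow-up interval contains none of the switch dates.

```stata
* Requires -rangejoin- and -rangestat- (ssc install rangejoin, ssc install rangestat)
clear
input double id float(firstmoddate lastenddate)
10184 19663 23139
10405 19663 21171
10568 19663 20044
99999 19663 19700
end
format %td firstmoddate lastenddate
* id 99999 is hypothetical: no switch date below falls within its follow-up
tempfile non_switchers
save `non_switchers'

clear
input float switchdate
20346
21110
19722
23112
end
tempfile switchdates
save `switchdates'

use `non_switchers', clear
rangejoin switchdate firstmoddate lastenddate using `switchdates'
set seed 20240516
gen double shuffle = runiform()
by id (shuffle), sort: keep if _n == 1
rename switchdate index_date
format %td index_date

* Every assigned index date must lie inside the follow-up interval
assert inrange(index_date, firstmoddate, lastenddate) if !missing(index_date)

* Non-switchers left without an index date
count if missing(index_date)
list id firstmoddate lastenddate if missing(index_date)
```

Non-switchers listed by the last command have no eligible switch date and would need to be handled separately, for example by excluding them from the comparison.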
Christine Baumgartner

Join Date: Jun 2024

Posts: 2
#3

17 Jun 2024, 06:04

Dear Clyde,

Perfect, thank you very much, it worked!

Christine