Randomisation using permuted blocks of varying size

Laura Myles

Join Date: Jun 2018

Posts: 153
#1


10 Dec 2020, 11:35

Hi Listers,

I would like to create a randomisation list for a 2-arm study, using permuted blocks of varying size (8, 10 and 12). The code below is for a fixed block size of 8, so I was hoping someone could share how to modify it to allow varying block sizes.

clear
set obs 160
egen arm = seq(), to(2)
egen block = seq(), block(8)
set seed 314159
gen random = uniform()
bysort block (random): gen byte seqn = _n
bysort block (seqn): l seqn arm

I am aware of the -ralloc- command but I would like to modify code I found online to be able to understand what is done. I also find it surprising when using ralloc that even if I specify the number of participants to be allocated to be 340, it creates a list of 346 - can this be fixed?

ralloc blknum blksiz Rx, ns(340) ntr(2) init(8) osiz(3) sav(mylist) idvar(ID) tab seed(314159)

Last edited by Laura Myles; 10 Dec 2020, 12:12.
Daniel Schaefer

Join Date: Mar 2020

Posts: 810
#2

10 Dec 2020, 12:30

Hello Laura,

There are a number of different ways one could vary block sizes. For example, one could complete a block, then select the next block size randomly from the given range. Alternatively, one could complete a block of, say, 6, then move on to 7, then 8, and so on. There are other options, and the implementation likely depends on why you want to vary block sizes in the first place.

Edit: So I should ask: what exactly did you have in mind?
Laura Myles

Join Date: Jun 2018

Posts: 153
#3

10 Dec 2020, 12:40

Hi Daniel,

Thanks for your reply. I was advised to use block randomisation to ensure good balance, and to use varying block sizes to avoid predictability. As I have 2 conditions to randomise participants to, I would like the block sizes to be 8, 10 and 12. The example code I have only uses a block size of 8, so I was hoping I could modify the code above. Any advice?
Brad Anderson

Join Date: Sep 2014

Posts: 70
#4

10 Dec 2020, 12:48

Originally posted by Laura Myles

I am aware of the -ralloc- command but I would like to modify code I found online to be able to understand what is done. I also find it surprising when using ralloc that even if I specify the number of participants to be allocated to be 340, it creates a list of 346 - can this be fixed?

ralloc blknum blksiz Rx, ns(340) ntr(2) init(8) osiz(3) sav(mylist) idvar(ID) tab seed(314159)

Curious why it's a problem to have a list of 346? Your smallest block size was 8, so before the last block there might have been 338 random assignments, and ralloc will fill out the last block even though that exceeds the specified n. If you reach your target n, you just won't use the randomizations for 341-346.

Daniel Schaefer

Join Date: Mar 2020
Posts: 810
#5

10 Dec 2020, 14:03

Laura Myles

The short answer is that I checked out seq() and the other pattern-generating egen functions, and it doesn't look like any of them supports varying block sizes. The long answer is ugly, but it works:

Code:

clear
set obs 160
* Get random numbers first.
set seed 314159
gen random = uniform()
egen arm = seq(), to(2)
gen block = .
gen block_size = ceil(random * 3) /* random number between one and three */
replace block_size = block_size * 2 /* random values 2, 4, 6 */
replace block_size = block_size + 6 /* random values 8, 10, 12 */
scalar current_block_size = block_size[1]
scalar block_number = 1
scalar block_itter = 1
forv index = 1(1)160 {
    if (block_itter > current_block_size) {
        * Current block is full: start a new one and read its size from this row.
        scalar current_block_size = block_size[`index']
        scalar block_itter = 1
        scalar block_number = block_number + 1
    }
    replace block = block_number in `index'
    scalar block_itter = block_itter + 1
}
bysort block (random): gen byte seqn = _n
bysort block (seqn): l seqn arm
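
For comparison, here is a minimal sketch of the same idea without an explicit loop. It is not taken from the posts above: it assumes Stata 14 or later (for runiformint()), and the names target, last, u and id are made up for illustration. It draws one size per block from 8, 10 and 12, keeps blocks until the target of 160 is reached or passed, and then shuffles within each block using a separate random draw, so the block size and the within-block order do not share a random number.

```stata
clear
set seed 314159
local target 160

* Step 1: one row per block; draw each block size from {8, 10, 12}.
* 160/8 = 20 blocks is the most that could ever be needed.
set obs 20
gen int block = _n
gen byte blk_size = 6 + 2 * runiformint(1, 3)
gen int last = sum(blk_size)          // running total of allocations
* Keep every block that starts before the target is reached.
keep if last - blk_size < `target'

* Step 2: one row per allocation; arms alternate, so each block is balanced.
expand blk_size
bysort block: gen byte arm = 1 + mod(_n, 2)

* Step 3: shuffle within block using a draw that is separate from the size draw.
gen double u = runiform()
bysort block (u): gen byte seqn = _n
sort block seqn
gen int id = _n
list id block blk_size seqn arm, sepby(block)
```

Because the final block is always completed, the list can run a little past the target, which is the same behaviour described for ralloc elsewhere in this thread.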

Comment

Laura Myles

Join Date: Jun 2018

Posts: 153
#6

11 Dec 2020, 03:06

Daniel Schaefer

Thanks for this, it works perfectly!

Can I double-check what the process in the {} brackets is doing... It is determining which block each observation falls under. It reads the first cell in block_size (e.g. 12) and allocates the first 12 observations to block 1. For block 2, it reads the 13th cell of block_size to determine the size of the block (e.g. 8) and allocates the next 8 observations to block 2 and so on - is this correct?

By the end of the process, block_size does not capture the actual size of each block so I added this line: egen blk_size = max(seqn), by(block)
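
An equivalent way to record the realised block size is to count the rows in each block with _N. The sketch below uses a made-up toy dataset (one block of 4 and one of 2) purely to show the pattern; the variable names follow the ones used in this thread.

```stata
clear
input byte block byte seqn
1 1
1 2
1 3
1 4
2 1
2 2
end

* Realised size of each block = number of rows sharing that block id.
bysort block: gen byte blk_size = _N
list, sepby(block)
```

On complete blocks this gives the same value as taking the maximum of seqn within each block.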
Laura Myles

Join Date: Jun 2018

Posts: 153
#7

11 Dec 2020, 03:12

Brad Anderson

Hi Brad, I know you would normally randomise extra participants anyway for backup so it may not be too much of an issue.

The main issue (for me) is that when I specify 340 participants, 346 are randomised. When I then tab Rx for the 340 I am interested in, I find that 171 were allocated to condition 1 and 169 to condition 2. I suppose it is not such a big deal, but it is somewhat annoying.

Do you have any thoughts on this?
Philip Ryan

Join Date: Jul 2016

Posts: 20
#8

11 Dec 2020, 05:09

Hello Laura

With your specification of the ralloc command it turns out that the final block size chosen is 8. The previous block (of size 12) ended at observation number 338. So, you now have 8 allocations to fill the final block, but only want the first 2. "Unfortunately" both those first two allocations in the block are to treatment A. And so, when (as Brad rightly suggests) you simply discard the final 6 allocations in the final block, you are left with a small surplus of treatment A. You are indeed correct that this "is not a big deal" in a trial of any reasonable size, the type of trial that ralloc and randomly permuted blocks are designed to address. What would be a big deal would be if one were so annoyed at the imbalance that one was tempted to deterministically alter the last 2 allocations to tidy them up. (I am sure that's not what you'd do!) The deletion of the final 6 observations does not change the probability model that ralloc sets up and which provides the basis of the statistical comparison between the two groups.

With 2 treatment groups, the only way to be sure that you will get exactly n/2 A and n/2 B is to (i) make sure your n is even, and (ii) specify all blocks be of size 2. Not very satisfactory. You have a small to moderate sized trial, and you have chosen 3 relatively large block sizes. I won't do the probability calculations, but it is almost certain that under those specifications you will get "extra" allocations and at least a reasonable chance that the allocations in the desired first part of the final block may be unbalanced. There are advantages in using large block sizes but there is a price, and on this occasion you have paid it. I think the price is very small, but I suppose for your size trial I would probably have chosen smaller block sizes. Of course you may have particular reasons for your block sizes.
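
As a rough guide to how often a truncated final block ends up unbalanced, here is a sketch of the calculation for the situation described above. It assumes the final block has size 2m (m allocations per arm), that every ordering within the block is equally likely, and that only its first two allocations are used. The symbol m is introduced here for illustration.

```latex
P(\text{first two allocations are to the same arm})
  = \frac{m-1}{2m-1},
\qquad
m = 4:\quad \frac{3}{7} \approx 0.43 .
```

Whatever arm the first allocation goes to, m-1 of the remaining 2m-1 positions in the block carry that same arm. So under these assumptions, a list that stops two allocations into a block of 8 would show a two-subject imbalance a little over 40% of the time.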

Finally, there may well be other approaches (within Stata) to using randomly permuted blocks in trial designs, ones that may better suit your needs. Just take care that, especially if your trial is publicly funded (NIH, MRC, NHMRC etc) or overseen by a regulatory authority, the provenance and performance of the code has been well demonstrated.

Good luck with your trial!

Phil
Laura Myles

Join Date: Jun 2018

Posts: 153
#9

11 Dec 2020, 06:23

Philip Ryan

Thank you for your reply. Would increasing the number of participants to be allocated help resolve this issue, for example increasing nsub to 600 or 800?
Daniel Schaefer

Join Date: Mar 2020

Posts: 810
#11

11 Dec 2020, 08:25

Laura Myles yes, it seems like you have correctly understood the logic of the forvalues loop. It may be wise to drop block_size after creating your blk_size variable, just so you or someone you are working with doesn't get these two variables confused.
Philip Ryan

Join Date: Jul 2016

Posts: 20
#12

11 Dec 2020, 14:17

Originally posted by Laura Myles

Philip Ryan

Thank you for your reply. Would increasing the number of participants to be allocated help resolve this issue, for example increasing nsub to 600 or 800?

Yes and no. Yes, in that with a larger number of subjects allocated, any imbalance in an incomplete final block will matter less. No, in that an incomplete final block is still highly likely. Even if all your blocks were of size 2, stopping the recruitment of subjects on an odd number would lead to a (trivial) imbalance. ralloc was designed to deliver at least the number of subjects desired.

In the real world of clinical trials, the actual number of subjects allocated may be less than, equal to, or greater than what was planned. Your randomisation processes should take all this into account, and the main way of doing this is to purposely over-allocate and, if you have strata, to do so for each stratum. Indeed, if one stratum is, say, "hospital" in a multicentre trial, then your randomisation schedule should allow for potential recruitment of additional hospitals and the subjects within them. Minor imbalances in incomplete block(s) are really not an issue.

I appreciate we may be talking at cross purposes given your particular situation and goals.
Laura Myles

Join Date: Jun 2018

Posts: 153
#13

14 Jan 2021, 04:28

Thanks all for your comments on this.

Philip Ryan in your last post do you suggest creating a longer list than the trial N to ensure I have enough allocations (over-allocate)?
Philip Ryan

Join Date: Jul 2016

Posts: 20
#14

14 Jan 2021, 15:57

Originally posted by Laura Myles

Thanks all for your comments on this.

Philip Ryan in your last post do you suggest creating a longer list than the trial N to ensure I have enough allocations (over-allocate)?

Yes, that is my suggestion.

I routinely allocate more strata (*) than I think I will use, and more subjects within each stratum than I think I will use. It makes the process of changing the recruitment strategy, should that be needed, much easier and more transparent, since every stratum (e.g. hospital) and every subject within a stratum has been allocated using the original randomization schedule, time and date stamped, and available to your trial oversight committee for scrutiny before the first subject is actually randomized. Contrast that with the potential for difficulties if you have to find extra allocations or strata once the trial has commenced.

There is no statistical or regulatory problem if you never use extra allocations.

(*) Of course it makes no sense to over-specify certain strata, for example sex.
Laura Myles

Join Date: Jun 2018

Posts: 153
#15

15 Jan 2021, 03:17

Philip Ryan Thanks again for such a comprehensive reply.