  • Sample size calculations in clinical trial

    Hi statalisters,

    I’m using data from a pilot RCT to determine the sample size needed to detect a desired effect size in a two-mean comparison. I want to know whether I should estimate the required sample size from the mean and standard deviation of the difference in differences of the variable of interest (the difference between treatment and control in the change from T1 to T2), from the mean and standard deviation of the baseline levels of the variable of interest for the total sample, or from some alternative.

    All analysis is done in Stata 13 (I believe the command “power” does not work in earlier versions of Stata). As an example, I use blood pressure data, pretending that the variable sex actually records treatment/control.

    Code:
    sysuse bpwide.dta, clear
    rename sex Treatment
    label define sex 0 "control" 1 "treatment", modify
    gen Dif= bp_after- bp_before
    
    reg Dif Treatment // I believe this shows that there is no sig. difference between the treatment and control groups
    
    // My question is whether I should be using the "Dif" variable or the "bp_before"/"bp_after" variables to estimate the sample size I need to detect a sig. difference of 2 units.
    
    sum Dif
    power twomeans -5.091667 -3.091667, sd(16.7136) power(.8)
    
    sum  bp_before
    power twomeans 156.45 154.45, sd( 11.38985 ) power(.8)
    
    sum  bp_after
    power twomeans 151.3583  149.3583, sd( 14.17762  ) power(.8)
    Intuitively, it makes sense to me to use the parameters of the “Dif” variable, since I’m interested in differences between the control and treatment groups. However, the mean and SD of the difference always lead to very high estimated sample sizes because of the large variance in treatment effects! I therefore feel inclined to use the sample means instead, but I am not sure whether it is more logical to use estimates from T1 or T2, or from the control or the experimental group.
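
    To make the comparison easier to reproduce, here is a minimal sketch of the same three calculations that reads the SD from the stored results of summarize instead of retyping it. It rests on an assumption on my part: as far as I understand, power twomeans depends only on the difference between the two means and on the common SD, so I have written the control mean as 0 and the experimental mean as 2 rather than using the observed means. The local macro name s is only for illustration.

    ```stata
    sysuse bpwide.dta, clear
    gen Dif = bp_after - bp_before

    * Loop over the three candidate outcomes from the post
    foreach v of varlist Dif bp_before bp_after {
        * SD from the full pilot sample (assumption: used as the common SD)
        quietly summarize `v'
        local s = r(sd)
        display as text "SD of `v' = " as result `s'
        * Target difference of 2 units, 80% power, default alpha of 0.05
        power twomeans 0 2, sd(`s') power(0.8)
    }
    ```

    If that assumption is right, the three calls differ only through the SD, which would explain why Dif (SD about 16.7) asks for so many more observations than bp_before (SD about 11.4).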

    Additionally, if I wanted to calculate the sample size needed for a subgroup analysis (such as agegrp = 46-59) to yield reasonable estimates of effects at set power levels, should I still use the mean and standard deviation from the total sample?

    Any help would be greatly appreciated, and I would be happy to provide additional information as required.