Hello, I am working on a project where we will retrospectively look at a continuous lab measurement (say, Hb) at baseline and over time for three distinct groups of patients on three different drugs (encoded by the categorical variable drug, which contains the categories 1, 2, and 3). Time during the study period will be encoded by the continuous variable time. Measurements were taken at non-uniform timepoints, so it's important that time is modelled. Patients are assigned the unique identifier id. We want to compare changes in Hb over time between drugs. There are some covariates, but let's just include age here.
I understand that multilevel mixed-effects models can be used. However, I am completely new to this type of analysis, so I have a lot of questions about how the model should be specified, which variables should be given random slopes, which covariance structure should be used, etc. Please pardon me if I have made any rookie mistakes below (as I am a rookie). Also, the data are yet to be collected, so I can only speak hypothetically here and cannot provide any actual Stata output. My questions are as follows:
1. In this case, would it be appropriate to specify random slopes on the drug variable?
2. If so, is there any rule for choosing the covariance structure? I noted from some reading that this is largely determined by how the data were sampled and by the nature of the data, but I would appreciate some pointers as to what to use. From what I have read, though, covariance(unstructured) appears to be the most general and flexible option?
3. If the above were true, would it be correct to use the following command to model changes in Hb over time and compare them between drugs? To model the effect of time, does the time variable need to be specified within the random-effects equation too?
Code:
mixed Hb i.drug time i.drug##c.time age || id: drug, covariance(unstructured)
4. For reporting purposes, it would be of interest to report the p value that compares the change in Hb between drugs, as well as an estimate of the rate of change in Hb for each drug. As far as I understand, the p values for the interaction terms from i.drug##c.time should correspond to the desired p value for each drug compared to 1.drug. However, I am at a loss as to how the rate of change in Hb for each drug can be estimated. This also seems to rely on the assumption that the trend is linear, so I do have my doubts about this methodology, although I took this approach from a somewhat similar paper I read previously (please kindly see attached).
5. I understand that what I am trying to do is somewhat like a DID analysis with 3 instead of 2 treatment groups. Is DID (didregress or xtdidregress) capable of handling >2 treatment groups, and does it model random intercepts for clusters (in this case, each patient)? These are more or less the only reasons I went for a multilevel mixed-effects model, and if DID turned out to be simpler but still accurate I would be more than happy to use it (obviously).
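To make questions 3 and 4 a bit more concrete, here is how I currently read the fixed part of the command in question 3, written out as an equation. This is only a sketch of my own understanding: the symbols (the betas, the indicators D2 and D3 for drugs 2 and 3, u_i, and epsilon_ij) are my own notation and not Stata's, and I have written only a random intercept per patient because the random-slope part is exactly what I am unsure about.

```latex
% Assumed notation (mine, not Stata's): i indexes patients (id), j indexes
% measurement occasions; D_{2i} and D_{3i} are 0/1 indicators for drug 2 and 3.
\begin{align*}
\mathrm{Hb}_{ij} &= \beta_0 + \beta_2 D_{2i} + \beta_3 D_{3i}
                  + \beta_t\,\mathrm{time}_{ij}
                  + \beta_{2t}\,D_{2i}\,\mathrm{time}_{ij}
                  + \beta_{3t}\,D_{3i}\,\mathrm{time}_{ij} \\
                 &\quad + \beta_a\,\mathrm{age}_{ij} + u_i + \varepsilon_{ij}
\end{align*}

% Implied rate of change in Hb per unit of time, by drug:
\begin{align*}
\text{drug 1:}\quad & \beta_t \\
\text{drug 2:}\quad & \beta_t + \beta_{2t} \\
\text{drug 3:}\quad & \beta_t + \beta_{3t}
\end{align*}
```

If this reading is right, the p values on the interaction terms test whether beta_2t and beta_3t are zero, i.e. whether the slope for drug 2 or drug 3 differs from that of drug 1, and the rate of change for each drug would be a sum of coefficients rather than a single reported coefficient. Each of these rates is constant over time, which is the linearity assumption I am doubtful about in question 4.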
Thank you so much in advance to anyone that responds!