Hello everyone,
We are working with a score (ranging from 0 to 14, ich_total_T*) in a clinical trial. We have three groups (Control, Intervención Corta [short intervention] and Intervención Larga [long intervention]) and three time points (baseline, first follow-up and second follow-up). We want to estimate the change in the score over time, both between and within groups, and our problem concerns the change within groups.
1. We generate a variable which is the difference between the score in the second follow-up and the score in the baseline (gen ichdif_T2T0= ich_total_T2-ich_total_T0) and we obtain these crude mean differences:
tab EstadoT0, sum(ichdif_T2T0)
                   |      Summary of ichdif_T2T0
          EstadoT0 |        Mean   Std. Dev.       Freq.
-------------------+------------------------------------
           Control |  -.30054496   1.8938961         367
Intervención corta |     -.34625   1.8893883         400
Intervención larga |  -.06280488   1.7489841         328
-------------------+------------------------------------
             Total |   -.2460274   1.8522985       1,095
However, we want to obtain these differences from mixed models in order to adjust for some relevant variables in our study (random effects such as municipality and school). To do that:
2. We make a reshape with our variable of interest:
reshape long ich_total_T , i(id) j(Tiempo)
(note: j = 0 1 2)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1326   ->    3978
Number of variables                 690   ->     689
j variable (3 values)                     ->   Tiempo
xij variables:
     ich_total_T0 ich_total_T1 ich_total_T2   ->   ich_total_T
-----------------------------------------------------------------------------
3. Then, we obtain the difference between the score at the second follow-up and the score at baseline (adjusting for municipality, school and child ID):
mixed ich_total_T i.Tiempo if Tratamiento==1 & Tiempo!=1 & Seguimiento_ichtotalT0T2==1 || ZoneT0: || IDST0: || id:
Note: Seguimiento_ichtotalT0T2==1 selects only individuals with data at both baseline and the second follow-up, and Tratamiento==1 is the Control group.
Performing EM optimization:
Performing gradient-based optimization:
Iteration 0:   log likelihood = -1390.9927
Iteration 1:   log likelihood = -1390.9421
Iteration 2:   log likelihood = -1390.9409
Iteration 3:   log likelihood = -1390.9409
Computing standard errors:
Mixed-effects ML regression                     Number of obs     =        734
-------------------------------------------------------------
                |     No. of       Observations per Group
 Group Variable |     Groups    Minimum    Average    Maximum
----------------+--------------------------------------------
         ZoneT0 |          2        160      367.0        574
          IDST0 |          8         44       91.8        138
             id |        367          2        2.0          2
-------------------------------------------------------------
                                                Wald chi2(1)      =       9.27
Log likelihood = -1390.9409                     Prob > chi2       =     0.0023
---------------------------------------------------------------------------------
    ich_total_T |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
         Tiempo |
 2ºFollow up T2 |   -.300545   .0987258    -3.04   0.002     -.494044    -.107046
          _cons |   10.55242   .1540566    68.50   0.000     10.25048    10.85437
---------------------------------------------------------------------------------
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
ZoneT0: Identity             |
                  var(_cons) |   1.60e-12          .             .           .
-----------------------------+------------------------------------------------
IDST0: Identity              |
                  var(_cons) |   .1275942   .0836116      .0353219    .4609113
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   .9260181    .151003      .6726925    1.274742
-----------------------------+------------------------------------------------
               var(Residual) |   1.788534    .132032      1.547606     2.06697
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 68.24                 Prob > chi2 = 0.0000
Note: LR test is conservative and provided only for reference.
As you can see, the mean difference from this model and the crude mean difference are the same (-0.300545). The same happens in the Intervención Corta and Intervención Larga groups. This makes us think that the mixed model is not adjusting for our random effects. What do you think the problem is?
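To check whether this equality could be expected rather than being a sign that the adjustment failed, here is a minimal sketch on simulated data. With exactly two observations per child and random intercepts only, each intercept is the same at both time points and cancels out of the child's change, so the Tiempo coefficient should coincide with the crude mean of the paired differences. The variable names (school, id, y, d) and all parameter values below are hypothetical and are not taken from the trial data.

```stata
* Sketch on simulated data: hypothetical names and values, not the trial data
clear
set seed 12345
* 8 hypothetical schools, each with its own random intercept
set obs 8
gen school = _n
gen u_school = rnormal(0, 0.4)
* 40 children per school, each with a child-level random intercept
expand 40
gen id = _n
gen u_id = rnormal(0, 1)
* two rows per child: baseline (Tiempo = 0) and second follow-up (Tiempo = 2)
expand 2
bysort id: gen Tiempo = cond(_n == 1, 0, 2)
gen y = 10.5 - 0.3*(Tiempo == 2) + u_school + u_id + rnormal(0, 1.3)

* crude mean of the paired differences (T2 minus baseline), one value per child
bysort id (Tiempo): gen d = y[2] - y[1] if _n == 1
summarize d, meanonly
scalar crude = r(mean)

* random-intercept model on the long data
mixed y i.Tiempo || school: || id:
display "mixed: " _b[2.Tiempo] "   crude: " crude
```

If the two displayed numbers agree here as well, that would suggest the match in our output is a property of a balanced, random-intercept-only design rather than a failure of the command.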
4. Moreover, to see whether the random effects have any effect in our model, we also studied the between-group differences (without the reshape) with: mixed ichdif_T2T0 i.Tratamiento || ZoneT0: || IDST0:
We then ran margins after this model to obtain the mean differences within groups, with this result:
margins i.Tratamiento
Adjusted predictions                            Number of obs     =      1,095
Expression   : Linear prediction, fixed portion, predict()
-------------------------------------------------------------------------------------
                    |            Delta-method
                    |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
        Tratamiento |
            Control |  -.5110921   .3596001    -1.42   0.155    -1.215895    .1937111
 Short Intervention |  -.4115677   .3562268    -1.16   0.248     -1.10976     .286624
  Long Intervention |  -.2760756   .3608734    -0.77   0.444    -.9833744    .4312233
-------------------------------------------------------------------------------------
As you can see, these results are quite different from the crude differences; we think this is because the adjustment for the random effects is actually applied in this model. However, we expected the results from margins in this model and those from the reshaped model to be similar. Is this approach correct? What could the problem be?
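For reference, the long-format analysis could also be written as a single model for all three groups, with a Tiempo by Tratamiento interaction, instead of one model per group. The sketch below uses simulated data only; the names school, id and y, the per-child assignment of Tratamiento and every parameter value are assumptions made for illustration and do not describe our trial. With random intercepts only and complete pairs, we would expect the within-group changes from this model to reproduce the crude group means of the differences too, because the cluster intercepts cancel in each child's change.

```stata
* Sketch on simulated data: hypothetical names, design and values
clear
set seed 2024
set obs 12
gen school = _n
gen u_school = rnormal(0, 0.4)
expand 30
gen id = _n
* hypothetical assignment per child: 1 = Control, 2 = short, 3 = long
gen Tratamiento = mod(id, 3) + 1
gen u_id = rnormal(0, 1)
* two rows per child: baseline (Tiempo = 0) and second follow-up (Tiempo = 2)
expand 2
bysort id: gen Tiempo = cond(_n == 1, 0, 2)
gen y = 10.5 - 0.3*(Tiempo == 2) + 0.2*(Tiempo == 2)*(Tratamiento == 3) + u_school + u_id + rnormal(0, 1.3)

* one model for all groups, random intercepts for school and child
mixed y i.Tiempo##i.Tratamiento || school: || id:

* change from baseline to T2 within each group
margins Tratamiento, dydx(Tiempo)

* the same quantity for group 3, written out as a linear combination
lincom 2.Tiempo + 2.Tiempo#3.Tratamiento
```

The margins call reports the discrete change in the fixed-portion prediction between Tiempo = 0 and Tiempo = 2 separately for each group, and lincom gives the same within-group change for one group directly from the coefficients.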
Important remark: this problem does not occur only with these data. We fitted similar models to other data and the problem remains, so we think it may be a methodological or technical problem rather than something specific to this dataset. We are using Stata 15.