How to compare two different rates of change in Stata?

Rachel Jones

Join Date: Oct 2015

Posts: 37
#1

12 Apr 2024, 10:23

Hello all!

I want to compare task performance over time between my treatment and control groups. Is there a way to analyze the rate of change between the two groups?

I have 20 datapoints as the participants complete the task across 20 trials. I predict that the treatment group will have a faster rate of change than the control. How can I analyze the improvement (or any change) of the task performance across those 20 trials? Is there a way in Stata to measure rate of change across datapoints? I couldn't find one.

And then once I get that rate of change variable, could I just use it as the DV in a regression? Or would I need a different analysis entirely?

Thank you so much!!
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

12 Apr 2024, 10:28

Your description of your data confuses me. If you have 20 data points as the participants complete the task across 20 trials, it seems you must have only one participant. So how are there any groups at all, let alone treatment and control groups to contrast? And what variables are available? I think your best bet is not to try describing the data better, because descriptions of data sets are almost always inadequate no matter how hard one tries, but to show example data. The helpful way to do that is with the -dataex- command. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

If it is not obvious from the variable names in your data set, be sure to also explain how the data identifies the two groups, the trial number, and the individual participant. Be sure the example you choose to show contains data from participants in both groups and, if not all 20 trials, at least, say, a consistent half dozen of them.
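For readers who have not used -dataex- before, here is a minimal sketch of the workflow described above. The toy data and the variable names (participantid, treatment, trial1-trial3) are assumptions made for illustration; substitute the names in your own data set.

```stata
* Toy wide-layout data; the variable names are illustrative assumptions
clear
input byte(participantid treatment trial1 trial2 trial3)
1 1 5 6 7
2 0 5 5 5
end

* -dataex- ships with recent Stata; on older installations fetch it from SSC
capture which dataex
if _rc ssc install dataex

* Print the listed variables in a form that can be pasted into a forum post
dataex participantid treatment trial1-trial3
```

The output can be copied from the Results window directly into a post, and anyone reading it can recreate the example by running it.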

Rachel Jones

Join Date: Oct 2015
Posts: 37

12 Apr 2024, 10:49

I'm really sorry about that! I have about 1000 participants across 2 groups, so about 500 in each group. Each of those participants completed 20 trials of the task. I will try to figure out the dataex command, but I have an old version of Stata. The data would look like this. I show only 10 trials to keep it simpler. How would I analyze how performance changed across trials 1-20? And how could I then compare that rate of change between treatment and control (1 vs. 0)?

Thank you!

Participant ID	Treatment	Trial 1	Trial 2	Trial 3	Trial 4	Trial 5	Trial 6	Trial 7	Trial 8	Trial 9	Trial 10
1	1	5	6	7	8	9	8	9	9	9	9
2	1	5	6	7	8	9	9	9	9	9	9
3	1	4	5	6	6	7	8	9	9	9	9
4	1	5	6	5	5	9	9	8	8	8	8
5	1	5	6	5	6	5	6	7	8	9	9
6	0	5	5	5	5	5	5	5	5	5	5
7	0	5	5	5	5	5	5	5	5	5	6
8	0	5	5	5	4	5	5	5	5	6	6
9	0	5	5	5	4	5	5	5	5	5	5
10	0	5	4	4	4	5	5	5	5	5	5


Clyde Schechter

Join Date: Apr 2014
Posts: 29796

12 Apr 2024, 11:31

Here are two approaches you can use.

I have included a graphical exploration of the data here. In this example, you are fortunate that many of the participants' performance trajectories are linear, or nearly so. It makes it feasible to use these relatively straightforward regression approaches. If your trajectories were less "well behaved" it could get much more complicated.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(participantid treatment trial1 trial2 trial3 trial4 trial5 trial6 trial7 trial8 trial9 trial10)
 1 1 5 6 7 8 9 8 9 9 9 9
 2 1 5 6 7 8 9 9 9 9 9 9
 3 1 4 5 6 6 7 8 9 9 9 9
 4 1 5 6 5 5 9 9 8 8 8 8
 5 1 5 6 5 6 5 6 7 8 9 9
 6 0 5 5 5 5 5 5 5 5 5 5
 7 0 5 5 5 5 5 5 5 5 5 6
 8 0 5 5 5 4 5 5 5 5 6 6
 9 0 5 5 5 4 5 5 5 5 5 5
10 0 5 4 4 4 5 5 5 5 5 5
end

rename trial* performance*
reshape long performance, i(participantid) j(trial_num)

xtset participantid trial_num
xtline performance, overlay

//    APPROACH 1: FIT A REGRESSION LINE FOR EACH PARTICIPANT
//    THEN REGRESS THE REGRESSION COEFFICIENT ON TREATMENT INDICATOR
rangestat (reg) performance trial_num, by(participantid) interval(trial_num . .)
//    -rangestat- REPEATS EACH SLOPE ON EVERY ROW, SO USE ONE ROW PER PARTICIPANT
egen byte one_per_id = tag(participantid)
regress b_trial_num i.treatment if one_per_id // GROUP DIFFERENCE FOUND IN 1.treatment OUTPUT LINE
margins treatment // SHOWS MEAN REGRESSION SLOPE IN EACH GROUP

//    APPROACH 2: FIT AN OVERALL REGRESSION, STRATIFIED BY TREATMENT INDICATOR
//    WITH PARTICIPANT LEVEL RANDOM SLOPE FOR TRIAL NUM
mixed performance c.trial_num##i.treatment || participantid: c.trial_num, ///
    cov(exchangeable) // GROUP DIFFERENCE FOUND IN 1.treatment#c.trial_num OUTPUT LINE
margins treatment, dydx(trial_num) // SHOWS MEAN REGRESSION SLOPE IN EACH GROUP

-rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer. It is available from SSC.
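If installing -rangestat- is not an option, Approach 1 can probably also be done with official Stata's -statsby-. The following is only a sketch on simulated data: the data-generating values are made up, and the variable names simply mirror those used above. -statsby- with the -clear- option leaves one row per participant, so the second-stage regression counts each participant once.

```stata
* Simulated long-layout data: 10 participants, 10 trials each (values are made up)
clear
set seed 12345
set obs 10
gen participantid = _n
gen byte treatment = _n <= 5
expand 10
bysort participantid: gen trial_num = _n
gen performance = 5 + 0.5*treatment*trial_num + rnormal(0, 0.5)

* One regression per participant; keep only the slope on trial_num
statsby slope = _b[trial_num], by(participantid treatment) clear: ///
    regress performance trial_num

* Second stage: compare mean slopes between groups
regress slope i.treatment
margins treatment
```

As with the -rangestat- version, the second stage treats each estimated slope as if it were known exactly.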


Erik Ruzek

Join Date: Oct 2017

Posts: 398
#5

12 Apr 2024, 17:12

Clyde Schechter, can you say why you chose an exchangeable vs. an unstructured covariance structure?

Also, to the OP, Clyde gave you the marginal effect version of the treatment difference. If you want to graph the average predicted trajectories for each group, you can use a different margins call:

Code:

mixed performance c.trial_num##i.treatment || participantid: c.trial_num, cov(exchangeable)
margins treatment, at(trial_num = (1(1)9))
marginsplot
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#6

12 Apr 2024, 17:32

I chose an exchangeable covariance because I understand this to be repeated measures data.
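For readers following the covariance discussion, the model in Approach 2 can be written out roughly as follows. The notation is introduced here for illustration and is not from the thread: $y_{ij}$ is performance of participant $i$ on trial $j$, $t_{ij}$ is the trial number, and $T_i$ is the treatment indicator.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 T_i + \beta_3\, t_{ij} T_i
         + u_{0i} + u_{1i}\, t_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim N(\mathbf{0}, \Sigma)

% cov(exchangeable): one common variance, one common covariance
\Sigma_{\text{exch}} =
\begin{pmatrix} \sigma^2 & \sigma_c \\ \sigma_c & \sigma^2 \end{pmatrix}

% cov(unstructured): separate variances for intercept and slope
\Sigma_{\text{unstr}} =
\begin{pmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{pmatrix}
```

In this notation $\beta_3$ is the group difference in slopes, and the exchangeable structure constrains the random intercept and random slope to share a single variance, whereas the unstructured form estimates them separately.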
Erik Ruzek

Join Date: Oct 2017

Posts: 398
#7

12 Apr 2024, 17:43

I have the same impression. I ask because in my field (education and psychology), people tend to use an unstructured covariance structure. I'm wondering if choosing to use an exchangeable covariance is more common in your field.
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#8

12 Apr 2024, 18:26

Yes, I think most epidemiologists working with repeated-measures data would model it with an exchangeable covariance.
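One way to see whether the choice matters in a given data set is to fit the model both ways and compare. The sketch below does that on simulated data; the data-generating values are invented for illustration, and the likelihood-ratio test is offered as one possible comparison, not as something recommended in this thread.

```stata
* Simulated long-layout data with a random intercept and slope (values are made up)
clear
set seed 4321
set obs 100
gen participantid = _n
gen byte treatment = _n <= 50
gen u0 = rnormal(0, 1)
gen u1 = rnormal(0, 0.2)
expand 10
bysort participantid: gen trial_num = _n
gen performance = 5 + u0 + (0.1 + 0.4*treatment + u1)*trial_num + rnormal(0, 0.5)

* Same fixed effects, two random-effects covariance structures
mixed performance c.trial_num##i.treatment || participantid: trial_num, cov(exchangeable)
estimates store exch
mixed performance c.trial_num##i.treatment || participantid: trial_num, cov(unstructured)
estimates store unstr

* Exchangeable is nested in unstructured, so a likelihood-ratio test applies
lrtest unstr exch
```

The fixed-effect estimate of interest is on the 1.treatment#c.trial_num line in both fits, so it is easy to check how much it moves between the two structures.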
Joseph Coveney

Join Date: Apr 2014

Posts: 4353
#9

13 Apr 2024, 06:30

Originally posted by Rachel Jones

And then once I get that rate of change variable, could I just use it as the DV in a regression? Or would I need a different analysis entirely?

This implies fitting a regression model for each participant. Instead of using the slope coefficients as a response variable in a further regression model, where they would be taken as known, you might want to consider a path analysis, say, using sem, which I believe would handle the slopes better as estimated and not given.

Also, what's the maximum possible score on these trials? If it's nine or ten, then you seem to be encountering a ceiling effect even before your participants are halfway through their twenty trials. (You also seem to have a floor effect of about four or five.)

With floor and ceiling effects, you're liable to have to deal with heteroscedasticity and to end up with predictions outside the range of possible scores if you include the trial-sequence predictor as continuous.

With a thousand participants, you might be able to get away with it, but in the presence of ceiling effects I'm not sure that a linear model with sequence-as-continuous will capture differences between treatment conditions in the rates of change as sensitively as, say, an ordered-categorical regression model.
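As a rough illustration of the ordered-categorical idea, one candidate is a mixed-effects ordered logit fitted with -meologit-. That specific command is an assumption on the editor's part (it is not named above), and the data below are simulated with scores clipped to the 4-9 range to mimic floor and ceiling effects.

```stata
* Simulated scores bounded between 4 and 9 (values are made up)
clear
set seed 2024
set obs 40
gen participantid = _n
gen byte treatment = _n <= 20
gen u = rnormal(0, 0.5)
expand 10
bysort participantid: gen trial_num = _n
gen latent = 5 + 0.1*trial_num + 0.4*treatment*trial_num + u + rnormal(0, 0.7)
gen byte performance = min(9, max(4, round(latent)))

* Look for piling-up at the lowest and highest scores
tabulate performance treatment

* Ordered logit with a participant-level random intercept
meologit performance c.trial_num##i.treatment || participantid:
```

Here the 1.treatment#c.trial_num coefficient is on the log-odds scale, so it describes a difference in the rate at which the odds of a higher score change per trial, not a difference in raw-score slopes.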