
  • A suggestion for an ordered-logistic longitudinal analysis on autism

    Dear all,

    I have a question concerning a longitudinal analysis.

    I have 62 patients (autistic children; variable "patient_id") attending 10 different hospitals (variable "hospital").
    Each patient was treated with a therapy over a varying number of weekly visits. Some children had 9 visits, others 12 visits, others 17 visits, and so on.

    My outcome (variable "outcome") is a behavioral score ranging from 1 to 6.

    I need a suggestion concerning my analysis.
    I have no preplanned hypotheses (e.g., that visit 15 will be better than visit 1, or something like that).

    I ran an ordered logistic model: meologit outcome i.Visit || hospital: || patient_id:, or

    However, I have the following questions:
    • Would you use visit 1 as the baseline, considering it can greatly differ from one patient to another? That is: meologit outcome i.Visit i.baseline if Visit>1 || hospital: || patient_id:, or
    • Would you use Visit_number as a continuous variable to capture the overall effect of time on the outcome? Given that I expect a non-linear effect of Visit_number, I would consider using spline functions to account for this. However, after using splines, how would you obtain and quantify the overall effect of Visit_number? (A rough sketch of what I have in mind follows this list.)
    • How would you plot the effect of time? Would you use margins and marginsplot to get the Odds Ratios for each Visit_number (referenced to Visit 1)?
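
    A rough sketch of what I have in mind, using the variable names from the dataex example below; hosp and pid would be encoded numeric versions of Hospital and Id, and the choice of 3 knots is purely illustrative:
    Code:
    * Restricted cubic spline of visit number (3 knots chosen only for illustration)
    encode Hospital, gen(hosp)
    encode Id, gen(pid)
    mkspline vis_sp = Visit_number, cubic nknots(3)     // creates vis_sp1 vis_sp2
    meologit Outcome vis_sp1 vis_sp2 || hosp: || pid:, or
    testparm vis_sp1 vis_sp2                            // joint (overall) effect of time

    * For plotting, the factor-variable version pairs with margins/marginsplot
    meologit Outcome i.Visit_number || hosp: || pid:, or
    margins Visit_number, predict(xb)                   // fixed-portion linear predictor
    marginsplot
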
    Thank you for everything.
    Gianfranco

    PS: if it can help, the data are organized as follows:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str4 Id byte Visit_number str9 Hospital byte Outcome
    "SC"  1 "Bergamo"  2
    "SC"  2 "Bergamo"  1
    "SC"  3 "Bergamo"  2
    "SC"  4 "Bergamo"  2
    "SC"  5 "Bergamo"  3
    "SC"  6 "Bergamo"  3
    "SC"  7 "Bergamo"  4
    "SC"  8 "Bergamo"  2
    "SC"  9 "Bergamo"  4
    "CR"  1 "Campania" 4
    "CR"  2 "Campania" 4
    "CR"  3 "Campania" 4
    "CR"  4 "Campania" 4
    "CR"  5 "Campania" 4
    "CR"  6 "Campania" 4
    "CR"  7 "Campania" 5
    "CR"  8 "Campania" 5
    "CR"  9 "Campania" 5
    "CR" 10 "Campania" 5
    "CR" 11 "Campania" 5
    "CR" 12 "Campania" 5
    "CR" 13 "Campania" 5
    "CR" 14 "Campania" 5
    "CR" 15 "Campania" 5
    "DC"  1 "Campania" 3
    "DC"  2 "Campania" 3
    "DC"  3 "Campania" 3
    "DC"  4 "Campania" 3
    "DC"  5 "Campania" 4
    "DC"  6 "Campania" 4
    end


  • #2
    As you have no prior hypotheses, why not do graphical exploration of the data first? Then you will be better positioned to find a good specification of the effect of visit number on your outcome.
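
    For example, a first pass at that exploration might look something like this (a sketch only, assuming the variable names in your dataex example and a numeric id created with encode):
    Code:
    * Spaghetti plot of each child's trajectory
    encode Id, gen(pid)
    xtset pid Visit_number
    xtline Outcome, overlay legend(off)

    * Median outcome at each visit number across children
    preserve
    collapse (median) Outcome, by(Visit_number)
    twoway connected Outcome Visit_number
    restore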

    Also, with only 10 hospitals and, on average, just 6.2 patients at each, when it comes time to do analytic modeling I would probably not use a random effect at the hospital level. I'd just add i.hospital as a fixed effect in the model--and maybe even remove it if the hospital effects turn out to be, for practical purposes, zero.
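
    In terms of your example's variable names, that would look roughly like this (hosp and pid being hypothetical encoded versions of Hospital and Id):
    Code:
    * Hospital as a fixed effect, patient as the only random intercept
    meologit Outcome i.Visit_number i.hosp || pid:, or
    testparm i.hosp        // gauge whether the hospital terms matter at all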

    Assuming you do end up running a model along the lines you describe, you can either analyze only visits 2 and onward while including the visit 1 outcome as a covariate, or you can analyze all visits and not have any covariate for visit 1. There is an algebraic equivalence between these two models, but they are statistically different, because including visit 1 as a covariate treats it as a constant for each person, measured without error.

    To choose between these approaches, I think the most important issue is whether the visit 1 outcome is measured in the same way as the other visits' outcomes or differently. For example, if this "baseline" assessment is done in a different environment, or operated by different personnel, or self-administered (parent-administered), then I prefer to include it as a covariate and forego the visit 1 observations. But if it is measured using the exact same procedure as at the other visits, then I prefer having visit 1 observations and no baseline value covariate.

    Also, if the outcome variable is truly ordinal and cannot plausibly be treated as interval or ratio, then it makes no sense to include it as a continuous covariate in the model, since neither multiplying its value by a coefficient nor adding that to something else is meaningful for ordinal-level variables. So if you come down on using it as a covariate and it is truly ordinal, you should introduce it as a discrete (i.) variable.
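
    For concreteness, the two specifications described above might be sketched like this (baseline is a hypothetical variable holding each child's visit 1 outcome; hosp and pid are encoded versions of Hospital and Id):
    Code:
    bysort pid (Visit_number): gen byte baseline = Outcome[1]

    * (a) visits 2 onward, with the visit 1 outcome as a discrete covariate
    meologit Outcome i.Visit_number i.baseline i.hosp if Visit_number > 1 || pid:, or

    * (b) all visits, no baseline covariate
    meologit Outcome i.Visit_number i.hosp || pid:, or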

    Finally, is it necessary to treat the outcome measure as just ordinal? Can you treat it as a continuous variable? Ordinal logistic regressions are complicated to interpret, and multi-level ordinal logistic regressions even more so. If you can make a plausible argument for modeling it as continuous, it will make your life simpler.
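
    If that argument can be made, the simpler route might be sketched as follows (same hypothetical hosp and pid as above):
    Code:
    * Outcome treated as (approximately) continuous: a linear mixed model
    mixed Outcome i.Visit_number i.hosp || pid:
    margins Visit_number
    marginsplot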



    • #3
      Thanks Clyde Schechter, as always, for your valuable support.

      Since you have no prior hypotheses, why not do a graphical exploration of the data first? Then you will be better positioned to find a good specification of the effect of visit number on your outcome.

      I will do it. Anyway, in general, these children improve rapidly and plateau after 7-8 visits.

      Also, with only 10 hospitals, and, on average just 6.2 patients at each, when it comes time to do analytic modeling, I would probably not use a random effect at the hospital level. I'd just add i.hospital as a fixed effect in the model--and maybe even remove it if the hospital effect turns out to be, for practical purposes, zero.

      I agree. I'll do it like this.

      Assuming you end up running a model along the lines you describe, you can either analyze only visits 2 and onward while including the visit 1 outcome as a covariate, or you can analyze all visits, and not have any covariate for visit 1. There is an algebraic equivalence between these two models, but they are statistically different, because including visit 1 as a covariate treats it as a constant for each person, measured without error.

      Interesting; I'll look for references and look into it further.

      To choose between these approaches, I think the most important issue is whether visit 1 outcome is measured in the same way as the other visits' outcomes or differently. For example, if this “baseline” assessment is done in a different environment, or operated by different personnel, or self-administered (parent-administered) then I prefer to include it as a covariate and forego the visit 1 observations. But if it is measured using the exact same procedure as it is at the other visits, then I prefer having visit 1 observation and no baseline value covariates.

      It is measured in exactly the same way. My idea, as you can see, was to use the baseline as a proxy for the severity of the child's condition, so as to obtain estimates adjusted for severity. I usually do this in randomized clinical trials, as an ANCOVA, to obtain more precise estimates.
      To be honest, I would suggest excluding patients whose baseline is beyond a certain value from eligibility, to have a more uniform sample. Do you think this is an adequate strategy? A kid with a high baseline cannot improve much, and is not representative of the typical patient in this study. Also, if I am not mistaken, this should increase the variance and make the estimates more unstable. Is that correct?
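
      In case it clarifies what I mean, a sketch of the baseline variable and of the kind of exclusion I have in mind (the cutoff of 4 is purely illustrative):
      Code:
      * visit 1 outcome as a severity proxy for each child
      bysort Id (Visit_number): gen byte baseline = Outcome[1]
      * hypothetical eligibility screen based on an illustrative cutoff
      drop if baseline > 4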

      Also, if the outcome variable is truly ordinal and cannot be plausibly treated as interval or ratio, then it makes no sense to include it as a continuous covariate in the model since neither multiplying its value by a coefficient nor adding that to something else is meaningful for ordinal level variables. So if you come down on using it as a covariate and it is truly ordinal, you should introduce it as a discrete (i.) variable.

      I agree. In fact, my intention in the post was: meologit outcome i.Visit i.baseline if Visit>1 || hospital: || patient_id:, or

      Finally, is it necessary to treat the outcome measure as just ordinal? Can you treat it as a continuous variable? Ordinal logistic regressions are complicated to interpret, and multi-level ordinal logistic regressions even more so. If you can make a plausible argument for modeling it as continuous, it will make your life simpler.

      Well, maybe I'm wrong, but in order to use the outcome as continuous, a reasonable equidistance between the different levels should be demonstrated, and I do not have this evidence. However, I recall reading that treating Likert levels as continuous is a good approximation in most cases. I'm not sure, but there should be a paper in the literature from 2002 or earlier.



      • #4
        It is measured in exactly the same way. My idea, as you can see, was to use the baseline as a proxy for the severity of the child's condition, so as to obtain estimates adjusted for severity. I usually do this in randomized clinical trials, as an ANCOVA, to obtain more precise estimates.
        To be honest, I would suggest excluding patients whose baseline is beyond a certain value from eligibility, to have a more uniform sample. Do you think this is an adequate strategy? A kid with a high baseline cannot improve much, and is not representative of the typical patient in this study. Also, if I am not mistaken, this should increase the variance and make the estimates more unstable. Is that correct?
        If you end up using a mixed model, by including the baseline observation (and not including a baseline outcome covariate) you do adjust the analysis for the baseline severity. And this method is preferable because it transparently reflects the fact that the baseline observation is made in the same way as all the others.

        As for excluding patients beyond a certain baseline, if, as you say, the more severe patients are not expected to respond to the treatment, or not very much, then it is quite reasonable to say that these patients are "not in universe" for the study and exclude them. All that is then required is that you explain that in any presentation of your methods, and point out that the results therefore also do not generalize to such patients.

        Well, maybe I'm wrong, but in order to use the outcome as continuous, a reasonable equidistance between the different levels should be demonstrated, and I do not have this evidence. However, I recall reading that treating Likert levels as continuous is a good approximation in most cases. I'm not sure, but there should be a paper in the literature from 2002 or earlier.
        You have the gist of the issue correct here. Likert scales are commonly treated as continuous because the assumption that the adjacent response options are at least close to equidistant in cognitive space is widely, though not universally, accepted. But this wide acceptance really applies only to the true Likert scale with responses ranging from "strongly disagree" to "strongly agree" on a 5 or 7 point scale. If your scale is Likert-like, e.g. "greatly improved," "somewhat improved," "no change," "somewhat deteriorated," "greatly deteriorated," the decision to treat it as continuous would be greeted with greater skepticism, absent an evidence-based justification.

