How to adjust for matched factors using Conditional Logistic Regression

Felicity Porter

Join Date: Apr 2017

Posts: 9
#1

How to adjust for matched factors using Conditional Logistic Regression

27 Oct 2020, 05:31

Hello. I hope you can help.

I am trying to build a conditional logistic regression model to analyse a 1:4 matched nested case-control study. Cases and controls were matched on 5-year age band and sex, and all were active in the same GP practice during the same period.

I understand that due to the risk of bias I need to adjust for the matched factors in the model.

The problem I am encountering is that sex is the same for a case and its 4 controls (because it is a matched factor), so when I try to include it in the model it is "omitted because of no within-group variance." I understand that this happens because the case and each of its four controls all have the same value, so the set is not discordant and does not contribute to the analysis. How then would I adjust for sex in my model without the variable being dropped?

I am also unsure what to do about the age bands. I have the exact date of birth for all cases and controls, but they were matched within 5-year bands. Should the age variable that I adjust for in the model use the actual dates of birth for all cases and controls, even though I matched within 5 years? Or should I code all controls as having the same date of birth as the case they are matched to? In that case I would surely run into the same problem as earlier, as all my groups would be concordant and therefore dropped from the analysis.

Many thanks!

Last edited by Felicity Porter; 27 Oct 2020, 05:33.
Tags: case-control, conditional, logistic, matching
Tom Scott

Join Date: Apr 2019

Posts: 266
#2

27 Oct 2020, 08:04

.

Last edited by Tom Scott; 27 Oct 2020, 08:45.
Comment
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#3

27 Oct 2020, 08:13

You don't show us your code, so we can't say for sure what is going on; however, note that the group() option is required, with the name of the variable identifying the matched sets placed between the parentheses; see

Code:

help clogit
Comment
Felicity Porter

Join Date: Apr 2017

Posts: 9
#4

27 Oct 2020, 08:55

Rich Goldstein Tom Scott

Thank you for your replies and help.

I see what you mean about sex being dropped as a variable.

So does this mean that if I have matched on sex and date of birth (individual matching, 1:4), I cannot adjust for them in my model because they don't change over time?

How then would I build a logistic regression model which can both account for 1:4 matched data and adjust for those matched variables?

To explain further, I am interested in building a causal model for a 1:4 matched nested case-control study to see whether a binary exposure (childhood event) is related to a binary outcome (case). My understanding was that I would have to use the clogit command (conditional logistic regression) to account for the matching, and then add the matched variables as covariates in the model to avoid bias caused by the matching.

My code was:

. clogit outcomevar exposurevar sex DOB , group(matchid) or base

note: sex omitted because of no within-group variance.

n.b. matchid is the case/control variable identifying the 1:4 matching

It did not drop the DOB (date of birth) variable, and I assumed this was because the matching was done within a 5-year age band, so within each group (1 case and 4 controls) there can be up to 5 different values for date of birth (although all the controls' DOBs will be within 5 years of the case's).

But if, as you say, you cannot use conditional logistic regression to adjust for stable characteristics, then why was date of birth, which IS a permanently stable characteristic of an individual, not dropped from the model?

So I am rather confused.

Many thanks

Last edited by Felicity Porter; 27 Oct 2020, 09:03.
Comment
Felicity Porter

Join Date: Apr 2017

Posts: 9
#5

27 Oct 2020, 09:15

Rich Goldstein Tom Scott

FYI

This article by Neil Pearce, "Analysis of matched case-control studies", BMJ 2016;352:i969 (https://www.bmj.com/content/352/bmj.i969), is the reason that I am trying to control for the matched factors in the analysis.

"If matching is carried out on a particular factor such as age in a case-control study, then controlling for it in the analysis must be considered. This control should involve just as much precision as was used in the original matching (eg, if exact age in years was used in the matching, then exact age in years should be controlled for in the analysis), although in practice such rigorous precision may not always be required (eg, five year age groups may suffice to control confounding by age, even if age matching was done more precisely than this). In some circumstances, this control may make no difference to the main exposure effect estimate; eg, if the matching factor is unrelated to exposure. However, if there is an association between the matching factor and the exposure, then matching will introduce confounding that needs to be controlled for in the analysis."

Many thanks
Comment
Tom Scott

Join Date: Apr 2019

Posts: 266
#6

27 Oct 2020, 09:40

Felicity Porter I deleted my post because I may have been wrong about what I said. I will note that the article you cited describes controlling for matched characteristics in an unconditional logistic regression (see help logit) rather than in a conditional logistic regression (see help clogit), which already accounts for the matched characteristics by grouping observations into their matched sets. I can't speak to the validity of the article's argument, but you could try both methods: first, clogit with group() identifying the matched sets and without the matching characteristics in the model; second, logit with the matching characteristics in the model and without accounting for the sets.
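
A minimal sketch of those two fits, assuming the variable names from the command in #4 (outcomevar, exposurevar, sex, DOB, matchid) and that sex and DOB are stored as numeric variables. The stored-estimate names are arbitrary labels, and this only lays the two results side by side; it is not a recommendation between the models:

```stata
* Sketch only: variable names are taken from the command posted in #4.

* 1) Conditional model: matching handled by group(), matched factors left out
clogit outcomevar exposurevar, group(matchid) or
estimates store conditional

* 2) Unconditional model: matched factors entered as covariates, sets ignored
logit outcomevar exposurevar sex DOB, or
estimates store unconditional

* Exposure odds ratios (and their standard errors) side by side
estimates table conditional unconditional, keep(exposurevar) eform se
```

A noticeable gap between the two odds ratios for exposurevar may suggest that how the matching is handled matters for the estimate.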
Comment
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#7

27 Oct 2020, 09:48

Thanks for the link to this article, which is useful.

To my understanding, and following Pearce: "So when is a pair matched analysis required? The answer is, when the matching was genuinely at (or close to) the individual level." (Pearce, p. 3). You matched individuals 1:4, right? So this would apply. -clogit- is a matched analysis, with group() identifying the matched sets.

And, if you are doing a matched analysis, I don't know of any way to estimate the effect of the matching variables, at least not with -clogit-. (Perhaps others will know of other possibilities.) The effect estimate in a matched analysis is, at least conceptually, sort of an average of the within-matched-set effects. But within a matched set, there is no variation in your matching variables. The old dictum "a constant can't explain a variable" thus applies here. That's why Stata objected to using sex as a predictor. Now, why didn't it object to DOB as a predictor? 1) It encountered sex before DOB in your command, and stopped before even looking at potential problems with DOB; 2) There is *some* variation in DOB within a matched set defined by a shared age band. However, in matching on age band, you likely have grossly reduced your ability to estimate the effect of DOB, as the variation of DOB within age band, while not 0, is relatively small. If there's not much variation in a predictor, estimates of its effect are inefficient and likely biased, as it's a classic case of a restricted range of a predictor. And, are you sure you're expecting a linear effect of DOB within age band? That's what your model would imply.
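
The "a constant can't explain a variable" point can be written out. This is a sketch in notation that is not from the thread: in matched set $i$ the case is member $j=0$ and the controls are $j=1,\dots,4$; $x_{ij}$ is the exposure, $z_i$ is a covariate shared by everyone in the set (such as sex), and $d_{ij}$ is a covariate that can differ within the set (such as DOB). The contribution of set $i$ to the conditional likelihood is then:

```latex
\begin{align*}
L_i(\beta,\gamma,\delta)
  &= \frac{\exp(\beta x_{i0} + \gamma z_i + \delta d_{i0})}
          {\sum_{j=0}^{4} \exp(\beta x_{ij} + \gamma z_i + \delta d_{ij})} \\
  &= \frac{\exp(\gamma z_i)\,\exp(\beta x_{i0} + \delta d_{i0})}
          {\exp(\gamma z_i)\,\sum_{j=0}^{4} \exp(\beta x_{ij} + \delta d_{ij})} \\
  &= \frac{\exp(\beta x_{i0} + \delta d_{i0})}
          {\sum_{j=0}^{4} \exp(\beta x_{ij} + \delta d_{ij})}.
\end{align*}
```

Because $\gamma$ cancels, the likelihood carries no information about it, whereas $\delta$ survives only through the within-set differences $d_{ij} - d_{i0}$.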

My view here would be that, if you wanted to estimate the effect of DOB, matching on it via age band was not a helpful choice.
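
One way to see how little within-set variation there is. This is a sketch, again assuming the variable names from #4 and that sex and DOB are numeric (DOB as a Stata daily date); the new variable names (sd_sex, sd_dob, one_per_set) are made up for illustration:

```stata
* Sketch: how much does each covariate vary inside a matched set?
* Assumes matchid identifies the 1:4 sets and sex, DOB are numeric.
bysort matchid: egen double sd_sex = sd(sex)
bysort matchid: egen double sd_dob = sd(DOB)

* Flag one row per matched set so each set is counted once
egen byte one_per_set = tag(matchid)

* sd_sex should be 0 in every set (hence the omission note from clogit);
* sd_dob should be positive but small next to the overall spread of DOB
summarize sd_sex sd_dob if one_per_set
summarize DOB
```

Comparing the typical sd_dob with the overall standard deviation of DOB gives a rough sense of how restricted the range is within sets.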

Last edited by Mike Lacy; 27 Oct 2020, 09:51.
Comment
Felicity Porter

Join Date: Apr 2017

Posts: 9
#8

27 Oct 2020, 11:21

Thanks very much Mike Lacy

Yes, I don't see any way to adjust for matching factors with a conditional logistic regression model.

I think the only option then is to run the analysis using unconditional logistic regression, adjusting for the matching factors. Then, as a sensitivity analysis, I could use a conditional model to show that the results are consistent (which they are).

But then is my DOB variable okay for my unconditional model? The restricted range problem shouldn't matter in an unmatched analysis, is that right?

So the code would be:

logit outcomevar exposurevar sex DOB

And yes, I matched individuals 1:4 as you suggested.

Many thanks
Comment
Tom Scott

Join Date: Apr 2019

Posts: 266
#9

27 Oct 2020, 11:25

Yes, the range wouldn't be restricted in an unconditional logistic regression, because you wouldn't be modeling the regression within age bands.
Comment
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#10

27 Oct 2020, 11:45

My thinking would be that, unless you adjust for the stratified sampling on the response variable, the standard errors from an unconditional analysis would be wrong. Also, my intuition is that the restricted range thing will still be an issue here. You might have an "unconditionally unrestricted range" (to coin a phrase), but the fact that the range of DOB is linked to your matched clusters seems like a potential problem. I would strongly suggest looking for a good model in the literature of using an unconditional analysis with individual matching, given that many, including apparently Pearce, would argue against that. There are some very sharp and helpful biostat/epi folk on Statalist, and perhaps they can point to some examples or directly advise you here.
Comment
Felicity Porter

Join Date: Apr 2017

Posts: 9
#11

28 Oct 2020, 08:07

Thanks both. I wonder how to reach those very sharp and helpful biostat/epi folks on here?!
Comment
