Bad control when the relationship is indirect

Eric Rusk

Join Date: Oct 2024

Posts: 2
#1


28 Oct 2024, 09:32

I’m working on a project where we’re trying to estimate the impact of country fragility status on project outcomes. We have data on the total grant funding allocated to each project, which we’ve been including as a control variable in our regressions. However, there’s a concern that grant size may be correlated with fragility status, because the formula used for allocating grants gives countries with certain characteristics (e.g., poorer countries, small island countries) larger grants. While fragility status is not factored into the formula, the formula takes into account characteristics that are strongly associated with fragility.

In addition to multicollinearity concerns, could grant size act as a "bad control" that introduces bias in this case?
Tags: bad control, bias, control variables, correlation, regression
Erik Ruzek

Join Date: Oct 2017

Posts: 398
#2

28 Oct 2024, 13:55

Hi Eric,

You mention two different issues. In the title you state that the relationship is indirect. By this, do you mean that country fragility status affects project outcomes in part because it also affects grant funding? If so, you should not control for grant funding if your inference target is the average causal effect of fragility status on project outcomes. This is sometimes termed overcontrol bias: following the logic of directed acyclic graphs, controlling for a mediator blocks the part of the causal effect that runs through it. See Carlos Cinelli, Andrew Forney, and Judea Pearl (2024), "A Crash Course in Good and Bad Controls".
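To make the overcontrol point concrete, here is a minimal simulated sketch in Python. Everything in it is an assumption made for illustration, not something taken from your data: the variable names, the effect sizes, and above all the premise that fragility raises grant size (the mediator scenario). Under those assumptions the total effect of fragility is -0.5 + 0.4 × 1.0 = -0.1, while the regression that conditions on grant size recovers only the direct effect of -0.5.

```python
import numpy as np

def ols(y, *cols):
    """Return OLS coefficients (intercept first) of y on the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all numbers are made up):
# fragility raises grant size, and both affect the outcome.
fragile = rng.binomial(1, 0.3, n).astype(float)
grant = 1.0 * fragile + rng.normal(size=n)            # mediator
outcome = -0.5 * fragile + 0.4 * grant + rng.normal(size=n)

# Without the mediator: estimates the total effect (-0.1 by construction).
total_effect = ols(outcome, fragile)[1]

# With the mediator: estimates only the direct effect (-0.5 by construction).
direct_effect = ols(outcome, fragile, grant)[1]

print(f"fragility coefficient, grant excluded: {total_effect:.3f}")
print(f"fragility coefficient, grant included: {direct_effect:.3f}")
```

Neither coefficient is "wrong" in itself; they answer different questions. If the target is the average causal effect of fragility, the model without the mediator is the one that matches it in this setup.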

The second issue is statistical: grant funding is highly correlated with fragility status (but is not a mediator). As noted in this Statalist thread, you should avoid adding control variables that are highly correlated with your treatment variable, because they increase the standard error of the coefficient for the treatment variable. This reduces your statistical power.
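A rough simulated sketch of the standard-error point, again in Python and again under assumptions I am inventing for illustration: a hypothetical country characteristic `z` drives both fragility and grant size, fragility does not cause grant size, and neither `z` nor grant size affects the outcome. In that setup both regressions are unbiased for the fragility coefficient, but the one that includes grant size has a larger standard error, by a factor of roughly the square root of the variance inflation factor.

```python
import numpy as np

def ols_with_se(y, *cols):
    """Return OLS coefficients and classical standard errors (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)
n = 2_000

# Hypothetical setup (all numbers are made up): z is a shared country
# characteristic that feeds into both fragility and the grant formula.
z = rng.normal(size=n)
fragile = (z + 0.5 * rng.normal(size=n) > 0.5).astype(float)
grant = z + 0.3 * rng.normal(size=n)
outcome = -0.3 * fragile + rng.normal(size=n)   # grant has no effect here

_, se_without = ols_with_se(outcome, fragile)
_, se_with = ols_with_se(outcome, fragile, grant)

# Variance inflation factor for fragility: 1 / (1 - R^2) from regressing
# fragility on the other regressor (grant).
b, _ = ols_with_se(fragile, grant)
fitted = b[0] + b[1] * grant
r2 = 1 - np.sum((fragile - fitted) ** 2) / np.sum((fragile - fragile.mean()) ** 2)
vif = 1 / (1 - r2)

se_ratio = se_with[1] / se_without[1]
print(f"SE of fragility coefficient, grant excluded: {se_without[1]:.4f}")
print(f"SE of fragility coefficient, grant included: {se_with[1]:.4f}")
print(f"SE ratio: {se_ratio:.2f}, sqrt(VIF): {np.sqrt(vif):.2f}")
```

The conclusion depends on the assumption that the shared characteristic does not itself affect outcomes; the sketch only illustrates the loss of precision from a correlated control, not a general rule about which variables to include.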

Unless I am misunderstanding your issues, the answer to both is to exclude grant funding from your regression model.
Eric Rusk

Join Date: Oct 2024

Posts: 2
#3

28 Oct 2024, 16:22

Hi Erik,

Thank you very much for your response!

To clarify my concern on the first issue: country fragility status itself does not directly impact grant funding, but it is correlated with factors that do. In this context, I’m wondering if overcontrol bias would still be an issue. Given that fragility status isn’t a direct determinant of grant funding, would controlling for grant funding still introduce the kind of bias you mentioned?
Erik Ruzek

Join Date: Oct 2017

Posts: 398
#4

28 Oct 2024, 21:13

Eric,

It would not induce overcontrol bias, but it would lead to the second issue mentioned: inflating the standard error of the coefficient for fragility status, thereby reducing your power to detect the true effect of interest. If you were trying to predict fragility status, then you would probably use grant funding as a predictor. But you are trying to estimate the effect of fragility status on project outcomes.