demeaned binary variable

Adrienne Wold

Join Date: Dec 2016

Posts: 139
#1

15 Feb 2022, 08:46

Hi,

I am working with the following model:

Code:

reg y gender##race##imp

where gender and race are binary and imp is a continuous variable measuring strength of opinion on an issue.

imp (measured on a 5-point scale) is demeaned, and it has been recommended that I demean race (white and non-white) and gender as well, but I'm not sure how to interpret the individual and interaction effects with demeaned binary variables. For example, would the coefficient on gender tell us the effect on y of moving one s.d. from the population average of gender? This makes less sense to me than using a binary non-demeaned version.
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#2

15 Feb 2022, 12:05

"This makes less sense to me than using a binary non-demeaned version."

I agree with you.

There are circumstances where it is important to center variables in models with interactions. For example, sometimes that solves numerical instability problems, or convergence issues. It can also affect the interpretation of the relationships between random intercepts and random slopes in multi-level models. But demeaning simple dichotomous variables is seldom useful, and it certainly makes it difficult-to-impossible to interpret the results. I think you were given bad advice. But, in fairness, there may be aspects of this that I'm unaware of, so my advice would be to go back to the person who recommended that and ask why.
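To make the interpretation problem concrete, here is a minimal sketch using Stata's shipped auto dataset as a stand-in (an assumption; it is not the data from #1). foreign plays the role of a 0/1 variable and weight that of a continuous one; foreign_c is a hypothetical name. In a two-way interaction, demeaning the dichotomous variable leaves its own coefficient and the interaction coefficient unchanged. Only the constant and the coefficient on the continuous variable move, and the latter becomes the slope at a value of foreign (its sample mean) that no observation actually has.

Code:

```stata
* Stand-in data; variable names are illustrative, not from the thread
sysuse auto, clear

* Demean the 0/1 variable
summarize foreign, meanonly
local m = r(mean)
generate double foreign_c = foreign - `m'

* (1) Ordinary 0/1 coding
regress price i.foreign##c.weight
* Slope of weight evaluated at foreign = sample mean, computed from model (1)
lincom _b[weight] + `m'*_b[1.foreign#c.weight]

* (2) Demeaned coding: the weight coefficient reproduces the lincom result,
*     while the foreign_c and interaction coefficients match those in (1)
regress price c.foreign_c##c.weight
```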

Finally, I don't understand what is going on with variable imp. Because it appears in an interaction term with no prefix, Stata will treat it as a discrete variable. If it has been demeaned then it will necessarily contain some negative values, which is not permissible in factor-variable notation. Did you mean -reg y gender##race##c.imp-?
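A minimal sketch of the difference the prefix makes, using the shipped auto dataset as an assumed stand-in (weight_c is a hypothetical name for a demeaned continuous variable, playing the role of imp):

Code:

```stata
sysuse auto, clear
summarize weight, meanonly
generate double weight_c = weight - r(mean)

* No prefix: ## treats weight_c as a factor variable, and Stata stops with an
* error because factor variables may not hold negative or noninteger values.
* -capture noisily- shows the error but lets the do-file continue.
capture noisily regress price foreign##weight_c

* c. declares weight_c continuous; i. marks foreign as categorical
regress price i.foreign##c.weight_c
```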
Adrienne Wold

Join Date: Dec 2016

Posts: 139
#3

22 Feb 2022, 07:56

Thanks for the response. The imp variable should indeed be written as c.imp; thank you for this.

While I'm still figuring out the exact goal of this, I am wondering how we would interpret coefficients if we did proceed with this version and the model is:

y = a + b(race) + c(party) + d(imp) + e(race*party) + f(race*imp) + g(imp*party) + h(race*party*imp)

If I'm interpreting the coefficient on imp, would the interpretation be something along the lines of: a unit increase in the imp scale is associated with a change in y of d for *respondents with mean values of race and party*, or *when race and party are at their average values in the sample*?

Further, if I'm trying to interpret the interaction terms, focusing on imp again, would I interpret this as:
d+f+g+h would be the marginal effect of imp on y if race and party were one unit greater than their mean values (or if race and party were the categories where race=1 and party=1)?
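
As a sketch of the arithmetic only: in the equation above the slope of imp is d + f*race + g*party + h*race*party, so d+f+g+h is the slope wherever race and party both equal 1 as entered in the model. The example below assumes the shipped auto dataset as a stand-in, with 0/1 (not demeaned) coding; mpg plays the role of imp, and highrep is a hypothetical second binary variable created for illustration.

Code:

```stata
sysuse auto, clear
generate byte highrep = rep78 >= 4 if !missing(rep78)

regress price i.foreign##i.highrep##c.mpg

* Sum of the four coefficients involving mpg (d + f + g + h in the notation above)
lincom _b[mpg] + _b[1.foreign#c.mpg] + _b[1.highrep#c.mpg] + _b[1.foreign#1.highrep#c.mpg]

* The same quantity from -margins-: slope of mpg when both dummies equal 1
margins, dydx(mpg) at(foreign = 1 highrep = 1)
```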
Clyde Schechter

Join Date: Apr 2014

Posts: 29796
#4

22 Feb 2022, 11:44

Well, in a strictly mathematical sense what you say in #3 is correct. But there is no such thing as "the mean value of race" or "the mean value of party." The use of numbers to represent race and party is just arbitrary and you could assign any set of distinct numbers for the purpose, and that would change your results. So I think this is a misguided approach.

I would model this in a way that respects the categorical nature of the race and party variables.

Code:

regress y c.imp##i.race##i.party

And then, I would ignore the regression output and go straight to -margins- to find the predicted values or marginal effects of interest.
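
A minimal sketch of that workflow, assuming the shipped auto dataset as a stand-in for the poster's data: mpg plays the role of imp, foreign that of race, and highrep (a hypothetical binary created here) that of party.

Code:

```stata
sysuse auto, clear
generate byte highrep = rep78 >= 4 if !missing(rep78)

regress price c.mpg##i.foreign##i.highrep

* Marginal effect (slope) of the continuous variable in each of the four cells
margins foreign#highrep, dydx(mpg)

* Predicted outcome in each cell at two illustrative values of mpg
margins foreign#highrep, at(mpg = (15 25))
```

With the variables from this thread, the analogous commands would be -margins race#party, dydx(imp)- and -margins race#party, at(imp = (numlist))-, with the values of imp chosen from its observed range.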