  • regression with multiple interaction terms that interact a continuous variable with a dummy variable

    Dear Statalist,

    I have a probit model with several interaction terms. The model is:
    Code:
    probit DV c.var1##i.H_d1 c.var1##i.H_d2 c.var1##i.H_d3 controls i.industry i.year, vce(cluster symbol)
    where DV is a dummy (hence the probit model); var1 is a continuous variable; and H_d1, H_d2, H_d3 are three dummies: H_d1 = 1 if the value of d1 is above its industry median in a year, with analogous definitions for H_d2 and H_d3.
    I use the following code to generate H_d1 (the same method is used for H_d2 and H_d3):
    Code:
    egen quantiles_d1 = xtile(d1), by(year industry) nq(2)
    
    gen H_d1 = 1 if quantiles_d1 == 1
    replace H_d1 = 0 if quantiles_d1 == 2
    Later I want to use L_d1, L_d2, L_d3 instead (coded 1 when below the industry median in a year), so I use the following code to generate L_d1 (the same method is used for L_d2 and L_d3):
    Code:
    egen quantiles_d1 = xtile(d1), by(year industry) nq(2)
    
    gen L_d1 = 1 if quantiles_d1 == 2
    replace L_d1 = 0 if quantiles_d1 == 1
    So basically L_d1 = 1 exactly when H_d1 = 0. I expect the coefficients on the IVs to have exactly the opposite sign and the same magnitude. That is indeed the case for the dummies alone and for the interaction terms, but not for var1. Specifically, the results look like this:

    Using H_d1 H_d2 H_d3:
    Code:
    -----------------------------------------------------------------------------------------------------
                                        |               Robust
                                     DV |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------------------------+----------------------------------------------------------------
                                   var1 |   4.429392   .8964463     4.94   0.000      2.67239    6.186395
                                        |
                                 1.H_d1 |    .343354   .1052533     3.26   0.001     .1370614    .5496466
                                        |
                             H_d1#cvar1 |
                                     1  |  -1.466482   .6460026    -2.27   0.023    -2.732623   -.2003396
                                        |
                                 1.H_d2 |   .2115119   .0902183     2.34   0.019     .0346873    .3883365
                                        |
                             H_d2#cvar1 |
                                     1  |  -1.016277   .6039841    -1.68   0.092    -2.200065    .1675097
                                        |
                                 1.H_d3 |   .2059276    .095487     2.16   0.031     .0187765    .3930786
                                        |
                             H_d3#cvar1 |
                                     1  |  -1.638603    .590966    -2.77   0.006    -2.796875   -.4803306
    Using L_d1 L_d2 L_d3:
    Code:
    -----------------------------------------------------------------------------------------------------
                                        |               Robust
                                     DV |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------------------------+----------------------------------------------------------------
                                   var1 |   .3080308   .8580195     0.36   0.720    -1.373656    1.989718
                                        |
                                 1.L_d1 |   -.343354   .1052533    -3.26   0.001    -.5496466   -.1370614
                                        |
                             L_d1#cvar1 |
                                     1  |   1.466482   .6460026     2.27   0.023     .2003396    2.732623
                                        |
                                 1.L_d2 |  -.2115119   .0902183    -2.34   0.019    -.3883365   -.0346873
                                        |
                             L_d2#cvar1 |
                                     1  |   1.016277   .6039841     1.68   0.092    -.1675097    2.200065
                                        |
                                 1.L_d3 |  -.2059276    .095487    -2.16   0.031    -.3930786   -.0187765
                                        |
                             L_d3#cvar1 |
                                     1  |   1.638603    .590966     2.77   0.006     .4803306    2.796875

    I wonder why var1 has such a different coefficient and significance level in the two regressions. I'm really confused. Please kindly give me some suggestions. Thanks!

  • #2
    This is to be expected. In fact, something would be wrong if you got the same coefficient for var1 in these models.

    The problem is that you are misunderstanding how the coefficients of the constituent terms work in interaction models. The coefficient labeled var1 in such a model is NOT "the effect" of var1. In fact, in an interaction model there is no such thing as the effect of var1. Rather, that coefficient is an effect of var1, conditional on all of the variables with which it is interacted being 0. It is the effect of var1 in the subset of observations where all three of the ?_d* variables are 0.

    In the two models you show, L_d1 = 1 - H_d1 (and likewise for d2 and d3). So if L_d1 == 0, then H_d1 == 1, and vice versa. Consequently, the coefficient labeled var1 represents two completely different (in fact, in a sense, opposite) things in the two models. In the first model it is the effect for the subset where H_d1 = H_d2 = H_d3 = 0; in the second it is the effect for the subset where L_d1 = L_d2 = L_d3 = 0. The coefficient could only be the same in both models if the interaction coefficients cancelled out (summed to zero).
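    To see this concretely: the two specifications are just reparameterizations of the same model, so the var1 coefficient in the H-version equals the var1 coefficient in the L-version plus the sum of the L interaction coefficients. From your own output: 0.3080308 + 1.466482 + 1.016277 + 1.638603 = 4.429393, which is (up to rounding) the 4.429392 you got with the H dummies.

    If what you want is an effect of var1 that does not depend on which set of dummies you use, -margins- gives the same answer after either model. A minimal sketch, using the variable names from your post (controls stands in for your control variables):
    Code:
    probit DV c.var1##i.H_d1 c.var1##i.H_d2 c.var1##i.H_d3 controls i.industry i.year, vce(cluster symbol)

    * average marginal effect of var1 on Pr(DV): identical in the H and L parameterizations
    margins, dydx(var1)

    * marginal effect of var1 with H_d1 fixed at 0 and at 1 (likewise for H_d2, H_d3)
    margins, dydx(var1) at(H_d1=(0 1))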

    As an aside, you could have generated these d* variables more compactly:

    Code:
    gen byte L_d1 = (quantiles_d1 == 1) if !missing(quantiles_d1)
    gen byte H_d1 = 1 - L_d1
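    As a quick consistency check (a sketch, assuming quantiles_d1 is the variable created by the xtile step above), the two indicators should be exact complements wherever quantiles_d1 is nonmissing:
    Code:
    * verify H_d1 and L_d1 are complements on the nonmissing sample
    assert H_d1 == 1 - L_d1 if !missing(quantiles_d1)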
    Last edited by Clyde Schechter; 22 Mar 2022, 12:38.



    • #3
      Thanks, Clyde, for your explanation. I understand now!
