  • Linear regression & interaction terms

    Hey everybody,

    I have a linear regression model where I used interaction terms to see if my two treatments modify the relationship between originality of an answer and three other dimensions (fluency, flexibility, elaboration).

    Code:
    . reg originality treat1 treat2 fluency flexibility elaboration treat1_flu treat1_flex treat1_elab treat2_flu treat2_flex treat2_elab, robust
    
    Linear regression                               Number of obs     =        178
                                                    F(11, 166)        =       4.64
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.2189
                                                    Root MSE          =     .06348
    
    ------------------------------------------------------------------------------
                 |               Robust
     originality |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          treat1 |   .0945984   .0417611     2.27   0.025     .0121469    .1770498
          treat2 |   .0959994   .0397926     2.41   0.017     .0174346    .1745642
         fluency |   .0011194   .0021622     0.52   0.605    -.0031495    .0053883
     flexibility |    .007972   .0051042     1.56   0.120    -.0021056    .0180495
     elaboration |   .0054839   .0029268     1.87   0.063    -.0002947    .0112626
      treat1_flu |   .0065834   .0044946     1.46   0.145    -.0022905    .0154574
     treat1_flex |  -.0180415   .0088497    -2.04   0.043     -.035514   -.0005691
     treat1_elab |  -.0003588   .0051295    -0.07   0.944    -.0104864    .0097687
      treat2_flu |    .003234   .0029645     1.09   0.277     -.002619     .009087
     treat2_flex |  -.0100783   .0060007    -1.68   0.095    -.0219259    .0017692
     treat2_elab |  -.0044916   .0038248    -1.17   0.242    -.0120432    .0030599
           _cons |   .7248947   .0313272    23.14   0.000     .6630435    .7867458
    ------------------------------------------------------------------------------
    Although there are statistically significant coefficients for some interaction terms in the model, when I test whether the interaction terms are jointly significant, they are not.

    Code:
    test treat1_flu treat2_flu treat1_flex treat2_flex treat1_elab treat2_elab
    
     ( 1)  treat1_flu = 0
     ( 2)  treat2_flu = 0
     ( 3)  treat1_flex = 0
     ( 4)  treat2_flex = 0
     ( 5)  treat1_elab = 0
     ( 6)  treat2_elab = 0
    
           F(  6,   166) =    1.15
                Prob > F =    0.3380
    How is that possible? Is my approach correct? Does this mean that I should leave all interaction terms out of the regression and only use the main effects? Like this:

    Code:
    reg originality treat1 treat2 fluency flexibility elaboration
    Thanks a lot in advance!

  • #2
    This situation is not uncommon, and it doesn't mean you did anything wrong in your modeling. The joint significance of a group of variables is not a simple function of the p-values of the individual variables. First note that only one of these variables has a p-value < 0.05 (the conventional significance level), and that one only barely so. The test of significance for a single variable basically looks at that variable's coefficient divided by its standard error and compares it to a threshold in the t distribution. But the test of significance for several variables jointly looks, more or less, at the sum of the squares of the coefficients and their cross-products, divided by a combined variance estimator, and compares that to a threshold in the F distribution. (Geometrically, the one-variable test asks whether a point falls in an interval; the multiple-variable test asks whether a point falls in a multi-dimensional ellipsoid, and an ellipsoid whose axes are usually oblique to the variable axes.) Even if one term in this sum is somewhat large, if the others are small, the net effect may be a small F statistic, as in your case.
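
    In symbols (just a sketch, in generic matrix notation rather than anything Stata-specific): the single-coefficient test is based on $t_j = \hat{\beta}_j / \widehat{\mathrm{se}}(\hat{\beta}_j)$, whereas the joint test of $q$ restrictions (here $q = 6$: all interaction coefficients equal to zero) compares
    $$
    F \;=\; \frac{1}{q}\,\hat{\beta}_R'\,\widehat{V}_R^{-1}\,\hat{\beta}_R
    $$
    to an $F(q,\ \text{residual df})$ distribution, where $\hat{\beta}_R$ stacks the six tested coefficients and $\widehat{V}_R$ is the corresponding block of the robust variance-covariance matrix. Because $\widehat{V}_R$ includes the covariances among the estimates, one moderately large coefficient can be swamped by the others, which is how you can end up with one individually "significant" interaction but a jointly insignificant set.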

    If the goal of the research was to identify whether either treatment modifies the effects of fluency, flexibility, and elaboration, then the job is done: the answer is no, and you don't need to do the second regression, as you already have your answer. If, on the other hand, you want to additionally identify the direct effects of the treatments, or of fluency etc., then the second regression would be appropriate.

    If you originally had two separate hypotheses to test, one about treatment 1 and the other about treatment 2, then instead of the omnibus test you ran, you should test them separately:
    Code:
    test treat1_flu treat1_flex treat1_elab
    test treat2_flu treat2_flex treat2_elab
    There may be other sets of tests that are germane. It really depends on what your research hypothesis was--you have to tailor the test to it.
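
    As a side note, if you use Stata's factor-variable notation instead of hand-built interaction variables, testparm can run these joint tests without your having to list the generated variable names. A sketch, assuming treat1 and treat2 are 0/1 indicators and reusing your variable names (run from a do-file because of the line continuations):
    Code:
    reg originality i.treat1 i.treat2 c.fluency c.flexibility c.elaboration      ///
        i.treat1#c.fluency i.treat1#c.flexibility i.treat1#c.elaboration         ///
        i.treat2#c.fluency i.treat2#c.flexibility i.treat2#c.elaboration, robust
    testparm i.treat1#c.fluency i.treat1#c.flexibility i.treat1#c.elaboration
    testparm i.treat2#c.fluency i.treat2#c.flexibility i.treat2#c.elaboration
    This should reproduce the same model as your hand-coded version, so the coefficients and the joint tests should match.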



    • #3
      Thanks a lot, that was really helpful! I have one last question, if you would be so kind?
      If the goal of the research was to identify whether either treatment modifies the effects of fluency, flexibility, and elaboration, then the job is done: the answer is no, and you don't need to do the second regression, as you already have your answer. If, on the other hand, you want to additionally identify the direct effects of the treatments, or of fluency etc., then the second regression would be appropriate.
      I actually started with the second regression, because I want to know the direct effects of the treatments and the three variables, and only then added the interaction terms, since I thought I should also check whether either treatment modifies the effects of the three variables. According to your statement, this would mean that for the direct effects I should look at the second regression without interaction terms and could not use the coefficients from the first regression with interaction terms. Is that because of the joint insignificance?
      Could I report it that way, arguing that the coefficients for the direct effects of the treatments, fluency, etc. from the first regression with interaction terms are not "valid" due to this joint insignificance, and therefore use the ones from the second regression without interaction terms? I'm not quite sure how to structure my report...

      Sorry for asking such beginner questions; this is the first time I'm writing an empirical paper.

      Thank you!



      • #4
        According to your statement, this would mean that for the direct effects I should look at the second regression without interaction terms and could not use the coefficients from the first regression with interaction terms. Is that because of the joint insignificance?
        Yes. Your data have demonstrated that they are consistent with the absence of effect modification. So the model without interaction terms would be used instead to estimate the treatment effects (and the covariate effects if that is part of the plan).
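
        For example (a sketch, again reusing your variable names and assuming treat1 and treat2 are 0/1 indicators), the no-interaction model and the average treatment effects could be obtained with:
        Code:
        reg originality i.treat1 i.treat2 c.fluency c.flexibility c.elaboration, robust
        margins, dydx(treat1 treat2)
        In a linear model without interactions the marginal effects from margins just reproduce the treatment coefficients, but keeping margins in the workflow is convenient if you ever reintroduce interactions.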

        Could I report it that way, arguing that the coefficients for the direct effects of the treatments, fluency, etc. from the first regression with interaction terms are not "valid" due to this joint insignificance, and therefore use the ones from the second regression without interaction terms? I'm not quite sure how to structure my report...
        The interaction term coefficients are valid; they're just not distinguishable from zero in your data, and collectively they do not signal the presence of any effect modification.

        So, based on your explanation, I would probably start the report with the no-interaction-terms regression and discuss the findings about the treatments (and the covariates, if that is part of the goal), and then simply add a comment that you did an additional analysis to see whether the treatment effects were modified by the covariates and found no evidence that they were. If you, or your intended audience, expected before the analysis that there would be effect modification, you might try to explain away the absence of an effect-modification signal as a matter of statistical power or noisy measurements--assuming, that is, that you really are underpowered or really do have noisy measurements. With 178 observations and 3 treatment groups, if they are of about equal size, you have about 60 observations per group, which should be enough to detect at least moderately large interaction effects if your measurements are precise, but probably not enough to detect small ones or to cope with low-reliability measurements.

