
  • Interpretation of interaction terms in models with standardised variables

    Hello All,

    I am running a multiple regression that looks like this:
    Code:
    reg car c.indexA##c.indexB  controls
    I am interested mainly in the impact of indexB on car (cumulative abnormal returns) and in whether the interaction with indexA makes any difference.

    I have standardised all my variables as I read it was common practice in M&A research, and that it helps to get more stable results. However, since I am using interaction terms, I am struggling with the interpretation of each term and the effect they have on the also standardised dependent variable. Can someone help me understand the interpretation of standardised interactions on a standardised dependent variable? And does it add any value to my regression to have standardised variables? i.e. am I more likely to get significant results if I standardise my variables?

    An excerpt of my data is provided below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(car indexA indexB zlvalue zpremium)
    -.04265274 -.031235166  .09289306 -1.3743476          .
     .05864397    .8657029  -2.986656   .5466814  -.7211906
      .9124655   .04350964   .3392571 -1.4847015 -1.1008819
     -.4733196   .04350964   .3392571 -1.1832519  -.6927265
      .3147303  -1.6008778 -2.0012002 -.07337776   .4242368
      2.403794   -2.946285   .3392571 -1.0905318          .
    -1.1410446   1.0899378   .3392571 -1.2296888 -1.0978322
      .5169623   -.0760824  .08673403 -.05377782  -.7135663
      .4899784    .4022853   .8135074 -.27336898          .
     .10567318  -.17325082  -3.017451 -1.1589853          .
    -.43367136  -2.2735817   .4624389  -.5439201 -.28711247
     -.7486368   1.2394273   .9551666  -.9914123  -.9766055
    -.18198207   .26774403 -.03028878 -1.0312611  -.9560198
      .3653311    1.015193  -.2766528  1.1410673 -1.0863957
      .1568007   .41723365   .3392571  -.3966787 -1.0658101
    -.13443676   .26774403 -1.1389264  -.9418356 -1.0365835
     .09781788   .19299924  -.2150619 -1.2979195  -.8502958
    -.14205901    1.314172  -.2150619  .25849387 -1.0607272
    -.02275374   .19299924   .6472117 -1.1530515 -1.0630145
    -1.2362403    .7909581  -.2766528   .3465459  -.8231024
    end
    Thanks in advance for your help.

    Best wishes,
    Henry

  • #2
    However, since I am using interaction terms, I am struggling with the interpretation of each term and the effect they have on the also standardised dependent variable. Can someone help me understand the interpretation of standardised interactions on a standardised dependent variable?
    Well, fortunately, you did not standardize the interaction term itself. And from your data example it does not appear that you standardized car either. Those are both good things. If you had standardized the interaction term, it would be nearly impossible to interpret. The interaction term here means the same thing it would in a model with any other interacted variables: a 1 SD difference in indexB (equivalently, a unit increase in standardized indexB) is associated with an expected difference in car that is given by the formula:

    Code:
    expected difference in car = (coefficient of indexB) + (interaction coefficient) * (standardized value of indexA)
    In other words, the effect of B depends on the level of A in a linear way. This is the standard interpretation of interactions, and it is no different here. It's really a mouthful of words, and I think that unless you are dealing with technically sophisticated math-loving audiences, a picture is worth many thousands of words:

    Code:
    margins, at(indexA = (-2(0.5)2) indexB = (-2(0.5)2))
    marginsplot, xdimension(indexB)
    If you want to show the marginal effect of B as a function of A, that would be:

    Code:
    margins, dydx(indexB) at(indexA = (-2(1)2))
    marginsplot
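    The formula above can be made concrete with a toy calculation. A minimal sketch in Python (not Stata), with invented coefficients; the values 0.15 and -0.10 are hypothetical, not estimates from your data:

```python
# Hypothetical coefficients, invented purely for illustration:
b_B  = 0.15    # coefficient of standardized indexB
b_AB = -0.10   # coefficient of the indexA#indexB interaction

# Expected change in car per 1-SD increase in indexB, evaluated
# at several (standardized) values of indexA:
for a in (-1.0, 0.0, 1.0):
    print(f"indexA = {a:+.1f} SD -> effect of indexB = {b_B + b_AB * a:+.3f}")
```

    The effect of indexB rises or falls linearly as indexA moves, which is exactly what the marginsplot graph above displays.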

    And does it add any value to my regression to have standardised variables?
    That depends on what these indices are. If they have units of measurement of their own or are generally understood in your field, then standardizing them serves primarily to obfuscate your analysis and results. So if you're of the "I can't dazzle them with brilliance, so I'll baffle them with BS" school, then standardization is just the ticket. But if these indices are dimensionless and will be unfamiliar to your audience (perhaps they are homebrew measures), then standardization adds clarity.

    Think of it this way: if, in a different context, B were measured in liters and the outcome in kilometers, and the coefficient of B in the analysis were so many kilometers per liter, you would immediately grasp the meaning of that. But if I told you that I had standardized B, and that my coefficient was so many kilometers per standard deviation of B, you could not even begin to interpret it until you delved into my specific data to find out how many liters the standard deviation of B actually is. So standardization there just obfuscates things.

    By contrast, if B is a "flurp score" and I tell you that the coefficient of B is so many kilometers per point on the flurp scale, you would have no idea what that means, because you don't know what flurp scores look like or what they mean. You wouldn't even have a rough sense of whether that is a big coefficient or a little one. But if I standardize B and tell you that the coefficient is so many kilometers per standard deviation on the flurp scale, you at least have a sense of the scale: assuming flurp scores are more or less normally distributed, a difference from 0.5 SD below to 0.5 SD above the mean approximately corresponds to the difference between the 70th and 30th percentiles. The standardization takes you from a completely incomprehensible, inherently meaningless measurement to locations on a normal curve, a sense of the scale of things. So here standardization improves things.
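    The percentile figures above are easy to check against the standard-normal CDF; a quick sketch in Python using only the standard library:

```python
from math import erf, sqrt

def phi(x):
    """Standard-normal CDF, written via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Percentile ranks at 0.5 SD above and below the mean:
print(round(100 * phi(0.5)))    # about the 69th percentile
print(round(100 * phi(-0.5)))   # about the 31st percentile
```

    So 0.5 SD on either side of the mean lands at roughly the 69th and 31st percentiles, consistent with the "70th and 30th" figure quoted above.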

    am I more likely to get significant results if I standardise my variables?
    If by "significant" you mean statistically significant, the answer is no. Standardization of the variables has no effect at all on p-values. That said, you should not be choosing your analytic approach based on what gives you statistically significant results. That's not science; it's scientific misconduct.
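    To illustrate that invariance numerically: a minimal simulation sketch in Python/NumPy (not Stata), with made-up data and the OLS slope t-statistic computed by hand:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
x = rng.normal(5.0, 2.0, n)                   # arbitrary mean and SD
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, n)   # simulated outcome

def slope_t(x, y):
    """t-statistic of the slope in a simple OLS regression of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)          # coefficient covariance
    return beta[1] / np.sqrt(cov[1, 1])

z = lambda v: (v - v.mean()) / v.std(ddof=1)   # standardize

# Standardizing x, y, or both rescales the coefficient, but the
# t-statistic (and hence the p-value) is exactly unchanged:
print(slope_t(x, y), slope_t(z(x), y), slope_t(z(x), z(y)))
```

    Centering and rescaling change the slope coefficient itself, but the coefficient and its standard error scale by the same factor, so their ratio, and therefore the p-value, is untouched.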



    • #3
      Very helpful answer, Clyde. The margins command will be particularly helpful. Thanks again for your time and work.

      • #4
        One thing I'll add is that standardization actually does have an effect on the p-values for the main effects, since each main effect is then estimated at the mean of the other variable (which is 0 after standardization) rather than at a raw value of 0. With a significant interaction, though, the tests of significance for the main effects are generally not of interest anyway. The test of significance for the interaction term will be the same for standardized and unstandardized variables.

        • #5
          Brad Anderson is correct. In the rare circumstance where the statistical significance of the "main" effects in the interaction model is of interest in its own right, those p-values do change, for the reason he gives. However, the more meaningful p-values, those associated with the joint significance of the main effects and the interaction, or of either main effect and the interaction, are not changed.
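          Both points can be checked numerically. A minimal simulation sketch in Python/NumPy (not Stata), with made-up data: the main-effect t-statistics differ between the raw and standardized fits, because standardizing moves the point at which each main effect is evaluated to the mean of the other variable, while the interaction's t-statistic is identical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
A = rng.normal(2.0, 3.0, n)      # deliberately non-zero mean, non-unit SD
B = rng.normal(-1.0, 2.0, n)
y = 0.5 + 0.3 * A - 0.2 * B + 0.4 * A * B + rng.normal(0.0, 1.0, n)

def t_stats(y, cols):
    """OLS with an intercept; returns t-statistics for each regressor."""
    X = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta / np.sqrt(np.diag(cov))

z = lambda v: (v - v.mean()) / v.std(ddof=1)   # standardize
t_raw = t_stats(y, [A, B, A * B])
t_std = t_stats(y, [z(A), z(B), z(A) * z(B)])

print(t_raw[1], t_std[1])   # main-effect t-statistics: different
print(t_raw[3], t_std[3])   # interaction t-statistic: identical
```

          The two design matrices span the same column space, and the interaction coefficient and its standard error both scale by the product of the two SDs, so the interaction's t-statistic (and p-value) is unchanged; the main-effect t-statistics answer different questions in the two fits and so differ.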
