  • multiple regression with interaction effect, but which variables should interact?

    Hi there,

    I've been searching for an answer on the internet but I haven't found anything clear yet. Maybe someone here can help me.

    Let's say I have a multiple regression that I should perform for 2 groups separately.

    I want to know if the effect of mpg on price is different for the domestic group vs the foreign group. Thus I should make 1 regression and use an interaction effect of mpg and foreign.

    How do I know if I should also interact foreign with weight in the regression?

    Thank you in advance

    Code:
    sysuse auto, clear
    
    regress price mpg weight if foreign==0, robust
    
    regress price mpg weight if foreign==1, robust
    Last edited by Sandra Bloem; 20 Jul 2020, 18:40.

  • #2
    Sandra:
    just go:
    Code:
    regress price c.mpg##i.foreign weight, robust
    Kind regards,
    Carlo
    (Stata 19.0)
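
    Note: to read the model in #2, it can help to look at the mpg slope in each group directly. The following is a minimal sketch, assuming the auto data from the first post. -margins- with dydx() reports the slope of mpg for domestic and for foreign cars, and the gap between the two slopes is the coefficient on the mpg-by-foreign interaction.

    ```stata
    sysuse auto, clear

    * Model from #2: the mpg slope may differ by group, the weight slope is common
    regress price c.mpg##i.foreign weight, vce(robust)

    * Slope of mpg on price within each level of foreign (0 = domestic, 1 = foreign)
    margins foreign, dydx(mpg)
    ```

    Because weight is not interacted here, its slope is constrained to be the same in both groups, so these mpg slopes will generally not match the ones from the two separate regressions in the first post.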


    • #3
      Carlo Lazzaro

      Thank you for the quick response.

      Okay. Just to confirm: there is no need to do this, right? Does that still hold when the weight * foreign interaction term is significant?

      Code:
      regress price c.mpg##i.foreign c.weight##i.foreign, robust


      • #4
        Sandra:
        you can also include an interaction between -weight- and -foreign-.
        However, too many interactions make your results more difficult to explain.
        Kind regards,
        Carlo
        (Stata 19.0)
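
        Note: one way to see what the second interaction buys you is to compare the fully interacted model with the two separate regressions from the first post. This is a sketch, assuming the auto data. When every regressor is interacted with foreign, the group-specific slopes recovered with -lincom- should reproduce the point estimates of the separate regressions; the robust standard errors may differ slightly because the small-sample adjustment is computed on a different sample size.

        ```stata
        sysuse auto, clear

        * Separate regressions by group, as in the first post
        regress price mpg weight if foreign==0, vce(robust)
        regress price mpg weight if foreign==1, vce(robust)

        * Fully interacted model: both mpg and weight interacted with foreign
        regress price c.mpg##i.foreign c.weight##i.foreign, vce(robust)

        * Group-specific slopes recovered from the pooled model
        lincom mpg                            // mpg slope, domestic
        lincom mpg + 1.foreign#c.mpg          // mpg slope, foreign
        lincom weight                         // weight slope, domestic
        lincom weight + 1.foreign#c.weight    // weight slope, foreign
        ```

        So if the question is whether the mpg slopes from the two separate regressions differ, the model with both interactions is the one that matches that comparison.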


        • #5
          Okay. Thank you for clarifying. Could you help me with these last 3 questions? I think I almost get it.

          1) The disadvantage of using both interaction terms is complexity. What would be the advantage of using both? More accuracy? With both interaction terms, the effect of mpg on price is significantly different for the domestic versus the foreign group in my data, but with only mpg * foreign it is not. Is it better to use both in my case, or are both options acceptable as long as I explain my reasoning (for my thesis)?

          2) Let's say I use both. Is it true that to test for a significant difference in the effect of mpg on price between the domestic and foreign group, I should do -testparm c.mpg#1.foreign-? Or is it enough to just look at the p-value of this coefficient in the regression table? (Or are both good ways?)

          Code:
          testparm c.mpg#1.foreign

          3) Let's say mpg is not a continuous variable but a categorical variable with 5 categories, 1 - 5. To test for the same significant difference in effect of the mpg variable as a whole (so not each category separately), should I do -testparm i(2/5).mpg#1.foreign-? (category 1 as reference)

          Code:
          testparm i(2/5).mpg#1.foreign


          • #6
            Sandra:
            1) the goal of any regression model should be to give a true and fair representation of the data generating process. Statistical significance is not (and should not be) a goal.
            2) both will give you the same result;
            3) correct.
            Kind regards,
            Carlo
            (Stata 19.0)
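
            Note: points 2) and 3) can be checked directly. This is a sketch, assuming the auto data; the variable mpgcat is made up here (quintiles of mpg) as a stand-in for a 5-category version of mpg. With a single interaction coefficient, the F statistic from -testparm- is the square of the t statistic in the regression table, so the p-values coincide. With a categorical variable there are several interaction coefficients, and only the joint test answers the "as a whole" question.

            ```stata
            sysuse auto, clear

            * Question 2: one interaction coefficient, so testparm and the table agree
            regress price c.mpg##i.foreign c.weight##i.foreign, vce(robust)
            testparm c.mpg#1.foreign
            display "F from testparm      = " r(F)
            display "t squared from table = " (_b[1.foreign#c.mpg]/_se[1.foreign#c.mpg])^2

            * Question 3: hypothetical 5-category version of mpg (mpgcat is illustrative)
            xtile mpgcat = mpg, nquantiles(5)
            regress price i.mpgcat##i.foreign weight, vce(robust)

            * Joint test of the four interaction terms (category 1 is the reference)
            testparm i(2/5).mpgcat#1.foreign

            * Shorthand that should pick up the same non-omitted interaction terms
            testparm i.mpgcat#i.foreign
            ```

            If a category-by-group cell were empty in your own data, Stata would omit that term and the joint test would have fewer degrees of freedom.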


            • #7
              Carlo Lazzaro

              Thank you so much!
