  • Interaction term and lincom

    Hi all,

    I am running a basic interaction model, y = b0 + b1*X1 + b2*D + b3*X1*D, where D is a dummy variable. Here b1 is the slope of X1 for the group with D=0, and b1+b3 is the slope for D=1. The confusing part is the significance of the coefficients. I have b1 negative and significant and b3 positive but not significant which indicates there is no slope difference between two groups i.e. the effect of X1 is negative and significant for D=1 as well. However, when I use lincom b1 + b3 to get the slope of X1 in the D=1 group, I get a number that is not statistically significant. How is it possible to have an interaction term that is not significant and, at the same time, a sum that is also not significant?
    Thanks

  • #2
    This kind of confusion is one of the many excellent reasons that the American Statistical Association now recommends abandoning the term "statistically significant" and its derivatives. Effect sizes should be presented along with estimates of their uncertainties, such as confidence intervals or standard errors, and, where useful, with p-values. But dichotomizing p-values at 0.05 or other cutoffs and calling some results "significant" and others "not" is no longer recommended. See https://www.tandfonline.com/doi/full...5.2019.1583913 for the full article. A briefer "pep talk" on the same topic is available at https://www.nature.com/articles/d41586-019-00857-9.

    The confusion you are experiencing is due to the artificial nature of statistical significance and the inappropriateness of reasoning in binary terms about what are in reality continuous random variables (p-values).

    and b3 positive but not significant which indicates there is no slope difference between two groups i.e. the effect of X1 is negative and significant for D=1 as well
    Even if you choose to linger in the bad old days of dichotomous statistical significance, this statement is simply wrong. Neither the conclusion that there is no slope difference between the two groups, nor the conclusion that the effect of X1 is negative and significant for D = 1 as well is correct even when using statistical significance "correctly." It is, however, precisely because using statistical significance "correctly" is both very difficult for human brains to do consistently, and so rarely done in practice, that the American Statistical Association now recommends we abandon the practice.

    • #3
      Hi Clyde,

      Thank you for the response and most importantly thank you for the links, I had not seen this discussion before and it was very useful reading.

      For the sake of clarification, if I were to behave stubbornly and choose to linger in the bad old days, could you please explain why my conclusions are incorrect? The lincom command for b1 + b3 would test whether the slope of X1 for D=1 is zero or statistically different from zero. In my case I get a p-value that indicates I cannot reject H0, and thus I can conclude that X1 has no effect for the D=1 group. Additionally, b1 is negative and significant, which means that X1 has a negative effect on Y for D=0, while b3, which tests whether there is a slope difference between the two groups, indicates that the difference is not statistically significant. Thus one should conclude that X1 has the same effect for both groups. My question was how it is possible to conclude, on the one hand, that the effect for both groups is negative and, on the other hand, to have a slope coefficient for D=1 that is not different from zero.

      I would really appreciate if you could elaborate a bit more on why this type of interpretation is incorrect (conditional on bad old day take on the issue)

      Thanks
      Last edited by Hovhannes Nahapetyan; 09 Apr 2019, 17:22.

      • #4
        The lincom command for b1 + b3 would test whether the slope of X1 for D=1 is zero or statistically different from zero.
        This is the place where you go wrong (and where so many users of the concept of statistical significance go wrong), and everything from that point on is also wrong. The mistake is taking a non-significant result to mean that the slope is zero. It does not mean that. It means that the data are unable to identify the magnitude or sign of the coefficient with sufficient precision to determine whether it is zero or not. That's the real meaning of statistical significance.

        When you understand it that way, you realize that the "non-significance" of the interaction coefficient does not mean that the slope for D = 1 is the same as the slope for D = 0. It just means we can't tell if they are different or not. And it therefore also does not imply that slope for D = 1 is "significant and negative" just because that of D = 0 is. It could differ from the D = 0 slope in either direction by an appreciable amount.
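
        The arithmetic behind this can be sketched with a small worked example. The numbers below are invented purely for illustration and do not come from the original poster's model; only the model equation is taken from #1.

        ```latex
        % Model from post #1 and the slope of X_1 in each group
        \begin{align*}
        y &= b_0 + b_1 X_1 + b_2 D + b_3 X_1 D \\
        \text{slope of } X_1 \text{ when } D = 0 &: \; b_1 \\
        \text{slope of } X_1 \text{ when } D = 1 &: \; b_1 + b_3 \\[4pt]
        % What lincom b1 + b3 uses for its standard error
        \operatorname{Var}(\hat b_1 + \hat b_3)
          &= \operatorname{Var}(\hat b_1) + \operatorname{Var}(\hat b_3)
             + 2\operatorname{Cov}(\hat b_1, \hat b_3) \\[4pt]
        % Hypothetical estimates (assumed for illustration only)
        \hat b_1 &= -0.50,\; \mathrm{SE} = 0.20,\; z = -2.50 \\
        \hat b_3 &= 0.30,\; \mathrm{SE} = 0.35,\; z \approx 0.86 \\
        \operatorname{Cov}(\hat b_1, \hat b_3) &= -0.04 \\[4pt]
        \operatorname{Var}(\hat b_1 + \hat b_3) &= 0.04 + 0.1225 - 0.08 = 0.0825,
          \quad \mathrm{SE} \approx 0.287 \\
        \hat b_1 + \hat b_3 &= -0.20, \quad z \approx -0.70
        \end{align*}
        ```

        With these made-up values, b1 is estimated precisely enough to exclude zero, while neither b3 nor b1 + b3 is. That is the pattern described in #1: nothing contradictory has happened, the data simply do not pin down the D = 1 slope.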

        • #5
          Thank you for your detailed responses, Clyde.

          • #6
            In reply to Clyde Schechter (#4):
            Following these answers, I need to know whether there is a significant difference between my IV and my MV. Using the lincom command, I got the following:
            Code:
             lincom log_ID_pay - ( consh_id)
            
             ( 1)  [restate_dum]log_ID_pay - [restate_dum]consh_id = 0
            
            ------------------------------------------------------------------------------
             restate_dum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     (1) |   -.323614   .1418672    -2.28   0.023    -.6016687   -.0455593
            ------------------------------------------------------------------------------
            where consh_id is the MV (log_ID_pay*controlling shareholders).
            I want to know whether my OLS model (including the interaction) ensures the significance level of consh_id, so I used the lincom command as above.
            Can I say that my OLS model ensures the significance level of my MV?
            Please advise, @Clyde Schechter.

            • #7
              I do not follow what you are doing here. I might be able to be more helpful if you show the exact regression command you ran and the complete output of the regression, along with an explanation of the relevant variables.

              • #8
                Following up on my own post (#6):
                OK. First, this is the main command:
                Code:
                logit restate_dum log_ID_pay consh_id av_id_age n_of_fem_id_ num_of_id_job_experience num_of_id_degree_expert num_of_id_financial_expertise num_of_id_major_expertise board_size ceo_duality board_indp number_of_board_meetings firm_size2 leverage roa btm1 tobinq1 big4 is_there_defect_ soe f_onwer_contius cross_listed_all region_status controlling_share_dum i.Sic_g
                consh_id is the interaction variable in the model (log_ID_pay*controlling shareholders).

                I got a comment from a reviewer saying: "As there is an interaction effect in your OLS model, how does the way for OLS ensure that the statistical significance of its parameters is correct under the current model?"

                I was advised to check the statistical significance of no interaction vs. interaction.
                So I ran the following commands:
                Code:
                mat list e(b)
                lincom log_ID_pay - (consh_id)
                and got the following output:
                Code:
                ( 1)  [restate_dum]log_ID_pay - [restate_dum]consh_id = 0
                
                ------------------------------------------------------------------------------
                 restate_dum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         (1) |   -.323614   .1418672    -2.28   0.023    -.6016687   -.0455593
                ------------------------------------------------------------------------------
                So, can I say that my OLS model ensures the significance level of my MV?

                • #9
                  If consh_id is the interaction variable, then I don't see that your -lincom- makes sense. You just need to look at the output in the regression table for consh_id itself. The reviewer wants you to say whether it is statistically significant or not. There is nothing more to say or do about that.

                  My discussion earlier in the thread is based on my rejection of statistical significance as a valid way to approach these questions. But you have a reviewer asking you to do that, and I am not in a position to put you into a quarrel with the reviewer over basic principles of statistics. You have asked what you need to do in response to the reviewer: you need to report only the statistical significance of the consh_id variable in your regression output. The question of whether that is sensible, meaningful, correct, or even sane is a different, and very long discussion.

                  • #10
                    In reply to Clyde Schechter (#9):
                    Actually the coefficient of consh_id is highly significant at p<0.01.
                    I don't know how to respond to this comment.

                    • #11
                      So your response to the reviewer should be that the modification of the log pay effect by controlling shareholders is statistically significant. Now, since you are not using factor-variable notation in your regression, I cannot be certain what kind of variable controlling shareholders is. Because you have a variable called controlling_share_dum, I will infer that it is dichotomous, with values 0 and 1. So you should now report two separate effects of log pay, one for each value of controlling shareholders. The effect (in log odds ratio metric) of log pay when controlling_share_dum == 0 is just the coefficient of log_ID_pay. To get the effect of log pay when controlling_share_dum == 1, you would use -lincom- as follows:
                      Code:
                      lincom log_ID_pay + consh_id
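
                      The same idea can be sketched with factor-variable notation, which lets Stata build and label the interaction itself. This is only an illustration: it uses the auto dataset shipped with Stata and -regress- with mpg, weight, and foreign as stand-ins, all of which are assumptions and not the poster's data or model.

                      ```stata
                      * Illustrative stand-in data: Stata's bundled auto dataset (not the poster's data)
                      sysuse auto, clear

                      * c.weight##i.foreign expands to weight, foreign, and their interaction
                      regress mpg c.weight##i.foreign

                      * Slope of weight when foreign == 0: the coefficient on weight itself
                      lincom weight

                      * Slope of weight when foreign == 1: main effect plus interaction coefficient
                      lincom weight + 1.foreign#c.weight

                      * The same two group-specific slopes in a single table
                      margins foreign, dydx(weight)
                      ```

                      After -logit-, -lincom- works the same way on the log-odds scale; -margins- would need the predict(xb) option to report slopes on that scale rather than on the probability scale.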

                      • #12
                        Alkebsee:
                        Clyde's very detailed reply made me reconsider my previous reply about -lincom- (please see https://www.statalist.org/forums/for...raction-effect, #11).
                        Clyde is indeed correct when he recommends looking at the regression output table, as -lincom-, if coded correctly, can only return the very same results.
                        The following toy example (using -logit-) puts numbers on the point above:
                        Code:
                        use "C:\Program Files\Stata17\ado\base\a\auto.dta"
                        . logit foreign c.price##i.rep78
                        
                        note: 1.rep78 != 0 predicts failure perfectly;
                              1.rep78 omitted and 2 obs not used.
                        
                        note: 2.rep78 != 0 predicts failure perfectly;
                              2.rep78 omitted and 8 obs not used.
                        
                        note: 5.rep78 omitted because of collinearity.
                        note: 2.rep78#c.price omitted because of collinearity.
                        note: 5.rep78#c.price omitted because of collinearity.
                        Iteration 0:   log likelihood = -38.411464 
                        Iteration 1:   log likelihood = -26.721804 
                        Iteration 2:   log likelihood = -25.836528 
                        Iteration 3:   log likelihood =   -25.6437 
                        Iteration 4:   log likelihood = -25.639654 
                        Iteration 5:   log likelihood = -25.639649 
                        Iteration 6:   log likelihood = -25.639649 
                        
                        Logistic regression                                     Number of obs =     59
                                                                                LR chi2(5)    =  25.54
                                                                                Prob > chi2   = 0.0001
                        Log likelihood = -25.639649                             Pseudo R2     = 0.3325
                        
                        -------------------------------------------------------------------------------
                              foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
                        --------------+----------------------------------------------------------------
                                price |   .0013649   .0012935     1.06   0.291    -.0011703       .0039
                                      |
                                rep78 |
                                   1  |          0  (empty)
                                   2  |          0  (empty)
                                   3  |   4.211232   5.946067     0.71   0.479    -7.442844    15.86531
                                   4  |   4.126661   5.945849     0.69   0.488    -7.526989    15.78031
                                   5  |          0  (omitted)
                                      |
                        rep78#c.price |
                                   1  |          0  (empty)
                                   2  |          0  (empty)
                                   3  |  -.0016261   .0013396    -1.21   0.225    -.0042518    .0009995
                                   4  |  -.0012258   .0013252    -0.92   0.355    -.0038232    .0013716
                                   5  |          0  (omitted)
                                      |
                                _cons |  -4.970512   5.663738    -0.88   0.380    -16.07124    6.130211
                        -------------------------------------------------------------------------------
                        
                        . mat list e(b)
                        
                        e(b)[1,12]
                               foreign:    foreign:    foreign:    foreign:    foreign:    foreign:    foreign:    foreign:    foreign:    foreign:    foreign:
                                                1b.         2o.          3.          4.         5o.   1b.rep78#   2o.rep78#    3.rep78#    4.rep78#   5o.rep78#
                                 price       rep78       rep78       rep78       rep78       rep78    co.price    co.price     c.price     c.price    co.price
                        y1   .00136488           0           0   4.2112319   4.1266608           0           0           0  -.00162614  -.00122579           0
                        
                               foreign:
                                      
                                 _cons
                        y1  -4.9705124
                        
                        . lincom price
                        
                         ( 1)  [foreign]price = 0
                        
                        ------------------------------------------------------------------------------
                             foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                                 (1) |   .0013649   .0012935     1.06   0.291    -.0011703       .0039
                        ------------------------------------------------------------------------------
                        
                        . lincom (price+3.rep78#c.price )-price
                        
                         ( 1)  [foreign]3.rep78#c.price = 0
                        
                        ------------------------------------------------------------------------------
                             foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                                 (1) |  -.0016261   .0013396    -1.21   0.225    -.0042518    .0009995
                        ------------------------------------------------------------------------------
                        
                        Kind regards,
                        Carlo
                        (StataNow 18.5)

                        • #13
                          In reply to Clyde Schechter (#11):
                          Thank you so much.
                          I got it.
                          But what if controlling shareholders is a continuous variable rather than a binary one? (Another MV of mine is not dichotomous.)
                          How can I use -lincom- in that case? Or, if there is another way to check the differences, kindly guide me.

                          Thank you very much.

                          • #14
                            In reply to Carlo Lazzaro (#12):
                            I see.

                            Yes, the coefficient is not significant in this example.
                            Thank you very much.

                            • #15
                              Carlo Lazzaro and Clyde Schechter,
                              I have rerun -lincom- with plus instead of minus. The numbers changed, but the result is still significant:
                              Code:
                              ------------------------------------------------------------------------------
                               restate_dum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                       (1) |  -.3776373   .0527273    -7.16   0.000    -.4809809   -.2742937
                              ------------------------------------------------------------------------------
                              Is there a meaningful difference between the two versions (+ and -) in terms of the null hypothesis being tested and its effect on our decision?
                              Best
                              Alkebsee
