

  • Including an interaction term vs running a regression with a portion of the dataset

    I ran an ordered probit regression with the following equation:

    (Eq.1) Y = B0 + B1 X1 + B2 Not_Poor + Other controls

    Where Y is a categorical variable, X1 is a continuous variable, and Not_Poor is a dichotomous variable that takes the value 1 if an individual is not poor and 0 otherwise.

    I got an insignificant B1. Theory suggests, however, that the relationship between X1 and Y should be stronger among the poor. To test this, I have considered two methods.

    A) Keeping only the poor individuals of my dataset (i.e., running the line "keep if Not_Poor==0") and running Eq.1 again.

    B) Adding an interaction term between X1 and Not_Poor as follows: Y = B0 + B1 X1 + B2 Not_Poor + B3 (Not_Poor * X1) + Controls.

    In both cases, B1 is the effect of X1 on Y among the poor. However, I find that B1 is non-significant with method A and significant with method B.

    My questions are:

    i) What could explain this discrepancy?
    ii) Under which circumstances should method A be preferred to method B, and vice versa?

    Thank you in advance!
    Last edited by Santiago Valdivieso; 06 Mar 2022, 15:43.

  • #2
    1) Different regression specifications return different coefficients, so the discrepancy is not surprising.
    2) Approach B is the way to go, as the interaction gives you the effect of X1 for both rich and poor, something that you cannot get with approach A.
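
    To make point 1 concrete, here is a sketch of what each method estimates. The notation beyond the thread's own symbols is an assumption for illustration: Y^* stands for the latent index behind the ordered outcome, Z for the vector of other controls, and the superscript A marks quantities estimated on the poor subsample only.

    ```latex
    % Assumed notation (not in the original posts): Y^* latent index,
    % Z other controls, superscript A = estimated on the poor subsample only.
    \begin{align*}
    \text{Method A (Not\_Poor = 0 only):} \quad
      & Y^* = B_0^{A} + B_1^{A} X_1 + Z'\gamma^{A} + \varepsilon \\
    \text{Method B at Not\_Poor = 0:} \quad
      & Y^* = B_0 + B_1 X_1 + Z'\gamma + \varepsilon \\
    \text{Method B at Not\_Poor = 1:} \quad
      & Y^* = (B_0 + B_2) + (B_1 + B_3) X_1 + Z'\gamma + \varepsilon
    \end{align*}
    ```

    Method A re-estimates every parameter on the poor alone, whereas method B forces the control coefficients (and, in an ordered probit, the cutpoints and error scale) to be common to both groups and estimates them on the full sample. So B1 from method B need not equal B1 from method A, and its standard error can be smaller. In a linear model the two point estimates would coincide if every control were also interacted with Not_Poor; in an ordered probit the shared cutpoints and error scale remain a further difference. Under method B the slope among the non-poor is B1 + B3, and B3 itself tests whether the two slopes differ.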
    Kind regards,
    (StataNow 18.5)


    • #3
      Thanks, Carlo Lazzaro!

