  • Spurious Results? Model Specification with Interaction Term

    Hi,

    I have aggregate time series trading data for two distinct groups of investors (group A and group B). I want to estimate the effect of variable X_t on their daily trading volume VOL_t. The two groups have been constructed based on a criterion that depends to a certain extent on a daily trading signal. This signal can be represented by a dummy variable D_t, which is 1 if the signal is observed on day t and 0 otherwise. Investors in group A trade relatively frequently in line with that signal, whereas investors in group B trade essentially at random with respect to it.

    My hypothesis is that explanatory variable X_t only has a positive effect on VOL_t for investors of group A on signal days (i.e., D_t = 1). It should have no effect on VOL_t for group A investors if D_t = 0. Also, it should have no (or at least a smaller) effect on VOL_t for group B investors.

    I am having trouble constructing a model for this.

    My attempt was to fit a regression model for VOL_t of group A and a model for VOL_t of group B of the form:

    VOL_t = a + b1 * D_t + b2 * X_t + b3 * (D_t * X_t) + e_t

    where (D_t * X_t) is the interaction term between X_t and D_t.
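
    A minimal Stata sketch of this specification for one group, using the variable names VOL_A, D, and X that appear in the output. The simulated data, the seed, and the coefficients used to generate it are placeholders so that the commands run on their own; they are not the actual data. Factor-variable notation builds the interaction, so no D_times_X variable has to be created by hand.

    ```stata
    * Placeholder data (assumption): replace with the real daily series.
    clear
    set seed 2019
    set obs 500
    generate t = _n
    generate byte D = runiform() < 0.3
    generate X = rnormal()
    generate VOL_A = 0.5 + 0.8*D + 0.15*D*X + rnormal()
    tsset t

    * i.D##c.X expands to D, X, and D*X, i.e. b1, b2, and b3 in the model.
    regress VOL_A i.D##c.X, vce(robust)

    * Slope of X on non-signal days (b2) and on signal days (b2 + b3).
    margins D, dydx(X)

    * Direct test of the signal-day slope: b2 + b3 = 0.
    lincom X + 1.D#c.X
    ```

    The hypothesis concerns the slope of X on signal days, which is b2 + b3 rather than b3 alone, so margins or lincom reports the quantity of interest directly.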

    If I estimate this model separately for group A and group B trading volume, I get the results hypothesized above.


    Results for group A investors:
    Code:
                             |               Robust
                       VOL_A |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
                           D |    .776241   .0581441    13.35   0.000     .6619864    .8904956
                           X |   .0012703   .0045436     0.28   0.780     -.007658    .0101986
                   D_times_X |   .1556122   .0475769     3.27   0.001     .0621225    .2491019

    Results for group B investors:
    Code:
                             |               Robust
                       VOL_B |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
                           D |  -.1899324   .0453358    -4.19   0.000    -.2790081   -.1008567
                           X |   .0048146   .0095859     0.50   0.616    -.0140198     .023649
                   D_times_X |   .0717573   .0398264     1.80   0.072    -.0064935    .1500082

    However, I am not sure whether this is a good specification, because VOL_t is correlated with D_t by construction for group A (even though the correlation is quite low: corr(VOL_t,D_t) = 0.04 for group A and corr(VOL_t,D_t) = −0.01 for group B).

    So, my question is: Can I interpret a positive and significant coefficient b3 in the regression model for group A as evidence that trading volume is higher when X_t is higher on days with D_t = 1? Or could this result simply be spurious because of the relation between VOL_t and D_t for group A?

    If so, how could I improve the design of the regression model?

    Thanks already for any input!
    Last edited by Rolf Miller; 21 Nov 2019, 19:06.

  • #2
    You didn't get a quick answer. You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output (fuller output), and sample data using dataex. We don't even know exactly what model you ran - I would guess regression but that is just a guess.
    Why not do the entire estimation at one time and do a dummy interaction for the A and B groups? This will also make it easier to test parameter equality across the two groups. Also, you'll find it much easier to use factor variable notation for the interactions - they make it easy to use margins after the estimation.
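
    A minimal sketch of that pooled approach, assuming the two series are stored as VOL_A and VOL_B alongside D and X. The simulated data are placeholders so that the commands run on their own, the names grp and group are made up for illustration, and clustering by day is an assumption (it allows the two groups' errors on the same day to be correlated), not something the original post specifies.

    ```stata
    * Placeholder data (assumption): replace with the real daily series.
    clear
    set seed 2019
    set obs 500
    generate t = _n
    generate byte D = runiform() < 0.3
    generate X = rnormal()
    generate VOL_A = 0.5 + 0.8*D + 0.15*D*X + rnormal()
    generate VOL_B = 0.5 - 0.2*D + 0.05*D*X + rnormal()

    * Stack the two series: one row per day and group (grp is "_A" or "_B").
    reshape long VOL, i(t) j(grp) string
    * encode numbers the groups alphabetically: 1 = _A, 2 = _B.
    encode grp, generate(group)

    * Fully interacted model: every coefficient is allowed to differ by group.
    regress VOL i.group##i.D##c.X, vce(cluster t)

    * Slope of X for each group on non-signal and signal days.
    margins group#D, dydx(X)

    * Does the D*X interaction (b3) differ between group A and group B?
    test 2.group#1.D#c.X
    ```

    Because the model is fully interacted, the point estimates match those from the two separate regressions; what the pooled fit adds is a direct test of equality across groups.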
