Puzzling results for interaction term in pooled and unpooled analyses

Mattia Gatti

Join Date: May 2023
Posts: 42

16 Oct 2024, 10:33

Dear all,

I am having trouble understanding why I get different results for my interaction term in a pooled model and in the two unpooled models.

This is an example of my stacked dataset, with one row per x7 for each value of x6. Two of these rows per x7 (x6=0 and x6=1) enter the analysis.

Code:

clear
input float(x1 x2) double(x3 x4) float x5 byte x6 str20 x7 float y
      2.5        4   6.1 0 21 2 "AGALEV"    4.818182
        1        4   6.1 0 21 1 "AGALEV"    6.416667
        3        4   6.1 0 21 0 "AGALEV"        6.25
      2.5 6.636364  8.89 1 21 2 "CDV"      4.7272725
        3 6.636364  8.89 1 21 1 "CDV"       5.333333
      5.2 6.636364  8.89 1 21 0 "CDV"       6.833333
      2.5 3.909091  6.14 0 21 2 "ECOLO"     4.818182
        1 3.909091  6.14 0 21 1 "ECOLO"     6.272727
      3.4 3.909091  6.14 0 21 0 "ECOLO"         6.25
        2        7  7.56 1 21 2 "PRL/MR"    5.090909
      1.5        7  7.56 1 21 1 "PRL/MR"   4.4545455
      2.4        7  7.56 1 21 0 "PRL/MR"    8.083333
     2.75      7.5  9.46 0 21 2 "PSBE"     4.2727275
      2.5      7.5  9.46 0 21 1 "PSBE"             4
      2.2      7.5  9.46 0 21 0 "PSBE"      8.166667
2.3333333      6.7   3.7 0 21 2 "PSC/CDH"  4.3636365
        3      6.7   3.7 0 21 1 "PSC/CDH"          5
      4.4      6.7   3.7 0 21 0 "PSC/CDH"   6.416667
 2.666667    8.125  8.62 0 10 2 "PVDA-PTB"         4
        3    8.125  8.62 0 10 1 "PVDA-PTB"       3.4
       .6    8.125  8.62 0 10 0 "PVDA-PTB"  8.833333
      2.5 6.818182  6.71 0 21 2 "SP/SPA"    4.181818
      2.5 6.818182  6.71 0 21 1 "SP/SPA"   4.5833335
      2.2 6.818182  6.71 0 21 0 "SP/SPA"    7.916667
     2.75      9.2 11.95 0 21 2 "VB"       4.7272725
        2      9.2 11.95 0 21 1 "VB"        6.916667
      3.6      9.2 11.95 0 21 0 "VB"               4
      1.5 6.636364  8.54 1 21 2 "VLD/PVV"   5.090909
      1.5 6.636364  8.54 1 21 1 "VLD/PVV"   5.166667
      2.6 6.636364  8.54 1 21 0 "VLD/PVV"   8.166667
     3.75 8.090909 16.03 0 21 2 "VU/NVA"   4.2727275
        5 8.090909 16.03 0 21 1 "VU/NVA"    5.416667
      2.8 8.090909 16.03 0 21 0 "VU/NVA"         7.5
2.5714285 4.076923     3 0  1 2 "A"         4.285714
 .6666667 4.076923     3 0  1 1 "A"         7.857143
        3 4.076923     3 0  1 0 "A"        3.4285715
1.4444444 9.357142   8.7 0 21 2 "DF"        6.785714
1.3333334 9.357142   8.7 0 21 1 "DF"        7.928571
        6 9.357142   8.7 0 21 0 "DF"        4.142857
        4 3.857143   6.9 0 21 2 "ELDK"      5.769231
1.6666666 3.857143   6.9 0 21 1 "ELDK"      7.142857
        1 3.857143   6.9 0 21 0 "ELDK"      7.357143
 3.333333 7.285714   6.6 0 21 2 "KF"        4.714286
2.3333333 7.285714   6.6 0 21 1 "KF"        5.642857
1.6666666 7.285714   6.6 0 21 0 "KF"        7.357143
    3.875 8.571428   2.3 0 10 2 "LA"        3.642857
 5.333333 8.571428   2.3 0 10 1 "LA"        4.428571
        1 8.571428   2.3 0 10 0 "LA"        8.428572
        2 9.214286   2.4 0  1 2 "NB"        4.714286
        1 9.214286   2.4 0  1 1 "NB"        7.583333
        2 9.214286   2.4 0  1 0 "NB"        5.714286
1.1111112 7.071429   8.6 0 21 2 "RV"        7.142857
        1 7.071429   8.6 0 21 1 "RV"        7.928571
 3.333333 7.071429   8.6 0 21 0 "RV"               6
 3.111111 6.928571  25.9 1 21 2 "SDDK"      4.714286
        3 6.928571  25.9 1 21 1 "SDDK"           5.5
2.3333333 6.928571  25.9 1 21 0 "SDDK"      7.357143
 3.888889 4.857143   7.7 0 21 2 "SFDK"      4.571429
1.6666666 4.857143   7.7 0 21 1 "SFDK"      5.571429
1.6666666 4.857143   7.7 0 21 0 "SFDK"      6.142857
4.3333335 6.785714  23.4 0 21 2 "v"         5.214286
        6 6.785714  23.4 0 21 1 "v"              5.5
 2.666667 6.785714  23.4 0 21 0 "v"         6.714286
 2.909091        7  12.6 0  6 2 "AfD"       6.714286
 3.166667        7  12.6 0  6 1 "AfD"       9.428572
 6.666667        7  12.6 0  6 0 "AfD"       3.190476
 2.636364 8.315789  26.8 1 21 2 "CDUGE"     6.857143
 5.666667 8.315789  26.8 1 21 1 "CDUGE"     6.238095
4.5833335 8.315789  26.8 1 21 0 "CDUGE"     6.476191
 3.636364 8.315789   6.2 1 21 2 "CSU"            6.6
      3.4 8.315789   6.2 1 21 1 "CSU"       7.619048
 3.090909 8.315789   6.2 1 21 0 "CSU"       6.523809
        .        .     . 0  6 2 "DieTier"          4
        3        .     . 0  6 1 "DieTier"          7
        2        .     . 0  6 0 "DieTier"        2.5
 3.727273 8.263158  10.7 0 21 2 "FDP"       5.666667
 3.083333 8.263158  10.7 0 21 1 "FDP"       5.047619
1.8333334 8.263158  10.7 0 21 0 "FDP"            8.1
1.4545455 4.842105   8.9 0 21 2 "GRUNEN"    7.333333
      1.5 4.842105   8.9 0 21 1 "GRUNEN"    8.476191
     3.75 4.842105   8.9 0 21 0 "GRUNEN"           5
 5.090909 6.055555   9.2 0 21 2 "LINKE"         4.85
4.3333335 6.055555   9.2 0 21 1 "LINKE"     5.238095
1.9166666 6.055555   9.2 0 21 0 "LINKE"     8.095238
        0      1.5     . 0  6 2 "Piraten"       5.25
1.6666666      1.5     . 0  6 1 "Piraten"        8.6
 2.666667      1.5     . 0  6 0 "Piraten"        2.2
2.3636363 5.578948  20.5 1 21 2 "SPD"       6.857143
4.1666665 5.578948  20.5 1 21 1 "SPD"       5.476191
      5.5 5.578948  20.5 1 21 0 "SPD"       7.666667
      1.4 9.666667   3.7 0  1 2 "EL"           4.375
        1 9.666667   3.7 0  1 1 "EL"            8.25
 2.666667 9.666667   3.7 0  1 0 "EL"           4.375
      1.5 6.666667   .74 0  1 2 "KIDISO"    7.142857
        2 6.666667   .74 0  1 1 "KIDISO"        6.75
        2 6.666667   .74 0  1 0 "KIDISO"         7.2
.16666667        9   5.3 0 21 2 "KKE"       6.222222
       .5        9   5.3 0 21 1 "KKE"           3.25
        0        9   5.3 0 21 0 "KKE"       9.111111
        1      5.5  3.44 0  1 2 "MR25"         7.625
end

The following is the model, output, and marginsplot of my pooled analysis:

Code:

eststo seven: reg y c.x1##c.x2 x3 x4 x5 i.country if x6!=2 & in_model_1==1, vce(cl x7)
margins, dydx(x1) at(x2=(0(0.5)10))
marginsplot, title("Average Marginal Effects of x1 on y (95% CIs)") xtitle("x2") ///
addplot(histogram x2 if x6!=2, freq width(0.5) yaxis(2) yscale(alt axis(2)) fcolor(%25) lc(black%50))

Code:

Linear regression                               Number of obs     =        258
                                                F(20, 129)        =      11.74
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3411
                                                Root MSE          =     1.2825

                                   (Std. err. adjusted for 130 clusters in x7)
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x1 |   .0019245   .2214274     0.01   0.993     -.436175    .4400241
          x2 |   .0689307   .0806277     0.85   0.394    -.0905932    .2284545
             |
   c.x1#c.x2 |  -.0763485   .0301067    -2.54   0.012    -.1359155   -.0167816
             |
          x3 |   .0439528   .0079439     5.53   0.000     .0282355      .05967
          x4 |   .0707958    .146916     0.48   0.631    -.2198811    .3614728
          x5 |   .0015976   .0101892     0.16   0.876     -.018562    .0217572
             |
     country |
      2. dk  |   .1339352   .2632911     0.51   0.612    -.3869927     .654863
      3. ge  |   .8146273   .2339428     3.48   0.001     .3517659    1.277489
      4. gr  |   .5158083   .3239174     1.59   0.114    -.1250702    1.156687
     5. esp  |   .0285887   .2713843     0.11   0.916    -.5083519    .5655292
      6. fr  |   .7092217   .2187707     3.24   0.002     .2763785    1.142065
     7. irl  |    -.68028   .2382757    -2.86   0.005    -1.151714   -.2088458
      8. it  |   .2901257   .4193663     0.69   0.490    -.5396007    1.119852
     10. nl  |   -.420152   .1813169    -2.32   0.022    -.7788919   -.0614122
     11. uk  |   -.787641   .2225044    -3.54   0.001    -1.227871   -.3474106
    12. por  |   .7449562   .3259486     2.29   0.024     .1000589    1.389854
    13. aus  |  -.8840575   .3138242    -2.82   0.006    -1.504966   -.2631485
    14. fin  |   .6545474   .1682645     3.89   0.000     .3216321    .9874628
     16. sv  |   .4814474    .216804     2.22   0.028     .0524953    .9103995
    38. lux  |  -1.392723   .4609494    -3.02   0.003    -2.304723   -.4807235
             |
       _cons |   6.653327   .6476639    10.27   0.000     5.371908    7.934746
------------------------------------------------------------------------------

[Figure: Graphstatalist1.png. Marginsplot for the pooled model: average marginal effects of x1 on y (95% CIs) over x2, with a histogram of x2.]

Unfortunately, when I run the analyses separately for the two values of x6 I am interested in, the results do not add up. Neither interaction is significant at the 5% level, and while the interaction in the model for x6=0 is in the expected direction, its p-value is only just below 0.10. How can these results be explained?

Code:

eststo seven2: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==0 & in_model_2==1
margins, dydx(x1) at(x2=(0(0.5)10))
marginsplot, title("Average Marginal Effects of x1 on y (95% CIs)") xtitle("x2") ///
addplot(histogram x2 if x6==0 & in_model_2==1, freq width(0.5) yaxis(2) yscale(alt axis(2)) fcolor(%25) lc(black%50))

eststo seven3: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==1 & in_model_3==1
margins, dydx(x1) at(x2=(0(0.5)10))
marginsplot, title("Average Marginal Effects of x1 on y (95% CIs)") xtitle("x2") ///
addplot(histogram x2 if x6==1 & in_model_3==1, freq width(0.5) yaxis(2) yscale(alt axis(2)) fcolor(%25) lc(black%50))

Code:

eststo seven2: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==0 & in_model_2==1
      Source |       SS           df       MS      Number of obs   =       128
-------------+----------------------------------   F(20, 107)      =      4.76
       Model |  150.651879        20  7.53259395   Prob > F        =    0.0000
    Residual |  169.297938       107  1.58222372   R-squared       =    0.4709
-------------+----------------------------------   Adj R-squared   =    0.3720
       Total |  319.949817       127   2.5192899   Root MSE        =    1.2579

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x1 |   .1074972   .3360307     0.32   0.750    -.5586445    .7736388
          x2 |   .0709244    .180704     0.39   0.695    -.2873001     .429149
             |
   c.x1#c.x2 |    -.08677   .0495838    -1.75   0.083    -.1850642    .0115242
             |
          x3 |   .0609723    .016123     3.78   0.000     .0290104    .0929342
          x4 |    .137091   .3109673     0.44   0.660    -.4793655    .7535475
          x5 |   .0312748   .0174638     1.79   0.076    -.0033452    .0658947
             |
     country |
      2. dk  |  -.8865927   .5647997    -1.57   0.119    -2.006242    .2330568
      3. ge  |  -.4909954   .6170599    -0.80   0.428    -1.714245    .7322539
      4. gr  |   .0128348   .6165485     0.02   0.983    -1.209401     1.23507
     5. esp  |  -.3776846    .531973    -0.71   0.479    -1.432259    .6768899
      6. fr  |   .3147261   .5968196     0.53   0.599     -.868399    1.497851
     7. irl  |  -1.004265   .6580029    -1.53   0.130    -2.308679    .3001494
      8. it  |  -.3107438   .6299975    -0.49   0.623     -1.55964    .9381529
     10. nl  |  -1.235017   .5238642    -2.36   0.020    -2.273517   -.1965178
     11. uk  |   -2.06065   .6022026    -3.42   0.001    -3.254446   -.8668532
    12. por  |   .5106341   .6342881     0.81   0.423    -.7467681    1.768036
    13. aus  |   -1.83555    .699734    -2.62   0.010    -3.222691   -.4484088
    14. fin  |  -.5542002   .6047479    -0.92   0.362    -1.753042    .6446421
     16. sv  |  -.7864335   .5936533    -1.32   0.188    -1.963282     .390415
    38. lux  |  -1.986957   .6895806    -2.88   0.005     -3.35397   -.6199439
             |
       _cons |   6.848582   1.376672     4.97   0.000      4.11949    9.577673
------------------------------------------------------------------------------
eststo seven3: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==1 & in_model_3==1

      Source |       SS           df       MS      Number of obs   =       130
-------------+----------------------------------   F(20, 109)      =      5.44
       Model |  132.957861        20  6.64789305   Prob > F        =    0.0000
    Residual |   133.32071       109  1.22312578   R-squared       =    0.4993
-------------+----------------------------------   Adj R-squared   =    0.4075
       Total |  266.278571       129  2.06417497   Root MSE        =     1.106

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x1 |  -.6511678   .3113023    -2.09   0.039    -1.268159   -.0341768
          x2 |  -.0603839    .112382    -0.54   0.592    -.2831215    .1623536
             |
   c.x1#c.x2 |   .0043718   .0439778     0.10   0.921    -.0827908    .0915343
             |
          x3 |   .0330859   .0147282     2.25   0.027     .0038952    .0622767
          x4 |  -.0551796   .2737978    -0.20   0.841    -.5978379    .4874787
          x5 |  -.0272743    .014766    -1.85   0.067      -.05654    .0019914
             |
     country |
      2. dk  |   1.199387   .4903972     2.45   0.016     .2274362    2.171339
      3. ge  |   2.207378     .54445     4.05   0.000     1.128296     3.28646
      4. gr  |   1.114636   .5351736     2.08   0.040     .0539394    2.175333
     5. esp  |   .4695195   .4680648     1.00   0.318    -.4581697    1.397209
      6. fr  |   1.346269   .5183206     2.60   0.011     .3189745    2.373564
     7. irl  |   -.153926   .5391264    -0.29   0.776    -1.222457     .914605
      8. it  |   1.005785   .5587891     1.80   0.075    -.1017168    2.113287
     10. nl  |        .46   .4622662     1.00   0.322    -.4561965    1.376197
     11. uk  |   .6379496   .5302325     1.20   0.232    -.4129541    1.688853
    12. por  |   1.022597   .5574282     1.83   0.069    -.0822071    2.127402
    13. aus  |  -.0046027   .6292805    -0.01   0.994    -1.251816    1.242611
    14. fin  |   2.056141   .5390641     3.81   0.000     .9877332    3.124548
     16. sv  |   1.785488   .5192244     3.44   0.001      .756402    2.814574
    38. lux  |  -.5289811   .6191384    -0.85   0.395    -1.756093     .698131
             |
       _cons |   7.331186   .8890873     8.25   0.000     5.569044    9.093328
------------------------------------------------------------------------------

[Figures: Graphstatalist2ec.png and Graphstatalist3cu.png. Marginsplots for the two subset models: average marginal effects of x1 on y (95% CIs) over x2, each with a histogram of x2.]

Sincerely
Mattia


Daniel Schaefer

Join Date: Mar 2020

Posts: 806
#2

16 Oct 2024, 10:54

Pooling is not the only difference between these models. You are also estimating them on different subsets of the data. The first model uses observations where x6 equals 0 or 1 (assuming x6 is never negative), the second uses observations where x6 equals 0, and the third uses observations where x6 equals 1. You also restrict the observations with three different indicator variables: in_model_1, in_model_2, and in_model_3. In addition, only the first model uses the vce() option.

It could be that the subsets for the last two models are too small to detect the interaction, or the size of the effect could depend on the subset of observations used. The information provided here is not enough to tell which. You might try to isolate these differences between the models and test them one at a time to see where the discrepancy emerges. For example, if you run the first model exactly as it is but without the vce() option at the end, does the result for the interaction change?
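
A minimal sketch of that one-difference-at-a-time check. It runs on Stata's bundled nlsw88 data, not on the data from this thread, so wage, tenure, ttl_exp, grade, industry, and collgrad are stand-ins chosen only for illustration (collgrad plays the role of x6, industry the role of the cluster variable). The marker variable touse is also an assumption of this sketch.

```stata
* Illustration on bundled data; variable names are stand-ins, not the thread's.
sysuse nlsw88, clear

* Step 1: clustered fit. Save the estimation sample so later fits reuse it
* (observations with a missing cluster variable are excluded here).
regress wage c.tenure##c.ttl_exp grade, vce(cluster industry)
generate byte touse = e(sample)
scalar b_cl  = _b[c.tenure#c.ttl_exp]
scalar se_cl = _se[c.tenure#c.ttl_exp]

* Step 2: same model, same sample, default standard errors.
* Only the vce() option changes, so only the standard error can change.
regress wage c.tenure##c.ttl_exp grade if touse
scalar b_ols  = _b[c.tenure#c.ttl_exp]
scalar se_ols = _se[c.tenure#c.ttl_exp]

display "clustered: b = " b_cl  "  se = " se_cl
display "default:   b = " b_ols "  se = " se_ols

* Step 3: same specification on each subset, still within the saved sample.
* Now the sample changes and everything else stays fixed.
forvalues g = 0/1 {
    quietly regress wage c.tenure##c.ttl_exp grade if touse & collgrad == `g'
    display "collgrad = `g': N = " e(N) ///
        "  b = " _b[c.tenure#c.ttl_exp] "  se = " _se[c.tenure#c.ttl_exp]
}
```

Step 2 should reproduce the point estimate from step 1 exactly, so any change in significance there comes from the standard errors alone. Differences that appear only in step 3 come from the subsetting.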
George Ford

Join Date: Aug 2014

Posts: 3036
#3

16 Oct 2024, 11:10

Many of the coefficients differ across the three models (signs, sizes, significance), including the country fixed effects.
Mattia Gatti

Join Date: May 2023

Posts: 42
#4

17 Oct 2024, 04:30

Dear Daniel and George,

To clarify, by "unpooled" I mean that I am comparing the first (pooled) model, estimated on observations where x6 equals 0 or 1, with the second and third models, where I subset the data by the value of x6 (either 0 or 1). x6 is a trichotomous variable, and x6=2 is excluded from the analysis.
The in_model_* conditions are actually redundant in this context, as they do not change the number of observations (the n of every other model I run is based on the n presented here). Sorry for the additional confusion.

Thanks for your answers. I think I see what you are saying. Could the results and the statistical significance in the subsets simply reflect the smaller sample sizes? My initial understanding was the following: if an interaction effect exists, so that the effect of x1 on y is conditional on x2, then after subsetting the sample on a meaningful variable (in this case x6), the two subset interaction effects should together add up to something close to the pooled interaction effect. Alternatively, I would have expected that if one of the two subsets returned a non-significant interaction coefficient (in this case x6=1), then the other subset (in this case x6=0) would return a much more statistically significant interaction coefficient to "offset" it, ultimately "making sense" of the results of the first model.
In short, should I not expect that?

Finally, I tried Daniel's suggestion and removed the vce() option, yet the confidence intervals and significance of the effect do not change much (still p<0.05).

Sincerely
Mattia
George Ford

Join Date: Aug 2014

Posts: 3036
#5

17 Oct 2024, 09:27

In the pooled model, you are imposing the constraint that all the coefficients are the same across x6. It is not uncommon for the coefficients to vary across subsamples (though you'd typically prefer they not).
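
As a sketch of what that constraint means, in notation that is not from the thread (g is the value of x6, z collects x3, x4, x5 and the country dummies, and X_g is the regressor matrix of subset g), and assuming the three models share the same regressors and the two subsets partition the pooled sample, as the posted outputs suggest (128 + 130 = 258):

```latex
% Unpooled: one regression per group g = x6 \in \{0, 1\}
y_i = \beta_0^{(g)} + \beta_1^{(g)} x_{1i} + \beta_2^{(g)} x_{2i}
      + \beta_3^{(g)} x_{1i} x_{2i} + z_i'\gamma^{(g)} + \varepsilon_i

% Pooled: the same equation under the restriction that nothing depends on g
\beta_k^{(0)} = \beta_k^{(1)} = \beta_k \quad (k = 0, \dots, 3),
\qquad \gamma^{(0)} = \gamma^{(1)} = \gamma

% OLS: the pooled coefficient vector is a matrix-weighted average
% of the two subset coefficient vectors, not their sum
\hat{\theta}_{\text{pooled}}
  = \left( X_0'X_0 + X_1'X_1 \right)^{-1}
    \left( X_0'X_0 \, \hat{\theta}^{(0)} + X_1'X_1 \, \hat{\theta}^{(1)} \right),
\qquad \theta = (\beta_0, \beta_1, \beta_2, \beta_3, \gamma')'
```

Read this way, the posted numbers are not contradictory: the pooled interaction of -.076 sits between the subset estimates of -.087 (x6=0) and .004 (x6=1), while each subset has roughly half the observations and a larger standard error (.050 and .044 against .030).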

If you just want to know whether x6 has an effect on the interaction, then you could interact x6 with the x1*x2 interaction (and maybe with x1 too) in the pooled sample.
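
A hedged sketch of that suggestion, again on Stata's bundled nlsw88 data rather than the thread's data: collgrad stands in for x6, tenure and ttl_exp for x1 and x2, and grade for the other covariates. The fully interacted form shown here, where the group indicator is interacted with every regressor, is one possible choice and not necessarily what was meant above. It is used because it reproduces the subset point estimates exactly, which makes the link between pooled and unpooled fits visible.

```stata
* Illustration on bundled data; variable names are stand-ins, not the thread's.
sysuse nlsw88, clear

* Unpooled fits, one per group: store the interaction coefficient of each.
quietly regress wage c.tenure##c.ttl_exp grade if collgrad == 0
scalar b3_g0 = _b[c.tenure#c.ttl_exp]
quietly regress wage c.tenure##c.ttl_exp grade if collgrad == 1
scalar b3_g1 = _b[c.tenure#c.ttl_exp]

* Pooled fit with the group indicator interacted with every regressor.
* vce(robust) is a choice of this sketch: it allows the error variance to
* differ between groups, as the separate fits implicitly do.
regress wage i.collgrad##c.tenure##c.ttl_exp i.collgrad##c.grade, vce(robust)

* Base-group interaction matches the collgrad == 0 fit;
* base plus the three-way term matches the collgrad == 1 fit.
scalar b3_pool_g0 = _b[c.tenure#c.ttl_exp]
scalar b3_pool_g1 = _b[c.tenure#c.ttl_exp] + _b[1.collgrad#c.tenure#c.ttl_exp]
display "group 0: " b3_pool_g0 "  vs  " b3_g0
display "group 1: " b3_pool_g1 "  vs  " b3_g1

* Does the interaction differ between the groups?
test 1.collgrad#c.tenure#c.ttl_exp

* Marginal effect of tenure over ttl_exp, separately by group.
margins collgrad, dydx(tenure) at(ttl_exp = (0(10)20))
```

In the thread's notation the analogous command would be along the lines of `regress y i.x6##c.x1##c.x2 x3 x4 x5 i.country if x6 != 2, vce(cluster x7)`, which is an assumption about the intended specification. That version lets only x1, x2, and their product vary with x6, so it will not reproduce the subset fits exactly unless x3, x4, x5, and i.country are interacted with i.x6 as well.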
Mattia Gatti

Join Date: May 2023

Posts: 42
#6

18 Oct 2024, 09:12

Dear George,

Thanks a lot, all clear. I will give it a try.

Sincerely
Mattia
