  • Puzzling results for interaction term in pooled and unpooled analyses

    Dear all,

    I am having trouble understanding why I get different results for my interaction term in a pooled model and in the two unpooled models.

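    This is an example of my stacked dataset, with one observation per x7 for each value of x6 (only the two observations with x6 equal to 0 or 1 enter the models below).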

    Code:
    input float(x1 x2) double(x3 x4) float x5 byte x6 str20 x7 float y
          2.5        4   6.1 0 21 2 "AGALEV"    4.818182
            1        4   6.1 0 21 1 "AGALEV"    6.416667
            3        4   6.1 0 21 0 "AGALEV"        6.25
          2.5 6.636364  8.89 1 21 2 "CDV"      4.7272725
            3 6.636364  8.89 1 21 1 "CDV"       5.333333
          5.2 6.636364  8.89 1 21 0 "CDV"       6.833333
          2.5 3.909091  6.14 0 21 2 "ECOLO"     4.818182
            1 3.909091  6.14 0 21 1 "ECOLO"     6.272727
          3.4 3.909091  6.14 0 21 0 "ECOLO"         6.25
            2        7  7.56 1 21 2 "PRL/MR"    5.090909
          1.5        7  7.56 1 21 1 "PRL/MR"   4.4545455
          2.4        7  7.56 1 21 0 "PRL/MR"    8.083333
         2.75      7.5  9.46 0 21 2 "PSBE"     4.2727275
          2.5      7.5  9.46 0 21 1 "PSBE"             4
          2.2      7.5  9.46 0 21 0 "PSBE"      8.166667
    2.3333333      6.7   3.7 0 21 2 "PSC/CDH"  4.3636365
            3      6.7   3.7 0 21 1 "PSC/CDH"          5
          4.4      6.7   3.7 0 21 0 "PSC/CDH"   6.416667
     2.666667    8.125  8.62 0 10 2 "PVDA-PTB"         4
            3    8.125  8.62 0 10 1 "PVDA-PTB"       3.4
           .6    8.125  8.62 0 10 0 "PVDA-PTB"  8.833333
          2.5 6.818182  6.71 0 21 2 "SP/SPA"    4.181818
          2.5 6.818182  6.71 0 21 1 "SP/SPA"   4.5833335
          2.2 6.818182  6.71 0 21 0 "SP/SPA"    7.916667
         2.75      9.2 11.95 0 21 2 "VB"       4.7272725
            2      9.2 11.95 0 21 1 "VB"        6.916667
          3.6      9.2 11.95 0 21 0 "VB"               4
          1.5 6.636364  8.54 1 21 2 "VLD/PVV"   5.090909
          1.5 6.636364  8.54 1 21 1 "VLD/PVV"   5.166667
          2.6 6.636364  8.54 1 21 0 "VLD/PVV"   8.166667
         3.75 8.090909 16.03 0 21 2 "VU/NVA"   4.2727275
            5 8.090909 16.03 0 21 1 "VU/NVA"    5.416667
          2.8 8.090909 16.03 0 21 0 "VU/NVA"         7.5
    2.5714285 4.076923     3 0  1 2 "A"         4.285714
     .6666667 4.076923     3 0  1 1 "A"         7.857143
            3 4.076923     3 0  1 0 "A"        3.4285715
    1.4444444 9.357142   8.7 0 21 2 "DF"        6.785714
    1.3333334 9.357142   8.7 0 21 1 "DF"        7.928571
            6 9.357142   8.7 0 21 0 "DF"        4.142857
            4 3.857143   6.9 0 21 2 "ELDK"      5.769231
    1.6666666 3.857143   6.9 0 21 1 "ELDK"      7.142857
            1 3.857143   6.9 0 21 0 "ELDK"      7.357143
     3.333333 7.285714   6.6 0 21 2 "KF"        4.714286
    2.3333333 7.285714   6.6 0 21 1 "KF"        5.642857
    1.6666666 7.285714   6.6 0 21 0 "KF"        7.357143
        3.875 8.571428   2.3 0 10 2 "LA"        3.642857
     5.333333 8.571428   2.3 0 10 1 "LA"        4.428571
            1 8.571428   2.3 0 10 0 "LA"        8.428572
            2 9.214286   2.4 0  1 2 "NB"        4.714286
            1 9.214286   2.4 0  1 1 "NB"        7.583333
            2 9.214286   2.4 0  1 0 "NB"        5.714286
    1.1111112 7.071429   8.6 0 21 2 "RV"        7.142857
            1 7.071429   8.6 0 21 1 "RV"        7.928571
     3.333333 7.071429   8.6 0 21 0 "RV"               6
     3.111111 6.928571  25.9 1 21 2 "SDDK"      4.714286
            3 6.928571  25.9 1 21 1 "SDDK"           5.5
    2.3333333 6.928571  25.9 1 21 0 "SDDK"      7.357143
     3.888889 4.857143   7.7 0 21 2 "SFDK"      4.571429
    1.6666666 4.857143   7.7 0 21 1 "SFDK"      5.571429
    1.6666666 4.857143   7.7 0 21 0 "SFDK"      6.142857
    4.3333335 6.785714  23.4 0 21 2 "v"         5.214286
            6 6.785714  23.4 0 21 1 "v"              5.5
     2.666667 6.785714  23.4 0 21 0 "v"         6.714286
     2.909091        7  12.6 0  6 2 "AfD"       6.714286
     3.166667        7  12.6 0  6 1 "AfD"       9.428572
     6.666667        7  12.6 0  6 0 "AfD"       3.190476
     2.636364 8.315789  26.8 1 21 2 "CDUGE"     6.857143
     5.666667 8.315789  26.8 1 21 1 "CDUGE"     6.238095
    4.5833335 8.315789  26.8 1 21 0 "CDUGE"     6.476191
     3.636364 8.315789   6.2 1 21 2 "CSU"            6.6
          3.4 8.315789   6.2 1 21 1 "CSU"       7.619048
     3.090909 8.315789   6.2 1 21 0 "CSU"       6.523809
            .        .     . 0  6 2 "DieTier"          4
            3        .     . 0  6 1 "DieTier"          7
            2        .     . 0  6 0 "DieTier"        2.5
     3.727273 8.263158  10.7 0 21 2 "FDP"       5.666667
     3.083333 8.263158  10.7 0 21 1 "FDP"       5.047619
    1.8333334 8.263158  10.7 0 21 0 "FDP"            8.1
    1.4545455 4.842105   8.9 0 21 2 "GRUNEN"    7.333333
          1.5 4.842105   8.9 0 21 1 "GRUNEN"    8.476191
         3.75 4.842105   8.9 0 21 0 "GRUNEN"           5
     5.090909 6.055555   9.2 0 21 2 "LINKE"         4.85
    4.3333335 6.055555   9.2 0 21 1 "LINKE"     5.238095
    1.9166666 6.055555   9.2 0 21 0 "LINKE"     8.095238
            0      1.5     . 0  6 2 "Piraten"       5.25
    1.6666666      1.5     . 0  6 1 "Piraten"        8.6
     2.666667      1.5     . 0  6 0 "Piraten"        2.2
    2.3636363 5.578948  20.5 1 21 2 "SPD"       6.857143
    4.1666665 5.578948  20.5 1 21 1 "SPD"       5.476191
          5.5 5.578948  20.5 1 21 0 "SPD"       7.666667
          1.4 9.666667   3.7 0  1 2 "EL"           4.375
            1 9.666667   3.7 0  1 1 "EL"            8.25
     2.666667 9.666667   3.7 0  1 0 "EL"           4.375
          1.5 6.666667   .74 0  1 2 "KIDISO"    7.142857
            2 6.666667   .74 0  1 1 "KIDISO"        6.75
            2 6.666667   .74 0  1 0 "KIDISO"         7.2
    .16666667        9   5.3 0 21 2 "KKE"       6.222222
           .5        9   5.3 0 21 1 "KKE"           3.25
            0        9   5.3 0 21 0 "KKE"       9.111111
            1      5.5  3.44 0  1 2 "MR25"         7.625
    end
    The following is the model, output, and marginsplot of my pooled analysis:

    Code:
    eststo seven: reg y c.x1##c.x2 x3 x4 x5 i.country if x6!=2 & in_model_1==1, vce(cl x7)
    margins, dydx(x1) at(x2=(0(0.5)10))
    marginsplot, title("Average Marginal Effects of x1 on y (95% CIs)") xtitle("x2") ///
    addplot(histogram x2 if x6!=2, freq width(0.5) yaxis(2) yscale(alt axis(2)) fcolor(%25) lc(black%50))
    Code:
    Linear regression                               Number of obs     =        258
                                                    F(20, 129)        =      11.74
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.3411
                                                    Root MSE          =     1.2825
    
                                       (Std. err. adjusted for 130 clusters in x7)
    ------------------------------------------------------------------------------
                 |               Robust
               y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              x1 |   .0019245   .2214274     0.01   0.993     -.436175    .4400241
              x2 |   .0689307   .0806277     0.85   0.394    -.0905932    .2284545
                 |
       c.x1#c.x2 |  -.0763485   .0301067    -2.54   0.012    -.1359155   -.0167816
                 |
              x3 |   .0439528   .0079439     5.53   0.000     .0282355      .05967
              x4 |   .0707958    .146916     0.48   0.631    -.2198811    .3614728
              x5 |   .0015976   .0101892     0.16   0.876     -.018562    .0217572
                 |
         country |
          2. dk  |   .1339352   .2632911     0.51   0.612    -.3869927     .654863
          3. ge  |   .8146273   .2339428     3.48   0.001     .3517659    1.277489
          4. gr  |   .5158083   .3239174     1.59   0.114    -.1250702    1.156687
         5. esp  |   .0285887   .2713843     0.11   0.916    -.5083519    .5655292
          6. fr  |   .7092217   .2187707     3.24   0.002     .2763785    1.142065
         7. irl  |    -.68028   .2382757    -2.86   0.005    -1.151714   -.2088458
          8. it  |   .2901257   .4193663     0.69   0.490    -.5396007    1.119852
         10. nl  |   -.420152   .1813169    -2.32   0.022    -.7788919   -.0614122
         11. uk  |   -.787641   .2225044    -3.54   0.001    -1.227871   -.3474106
        12. por  |   .7449562   .3259486     2.29   0.024     .1000589    1.389854
        13. aus  |  -.8840575   .3138242    -2.82   0.006    -1.504966   -.2631485
        14. fin  |   .6545474   .1682645     3.89   0.000     .3216321    .9874628
         16. sv  |   .4814474    .216804     2.22   0.028     .0524953    .9103995
        38. lux  |  -1.392723   .4609494    -3.02   0.003    -2.304723   -.4807235
                 |
           _cons |   6.653327   .6476639    10.27   0.000     5.371908    7.934746
    ------------------------------------------------------------------------------
    [Figure: Graphstatalist1.png, marginsplot of the average marginal effects of x1 on y across x2 (pooled model)]


    Unfortunately, when I run the analysis separately for the two values of x6 I am interested in, the results do not add up. Neither interaction term is significant at the 5% level: for x6=0 the coefficient is in the expected direction but with p = 0.083, and for x6=1 it is close to zero (p = 0.921). How can these results be explained?

    Code:
    eststo seven2: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==0 & in_model_2==1
    margins, dydx(x1) at(x2=(0(0.5)10))
    marginsplot, title("Average Marginal Effects of x1 on y (95% CIs)") xtitle("x2") ///
    addplot(histogram x2 if x6==0 & in_model_2==1, freq width(0.5) yaxis(2) yscale(alt axis(2)) fcolor(%25) lc(black%50))
    
    eststo seven3: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==1 & in_model_3==1
    margins, dydx(x1) at(x2=(0(0.5)10))
    marginsplot, title("Average Marginal Effects of x1 on y (95% CIs)") xtitle("x2") ///
    addplot(histogram x2 if x6==1 & in_model_3==1, freq width(0.5) yaxis(2) yscale(alt axis(2)) fcolor(%25) lc(black%50))
    Code:
    eststo seven2: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==0 & in_model_2==1
          Source |       SS           df       MS      Number of obs   =       128
    -------------+----------------------------------   F(20, 107)      =      4.76
           Model |  150.651879        20  7.53259395   Prob > F        =    0.0000
        Residual |  169.297938       107  1.58222372   R-squared       =    0.4709
    -------------+----------------------------------   Adj R-squared   =    0.3720
           Total |  319.949817       127   2.5192899   Root MSE        =    1.2579
    
    ------------------------------------------------------------------------------
               y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              x1 |   .1074972   .3360307     0.32   0.750    -.5586445    .7736388
              x2 |   .0709244    .180704     0.39   0.695    -.2873001     .429149
                 |
       c.x1#c.x2 |    -.08677   .0495838    -1.75   0.083    -.1850642    .0115242
                 |
              x3 |   .0609723    .016123     3.78   0.000     .0290104    .0929342
              x4 |    .137091   .3109673     0.44   0.660    -.4793655    .7535475
              x5 |   .0312748   .0174638     1.79   0.076    -.0033452    .0658947
                 |
         country |
          2. dk  |  -.8865927   .5647997    -1.57   0.119    -2.006242    .2330568
          3. ge  |  -.4909954   .6170599    -0.80   0.428    -1.714245    .7322539
          4. gr  |   .0128348   .6165485     0.02   0.983    -1.209401     1.23507
         5. esp  |  -.3776846    .531973    -0.71   0.479    -1.432259    .6768899
          6. fr  |   .3147261   .5968196     0.53   0.599     -.868399    1.497851
         7. irl  |  -1.004265   .6580029    -1.53   0.130    -2.308679    .3001494
          8. it  |  -.3107438   .6299975    -0.49   0.623     -1.55964    .9381529
         10. nl  |  -1.235017   .5238642    -2.36   0.020    -2.273517   -.1965178
         11. uk  |   -2.06065   .6022026    -3.42   0.001    -3.254446   -.8668532
        12. por  |   .5106341   .6342881     0.81   0.423    -.7467681    1.768036
        13. aus  |   -1.83555    .699734    -2.62   0.010    -3.222691   -.4484088
        14. fin  |  -.5542002   .6047479    -0.92   0.362    -1.753042    .6446421
         16. sv  |  -.7864335   .5936533    -1.32   0.188    -1.963282     .390415
        38. lux  |  -1.986957   .6895806    -2.88   0.005     -3.35397   -.6199439
                 |
           _cons |   6.848582   1.376672     4.97   0.000      4.11949    9.577673
    ------------------------------------------------------------------------------
    eststo seven3: reg y c.x1##c.x2 x3 x4 x5 i.country if x6==1 & in_model_3==1
    
          Source |       SS           df       MS      Number of obs   =       130
    -------------+----------------------------------   F(20, 109)      =      5.44
           Model |  132.957861        20  6.64789305   Prob > F        =    0.0000
        Residual |   133.32071       109  1.22312578   R-squared       =    0.4993
    -------------+----------------------------------   Adj R-squared   =    0.4075
           Total |  266.278571       129  2.06417497   Root MSE        =     1.106
    
    ------------------------------------------------------------------------------
               y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              x1 |  -.6511678   .3113023    -2.09   0.039    -1.268159   -.0341768
              x2 |  -.0603839    .112382    -0.54   0.592    -.2831215    .1623536
                 |
       c.x1#c.x2 |   .0043718   .0439778     0.10   0.921    -.0827908    .0915343
                 |
              x3 |   .0330859   .0147282     2.25   0.027     .0038952    .0622767
              x4 |  -.0551796   .2737978    -0.20   0.841    -.5978379    .4874787
              x5 |  -.0272743    .014766    -1.85   0.067      -.05654    .0019914
                 |
         country |
          2. dk  |   1.199387   .4903972     2.45   0.016     .2274362    2.171339
          3. ge  |   2.207378     .54445     4.05   0.000     1.128296     3.28646
          4. gr  |   1.114636   .5351736     2.08   0.040     .0539394    2.175333
         5. esp  |   .4695195   .4680648     1.00   0.318    -.4581697    1.397209
          6. fr  |   1.346269   .5183206     2.60   0.011     .3189745    2.373564
         7. irl  |   -.153926   .5391264    -0.29   0.776    -1.222457     .914605
          8. it  |   1.005785   .5587891     1.80   0.075    -.1017168    2.113287
         10. nl  |        .46   .4622662     1.00   0.322    -.4561965    1.376197
         11. uk  |   .6379496   .5302325     1.20   0.232    -.4129541    1.688853
        12. por  |   1.022597   .5574282     1.83   0.069    -.0822071    2.127402
        13. aus  |  -.0046027   .6292805    -0.01   0.994    -1.251816    1.242611
        14. fin  |   2.056141   .5390641     3.81   0.000     .9877332    3.124548
         16. sv  |   1.785488   .5192244     3.44   0.001      .756402    2.814574
        38. lux  |  -.5289811   .6191384    -0.85   0.395    -1.756093     .698131
                 |
           _cons |   7.331186   .8890873     8.25   0.000     5.569044    9.093328
    ------------------------------------------------------------------------------
    [Figures: Graphstatalist2ec.png and Graphstatalist3cu.png, marginsplots of the average marginal effects of x1 on y across x2 for the two subsample models]

    Sincerely
    Mattia

  • #2
    Pooled versus unpooled is not the only difference between these models. You are also estimating them on different subsets of the data: the first on observations where x6 equals 0 or 1 (assuming x6 is never negative), the second on observations where x6 equals 0, and the third on observations where x6 equals 1. You further restrict the observations with the three in_model variables: in_model_1, in_model_2, and in_model_3. It could be that the subsets used for the last two models are too small to detect the interaction, or that the size of the effect depends on the subset of observations used. The information you've provided here is not enough to tell which. You might isolate these differences between the models and test them one at a time to see where the discrepancy emerges. For example, if you run the first model exactly as it is but without the vce() option at the end, does the interaction change?
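
    The check suggested above could be sketched as follows. This is only an illustration: it assumes the poster's full dataset is in memory under the variable names used in this thread, and pooled_sample, pooled_cl, and pooled_ols are hypothetical names introduced here.

    ```stata
    * Assumption: the full dataset from post #1 is in memory (y, x1-x7, country, in_model_1)

    * Pooled model exactly as posted, with cluster-robust standard errors
    regress y c.x1##c.x2 x3 x4 x5 i.country if x6 != 2 & in_model_1 == 1, vce(cluster x7)
    estimates store pooled_cl

    * Mark the estimation sample so the second fit uses the very same rows
    generate byte pooled_sample = e(sample)

    * Identical specification and sample, conventional standard errors
    regress y c.x1##c.x2 x3 x4 x5 i.country if pooled_sample
    estimates store pooled_ols

    * Side-by-side comparison of the terms of interest
    estimates table pooled_cl pooled_ols, keep(x1 x2 c.x1#c.x2) b(%9.4f) se(%9.4f) p(%9.3f)

    * How the pooled sample splits across the values of x6
    tabulate x6 if pooled_sample
    ```

    Because only the variance estimator changes, the coefficients in the two columns should be identical, and any difference is confined to the standard errors and p-values. The same one-change-at-a-time approach can be applied to the in_model conditions.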



    • #3
      Many of the coefficients are different (signs, sizes, significance), including the fixed effects.



      • #4
        Dear Daniel and George,


        To clarify, by "unpooled" I mean that I am comparing the first (pooled) model, estimated on observations where x6 equals 0 or 1, with the second and third models, where I subset the data by the value of x6 (either 0 or 1). x6 is a trichotomous variable, and x6=2 is excluded from the analysis.
        The in_model_* conditions are actually redundant in this context, as they do not change the number of observations (the n of every other model I run is based on the n presented here). Sorry for the additional confusion.


        Thanks for your answers. I think I see what you are saying. Could the results and statistical significance in the subsets thus reflect the smaller sample sizes? My initial understanding was the following: if an interaction effect exists and the effect of x1 on y is conditional on x2, then subsetting the sample on a meaningful variable (in this case x6) should give two interaction effects whose sum approximates the pooled interaction effect. Alternatively, I would have expected that if one of the two subsets returned a non-significant interaction coefficient (here x6=1), then the other subset (here x6=0) would return a much more statistically significant interaction coefficient to "offset" it, ultimately "making sense" of the results of the first model.
        In short, should I not expect that?

        Finally, I tried Daniel's suggestion and removed the vce() option, yet the confidence intervals and significance of the effect do not change much (still p<0.05).

        Sincerely
        Mattia





        • #5
          In the pooled model, you are imposing the constraint that all the coefficients are the same across x6. It is not uncommon for the coefficients to vary across subsamples (though you'd typically prefer they not).
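
          As a sketch of what that constraint implies: when the two subsamples use the same regressors and each separate regression is identified, the pooled OLS coefficient vector is a matrix-weighted average of the two subsample vectors, not their sum. The notation below is introduced here for illustration only: X_0 and X_1 are the regressor matrices for x6 = 0 and x6 = 1, and \hat\beta_0 and \hat\beta_1 are the corresponding subsample OLS estimates.

          ```latex
          \hat\beta_{\text{pooled}}
            = \left(X_0'X_0 + X_1'X_1\right)^{-1}
              \left(X_0'X_0\,\hat\beta_0 + X_1'X_1\,\hat\beta_1\right)
          ```

          Each subsample also has roughly half the observations of the pooled sample, so its standard errors will usually be larger. A significant pooled interaction can therefore coexist with two subsample interactions that are individually not significant.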

          If you just wanted to know whether x6 has an effect on the interaction, then you could interact x6 with x2*x1 (and maybe x1 too) in the pooled sample.
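
          One way this could look in Stata is sketched below. It assumes the poster's full dataset is in memory under the variable names from post #1, and it interacts x6 only with x1, x2, and their product, so the remaining coefficients are still constrained to be equal across x6.

          ```stata
          * Assumption: the full dataset from post #1 is in memory (y, x1-x7, country)

          * x6 as a factor; the three-way term lets the x1-by-x2 slope differ by x6
          regress y i.x6##c.x1##c.x2 x3 x4 x5 i.country if inlist(x6, 0, 1), vce(cluster x7)

          * Does the x1-by-x2 interaction differ between x6 == 0 and x6 == 1?
          test 1.x6#c.x1#c.x2

          * Marginal effect of x1 along x2, separately for each value of x6
          margins, dydx(x1) at(x2=(0(2)10) x6=(0 1))
          marginsplot, xdimension(x2) xtitle("x2")
          ```

          Interacting x6 with every covariate, including the country indicators, would instead reproduce the point estimates of the two separate regressions while still allowing a formal test of whether the coefficients differ.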



          • #6
            Dear George,

            Thanks a lot, all clear. I will give it a try.

            Sincerely
            Mattia
