  • Heterogeneous effects

    Dear Statalisters,

    I really hope you can give me your opinion.
    There are two common ways to study heterogeneous effects: either you split the sample and run a separate regression for each sub-population of interest, or you use the full sample and interact the treatment variable with the covariate that defines the sub-populations.

    Both approaches seem to be widely used in published papers, without any argument for why one would be preferred over the other.

    If you were a referee, would you consider one of them the better way to look at heterogeneous effects?

    Thank you.

  • #2
    The answer to your question is, not surprisingly: it depends. However, the first thing to note is that the difference is not that big. In essence (as far as the conditional mean is concerned), estimating two different models is the same as estimating one model with interaction terms for all explanatory/right-hand-side/x-variables. The only difference is that the two separate models allow the variance of the residuals to differ between groups. If you just use robust standard errors, then I would say there is no meaningful difference between the two models.
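To make the equivalence of the conditional means concrete, here is a minimal sketch in Python (standard library only) on simulated data. Everything in it is an assumption for illustration: the variable names `g`, `x`, `y`, the data-generating process, and the small `ols` helper. It fits one regression per group and one pooled regression in which the group dummy is interacted with every right-hand-side variable, then compares the implied coefficients.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (hypothetical): the intercept, the slope on x, and the
# residual variance all differ between the two groups.
random.seed(1)
rows = []
for _ in range(400):
    g = random.randint(0, 1)                 # group dummy
    x = random.gauss(0, 1)                   # treatment variable
    y = 1 + 2 * g + (0.5 + 1.0 * g) * x + random.gauss(0, 1 + g)
    rows.append((g, x, y))

def fit_group(grp):
    """Split-sample regression of y on a constant and x for one group."""
    sub = [(x, y) for g, x, y in rows if g == grp]
    return ols([[1.0, x] for x, _ in sub], [y for _, y in sub])

a0, b0 = fit_group(0)
a1, b1 = fit_group(1)

# Pooled, fully interacted model: y = c0 + c1*g + c2*x + c3*g*x.
c = ols([[1.0, g, x, g * x] for g, x, _ in rows], [y for _, _, y in rows])

print("slope, group 0:", b0, "vs", c[2])
print("slope, group 1:", b1, "vs", c[2] + c[3])
```

The point estimates agree up to rounding error. The sketch does not compute standard errors, which is where the two approaches can differ unless robust standard errors are used.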

    What interaction effects allow is the possibility of a more parsimonious model, by adding interactions for some but not all explanatory variables. In general, parsimony is good, but only when it is justified by the data. So you as an analyst have more options with an interaction model: with it you can reproduce the two separate models if necessary, or estimate a more parsimonious model when the data allow.
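The following sketch illustrates what can happen when the parsimonious model is not justified by the data. It is a simulation under assumed conditions, not an analysis of any real data set: the names `g`, `x`, `z`, the coefficients, and the `ols` helper are all hypothetical. The control `z` is correlated with the treatment `x`, and its effect differs by group. Interacting the group dummy with `x` only forces a common coefficient on `z`, which is one possible reason why split-sample and interaction estimates can diverge.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (hypothetical): the true effect of x is 1 in both groups,
# but the effect of the control z is +3 in group 0 and -3 in group 1.
random.seed(7)
rows = []
for _ in range(2000):
    g = random.randint(0, 1)
    z = random.gauss(0, 1)
    x = z + random.gauss(0, 1)               # x is correlated with z
    bz = 3.0 if g == 0 else -3.0
    y = 1.0 * x + bz * z + random.gauss(0, 1)
    rows.append((g, x, z, y))
y_all = [r[3] for r in rows]

def split_effect(grp):
    """Coefficient on x from a regression on one group only."""
    sub = [r for r in rows if r[0] == grp]
    return ols([[1.0, x, z] for _, x, z, _ in sub], [r[3] for r in sub])[1]

split0, split1 = split_effect(0), split_effect(1)

# Partial interaction: only x is interacted with g; z has one common slope.
partial = ols([[1.0, g, x, g * x, z] for g, x, z, _ in rows], y_all)
partial0, partial1 = partial[2], partial[2] + partial[3]

# Full interaction: both x and z are interacted with g.
full = ols([[1.0, g, x, g * x, z, g * z] for g, x, z, _ in rows], y_all)
full0, full1 = full[2], full[2] + full[3]

print("group 0:", split0, partial0, full0)
print("group 1:", split1, partial1, full1)
```

In this setup the fully interacted model reproduces the split-sample effects of `x`, while the partially interacted model does not, because the group-specific effect of `z` that the model rules out is partly absorbed by `x`.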

    In practice, the choice of model is probably mostly driven by what the authors find easier to present.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Maarten Buis Thank you so much for your answer.
      Just to be sure, does what you said also hold for non-linear models, when we use marginal effects to compare sub-populations?



      • #4
        The conditional mean part, yes. Heteroskedasticity is a much harder problem in non-linear models.


        • #5
          When I try both approaches (splitting the sample versus including the interaction), the estimates are very different. I thought they would produce broadly similar results, and I do not know the reason for the difference.
          I look at the effect of spousal labour supply on individuals' labour supply. When I estimate separate regressions for households with income above the median (group 1) and households with income below the median (group 2), group 1 responds much more strongly. When I instead use the full sample and include an interaction between a dummy (=1 for households with income above the median) and my main independent variable, group 1 responds much more weakly (its effect being the sum of the coefficient on the main independent variable and the coefficient on the interaction).
          I would appreciate any suggestions on this.
          Thanks



          • #6
            I want to test whether the estimates from the two separate regressions differ significantly when I split the sample, something like a Chow test. I use xtreg, fe, and as suggested by Clyde Schechter (https://www.statalist.org/forums/for...nel-regression), the Chow test is given by the coefficient on the interaction when I run the full sample with the interaction included. However, in my case, the estimates from the interaction model are the opposite of those from the separate regressions.

            I wonder what commands in Stata I can use to test the difference between the estimates when I run separate regressions.
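As a language-neutral illustration of the kind of test being asked about, the sketch below computes a simple Wald-type z statistic for the difference between two coefficients. It assumes the two sub-samples are independent, so that the variance of the difference is the sum of the two squared standard errors, and it uses a normal approximation. The function name `diff_z` and the numbers passed to it are hypothetical; this is not a Stata command and not the procedure from the linked post.

```python
import math

def diff_z(b1, se1, b2, se2):
    """z statistic and two-sided p-value for H0: b1 == b2.

    Assumes the two estimates come from independent samples, so that
    Var(b1 - b2) = se1**2 + se2**2, and uses a normal approximation.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail area
    return z, p

# Hypothetical coefficients and standard errors from two split-sample fits.
z, p = diff_z(1.0, 0.3, 0.2, 0.4)
print(f"z = {z:.3f}, p = {p:.3f}")
```

If the two regressions contain the same right-hand-side variables, the interaction coefficient in a fully interacted pooled model estimates the same difference, so the two routes should lead to similar conclusions.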



            • #7
              Always prefer a formal test of interaction to splitting the sample when doing subgroup analyses. Here's a paper that uses simulation to show the high false-positive risk of subsetting your data to do subgroup analyses.
