
  • Margins versus marginal effects significance

    Hi, please let me know if these questions are too simple for this forum.

    I have 3 waves of data across 25 years for participants with a history of childhood abuse (yes/no). Mental disorder (yes/no) was recorded at each of the three data waves. I want to know whether the risk of receiving a diagnosis changes with age. I have estimated a mixed-effects Poisson model:

    mepoisson diagnosis c.age i.abuse c.year || id:,irr

    When I estimate margins and plot this:
    margins abuse, at(age=(20(5)85)) vsquish
    marginsplot

    I have a graph showing the marginal predicted means for the two groups, those who have experienced abuse and those who have not. At the younger and older ages, the error bars overlap.

    However, when I use:
    margins, dydx(abuse) at(age=(20(5)85)) vsquish
    The results indicate that the difference between the groups is significantly different from 0 at every age except for the very old.

    I understand that the estimated marginal effect (dydx) is the difference between the margins for each group at each age.

    1) How do I present these results? Do I say that the difference between the groups is significantly different at all age groups until 80 or do I present the marginal effects for both groups, which suggests that the differences between the groups are not significant at age 20, 25, and over 60?

    2) How do I describe these differences (the only examples I can find are for logistic regression).

    Do I say that at 1, the risk of mental disorder is 12% higher for people who have a history of abuse than for people who do not have a history of abuse?

                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      0.anyabuse |  (base outcome)
    -------------+----------------------------------------------------------------
      1.anyabuse |
             _at |
               1 |   .1174703   .0375274     3.13   0.002     .0439179    .1910227
               2 |   .1113169    .032141     3.46   0.001     .0483217    .1743122
               3 |   .1054859   .0276195     3.82   0.000     .0513527    .1596191

  • #2
    1) How do I present these results? Do I say that the difference between the groups is significantly different at all age groups until 80 or do I present the marginal effects for both groups, which suggests that the differences between the groups are not significant at age 20, 25, and over 60?
    You're confusing me. Just a bit earlier you said that the marginal effects show a significant difference at all ages except the very old. You're contradicting yourself here.
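One possible source of the apparent contradiction: overlapping error bars on two predicted margins do not by themselves mean that the difference between them is non-significant. The sketch below illustrates this with made-up numbers (the estimates and standard errors are hypothetical, not taken from this thread) and assumes the two estimates are independent. Margins from a single fitted model are usually correlated, so -margins, dydx()- uses the full covariance and its interval can be narrower still.

```python
import math

Z95 = 1.96  # normal critical value for a 95% interval


def interval(est, se):
    """95% confidence interval for a single estimate."""
    return est - Z95 * se, est + Z95 * se


def difference_interval(est_a, se_a, est_b, se_b):
    """95% interval for est_b - est_a, ASSUMING the estimates are independent."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    diff = est_b - est_a
    return diff - Z95 * se_diff, diff + Z95 * se_diff


# Hypothetical margins and standard errors (illustration only).
no_abuse = (0.08, 0.025)
abuse = (0.17, 0.030)

ci_no = interval(*no_abuse)
ci_yes = interval(*abuse)

# The upper bar of one group reaches above the lower bar of the other.
bars_overlap = ci_no[1] > ci_yes[0]

# Yet the interval for the difference lies entirely above zero.
diff_lo, diff_hi = difference_interval(*no_abuse, *abuse)
difference_excludes_zero = diff_lo > 0

print("Error bars overlap:", bars_overlap)
print("CI for the difference excludes 0:", difference_excludes_zero)
```

With these numbers both lines print True, which is the pattern described in #1: overlapping bars in the plot, yet a difference that is distinguishable from zero.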

    In any case, in situations like this, what I like to do is present the predicted probabilities at each age group in both those with and without abuse (output of -margins-), and then add to that table the marginal effects (which you properly note are the differences between the groups) along with the confidence intervals for the marginal effects (output of -margins, dydx()-). (If you have enough space in your table for the CIs of the predicted probabilities themselves, that's nice, but less crucial, in my view.) I tend to work with marginal effects and confidence intervals and ignore p-values. But if in your milieu it is mandatory to show p-values, then you can use the p-values from the -margins, dydx()- output as well.

    If you were only to present the marginal effects, people would wonder if a 12 percentage point change is an increase from 1% to 13% or from 50% to 62%, or what. By presenting both the predicted probabilities and the marginal effects you give people a better sense of what's going on.

    It's also important to remember that in a non-linear model like Poisson, the marginal effects change with the predicted values. The marginal effects you calculated are average marginal effects, and you should clearly identify them as such in your tables, graphs, presentations, etc. Average marginal effects may well be the most useful marginal effect statistic in some contexts. In other contexts, marginal effects calculated at specific values of the other covariates might be more important. That's a scientific context issue that you have to decide for yourself. What questions is your audience interested in? What results do you have that help answer those questions?
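To make the distinction between an average marginal effect and a marginal effect at specific covariate values concrete, here is a minimal sketch for a log-link model. The coefficients and the small sample of covariate patterns are hypothetical (not estimated from any data), and the random intercept of a mixed model is ignored for simplicity.

```python
import math

# Hypothetical coefficients on the log scale (illustration only).
B0, B_ABUSE, B_VIOLENCE, B_NEGLECT = -3.0, 0.9, 0.5, 0.4


def predicted(abuse, violence, neglect):
    """Predicted mean from a log-link model, fixed effects only."""
    return math.exp(B0 + B_ABUSE * abuse + B_VIOLENCE * violence + B_NEGLECT * neglect)


def abuse_effect(violence, neglect):
    """Discrete change in the prediction when abuse goes from 0 to 1."""
    return predicted(1, violence, neglect) - predicted(0, violence, neglect)


# Hypothetical sample of (violence, neglect) patterns.
sample = [(0, 0), (0, 0), (1, 0), (0, 1), (1, 1)]

# Average marginal effect: compute the effect for every observation, then average.
ame = sum(abuse_effect(v, n) for v, n in sample) / len(sample)

# Marginal effects at specific covariate values.
effect_at_zero = abuse_effect(0, 0)
effect_at_one = abuse_effect(1, 1)

print(f"effect at violence=0, neglect=0: {effect_at_zero:.4f}")
print(f"average marginal effect:         {ame:.4f}")
print(f"effect at violence=1, neglect=1: {effect_at_one:.4f}")
```

Because the coefficients here are positive, the effect of abuse is smallest when the other risk factors are absent and largest when both are present, with the average falling in between. Which of these to report depends on the question being asked.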

    2) How do I describe these differences (the only examples I can find are for logistic regression).

    Do I say that at 1, the risk of mental disorder is 12% higher for people who have a history of abuse than for people who do not have a history of abuse?
    No. You say that on average the risk of mental disorder diagnosis is 12 percentage points higher for people who have a history of abuse...
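The difference between "percent" and "percentage points" is simple arithmetic. The sketch below uses a hypothetical helper, `describe`, to show how the same 12-point gap corresponds to very different relative changes depending on the baseline.

```python
def describe(p0, p1):
    """Return (difference in percentage points, relative change in percent)."""
    points = (p1 - p0) * 100
    relative = (p1 - p0) / p0 * 100
    return round(points, 1), round(relative, 1)


low_baseline = describe(0.01, 0.13)   # from 1% to 13%
high_baseline = describe(0.50, 0.62)  # from 50% to 62%

print("low baseline: ", low_baseline)
print("high baseline:", high_baseline)
```

Both cases are a 12 percentage point increase, but the first is a 1200% relative increase and the second only 24%. A marginal effect from -margins, dydx()- is on the absolute (percentage point) scale.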



    • #3
      Thank you Clyde, that was extremely helpful.

      I have copied two lines of the table I produced using,

      mepoisson diag c.age i.abuse i.violence i.neglect c.year || id:, irr
      margins abuse, at(age=(20(10)85))
      margins, dydx(abuse) at(age=(20(10)85))
             Predicted probabilities of diagnosis   Average marginal   AME 95% confidence
      Age    No abuse     Abuse                     effect (AME)       interval
      20     .08          .21                       .13                .05 - .22
      80     .03          .09                       .06                .01 - .10

      1) Am I correct in interpreting this as:
      Including the average influence of domestic violence and neglect, at age 20 there is an 8% risk of having a diagnosis of a mental disorder if you have not experienced childhood abuse, compared with a 21% risk if you have experienced abuse. On average, for women who have a history of abuse, compared with women who do not have a history of abuse, the risk of a diagnosis of a mental disorder is 13 percentage points higher at age 20, whereas it is only 6 percentage points higher at age 80.

      2) You noted that marginal effects at specific values may be of interest; this was also a helpful comment.

      If I instead calculate the predicted marginal effects at 0 for the binary variables family violence and neglect:

      margins abuse, at(age=(20(10)85) violence=0 neglect = 0)

      Do I interpret this as: at age 20, on average, women who have a history of abuse have an x% risk of having a diagnosis over and above the effects of family violence and neglect?

      2) Can I ask one additional question? I am interested in whether the association between abuse and diagnosis changes over the life course, so I spent some time considering whether I should have included an interaction term in my model [mepoisson diag c.age##i.abuse i.var2 i.var3 c.year || id:] and then used that model to calculate the marginal effects. I wasn't clear on how to interpret this (and the results were very similar), so I did not include the interaction. Is there a reason that I should include the interaction term and then go on to calculate the marginal effects of abuse at different ages as above?

      Your help is very much appreciated.



      • #4
        I agree with your interpretation in 1).

        For your first 2), this is not exactly right. First of all, you will get results for women who have experienced abuse and (separate) results for women who have not. And the results would not be "over and above" effects of violence and neglect. They would be effects that apply to a population of women who have experienced neither violence nor neglect. If you want estimates of effects "over and above" those of violence and neglect, I would think you would specify violence = 1 and neglect = 1 in your -at()- option.

        For your second 2), this is actually somewhat complicated. Because the logistic model is non-linear, when we look at probabilities of outcomes, it is typical to find this kind of situation. For example, the results you show above, in probability metric, show an interaction between age and abuse. This despite the absence of any interaction term in your model (which would be the equivalent of having an interaction coefficient of zero.) Now some people don't consider this type of "interaction" real. Others do. You might want to look at the very closely related discussion starting at #23 in https://www.statalist.org/forums/for...it-model/page2.

        To summarize that discussion, if you take the logistic model literally and give primacy to the odds ratio metric, these probability differences in differences are not "real" interactions. If you give primacy to the outcome probabilities and view the logistic model as a convenience for calculating them that avoids some of the awkwardness that can arise with linear probability models, then the interaction coefficient is beside the point.

        My practice is to include the interaction term in the logistic model, but to interpret the presence or absence of interactions on the basis of the outcome probabilities. I do this on the theory that the model should fit better when given this additional flexibility, and thus I should get more accurate estimates of the predicted outcome probabilities. It is, however, quite common to observe, as you have in your data, that the interaction term makes only a small difference to these results.
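The same point can be sketched for a log-link model like the Poisson model fitted in #3: with no interaction term at all, the ratio between the abuse and no-abuse predictions is constant across age, while the difference between them is not. The coefficients below are hypothetical, chosen only to give predictions of roughly the magnitude shown in #3. They are not estimates from the poster's data, and the random intercept is ignored.

```python
import math

# Hypothetical log-scale coefficients; note there is NO age-by-abuse interaction.
B0, B_AGE, B_ABUSE = -2.2, -0.016, 1.0


def predicted(age, abuse):
    """Predicted mean from a log-link model, fixed effects only."""
    return math.exp(B0 + B_AGE * age + B_ABUSE * abuse)


def difference(age):
    """Abuse minus no-abuse prediction at a given age (outcome metric)."""
    return predicted(age, 1) - predicted(age, 0)


def ratio(age):
    """Abuse over no-abuse prediction at a given age (ratio metric)."""
    return predicted(age, 1) / predicted(age, 0)


for age in (20, 80):
    print(f"age {age}: difference = {difference(age):.3f}, ratio = {ratio(age):.3f}")
```

The ratio equals exp(B_ABUSE) at every age, so on the multiplicative scale there is no interaction. The difference shrinks as age increases because the baseline prediction falls, which is the "interaction" seen in the outcome metric.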



        • #5
          Thank you, your explanations here and in the previous discussion are invaluable.

          One more concern that I now have is, should I use pwcompare - does this test whether the marginal effect at 20 is significantly different from the marginal effect at 30 through 80?

          mepoisson diag c.age i.abuse i.violence i.neglect c.year || id:, irr
          margins abuse, at(age=(20(10)85)) pwcompare

          And whether the DID is different between 20 and 30, 40, 50...

          margins, dydx(abuse) at(age=(20(10)85)) pwcompare

          I concluded above that, on average, for women who have a history of abuse, compared with women who do not have a history of abuse, the risk of a diagnosis of a mental disorder is 13 percentage points higher at age 20, whereas it is only 6 percentage points higher at age 80. Is it relevant/appropriate to ask if 13 percentage points is significantly different from 6 percentage points, and is this what I am estimating with pwcompare?



          • #6
            The "pw" in -pwcompare- means pairwise. So you will get 21 contrasts, one for each unordered pair of age values selected among (20(10)85). You will not get an estimate for 30-85 as a single combined age group, nor any contrast of that with 20. But for your final question in #5, one of the 21 outputs will be a contrast of the marginal effect of abuse at age 20 with the marginal effect of abuse at age 80. The results will show you whether the difference is statistically significant. (Actually, you will get the differences in marginal effects between each of the 21 pairs of ages along with confidence intervals. If you want to see the p-values directly, use the -effects- suboption of the -pwcompare- option.) Whether that is a relevant/appropriate question to ask depends on what your research goals are, so I cannot comment on that.
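The count of 21 can be checked by enumerating the pairs. In this short sketch, `range(20, 86, 10)` mirrors the Stata numlist 20(10)85, which stops at 80 because the next step, 90, would exceed 85.

```python
from itertools import combinations

# Ages implied by the numlist 20(10)85: 20, 30, ..., 80.
ages = list(range(20, 86, 10))

# Every unordered pair of ages, i.e. one pairwise contrast each.
pairs = list(combinations(ages, 2))

print(ages)        # [20, 30, 40, 50, 60, 70, 80]
print(len(pairs))  # 21
print((20, 80) in pairs)  # True: the age-20 versus age-80 contrast is included
```

With 7 age values there are 7 × 6 / 2 = 21 contrasts, so some thought about multiple comparisons may be warranted if only one or two of them are of real interest.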
