  • testing difference between logit coefficients

    Hello,

    I am currently writing a paper for my undergraduate degree. As part of it, we fit a logit regression with the same variables to two different groups and used -suest- to test the difference between the coefficients. However, as I understand it, the raw coefficients of a logit regression are meaningless (because it is a latent-variable model). Therefore, we calculated the average partial effect (APE) for each variable.

    If the raw logit coefficients are meaningless, does that impair my ability to test the statistical difference between them?


    Thank you.
    Last edited by Merav Steinfeld; 19 Sep 2021, 02:34.

  • #2
    The coefficients of the logit and probit are not meaningless. I do not know where you read or who told you that they are meaningless.

    If -suest- works post logit, then what you have done is the correct way to compare coefficients.



    • #3
      First of all, thank you for the reply.
      Second, I may have explained myself inaccurately. I have been told that the coefficient values refer to a latent variable, so the only things I can read off a logit regression are the sign (+/-) of each coefficient and its significance. I have also been told that if I am interested in the true value of a coefficient, I need to perform additional statistical procedures.

      I ran the test of the difference between the coefficients as usual through -suest-. Still, because I was told that the coefficient values are inaccurate, I did not know whether the test is valid in such a case.



      • #4
        With respect, it sounds like either (1) you may have been the recipient of some bad, or at least confusing, information; or (2) a lot has gotten lost in translation. If you have time, either now or later, I would strongly encourage spending some of it with a good text on logistic (or probit) regression so that you get a better understanding both of the model itself and of how to interpret the coefficients.

        @Joro Kolev answered your question by confirming that -suest- is an appropriate way to perform such comparisons; another would be to fit the model to both groups simultaneously and include interaction terms between the group indicator and the other covariates. But you are right to ask what is being tested with such a procedure. Specifically, you are testing whether the model coefficients on the logit scale are the same when the model is fit to both groups. Much, if not most, of the time, this is not a very interesting (null) hypothesis; it is often highly unlikely that the coefficients would be exactly the same across two groups in a population (assuming these are observational data). What matters more is how much the coefficients differ relative to the standard error of the difference. Thus, plotting those differences and their standard errors (or CIs) would be more informative. If you want, you might spend some time learning about the problems and limitations with null hypothesis significance testing—it has its place but is grossly overused.
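
        To make "how much the coefficients differ relative to the standard error of the difference" concrete, here is a minimal Python sketch of a Wald z-test for two coefficients estimated on separate groups. It assumes the two estimates are independent (disjoint samples), so the variance of the difference is the sum of the two squared standard errors. The function name and all numbers are hypothetical, and this is not a reproduction of what -suest- does internally; -suest- reports robust standard errors, so its numbers will generally differ from those of two plain -logit- fits.

```python
import math

def wald_difference_test(b1, se1, b2, se2):
    """Wald z-test of H0: b1 == b2 for two independent estimates.

    Assumes the estimates come from disjoint samples, so their
    covariance is zero and Var(b1 - b2) = se1^2 + se2^2.
    """
    diff = b1 - b2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    # Approximate 95% confidence interval for the difference.
    ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)
    return diff, se_diff, z, p_value, ci

# Hypothetical logit coefficients and standard errors for two groups.
diff, se_diff, z, p_value, ci = wald_difference_test(0.80, 0.30, 0.20, 0.40)
print(f"difference = {diff:.2f}, SE = {se_diff:.2f}, z = {z:.2f}, p = {p_value:.3f}")
print(f"95% CI for the difference: ({ci[0]:.2f}, {ci[1]:.2f})")
```

        With these made-up inputs the difference of 0.60 has a standard error of 0.50, so the interval is wide and includes zero. Reporting the difference with its interval says more than the p-value alone.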

        Logistic regression (and probit regression) can be developed/motivated in two ways. One of those ways is via the concept of an underlying (continuous) tolerance distribution that is shifted up or down; in the case of logistic regression that is the logistic distribution, whereas in the case of probit regression it is the Normal distribution. We don't observe the actual values from the distribution (you might call them latent), but rather only whether they exceed a certain cutoff (the resulting outcome is thus coded 1 if the latent value exceeds the cutoff or 0 otherwise). In this case, the raw coefficients from the logit model are direct analogues to what you would estimate if you were lucky enough to observe the "true" (latent) values and regress those on the covariates. So as @Joro Kolev said, the coefficients are certainly not meaningless.
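
        The latent-variable motivation can be checked with a small simulation. The sketch below, in plain Python, draws a latent outcome with standard logistic errors, keeps only whether it exceeds a cutoff of zero, and then fits a logit by Newton-Raphson to the resulting 0/1 data. The true values (-0.5 and 1.0), the sample size, and the seed are arbitrary choices for illustration, and the hand-rolled fitter stands in for what a command such as -logit- would do.

```python
import math
import random

def sigmoid(t):
    """Logistic CDF, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def simulate_latent(n, b0, b1, seed=12345):
    """Draw x, build latent y* = b0 + b1*x + e, and observe only y = 1[y* > 0]."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Standard logistic error via the inverse CDF (u kept away from 0 and 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        e = math.log(u / (1.0 - u))
        y_star = b0 + b1 * x + e
        xs.append(x)
        ys.append(1 if y_star > 0 else 0)
    return xs, ys

def fit_logit(xs, ys, max_iter=25, tol=1e-10):
    """Maximum-likelihood logit with one covariate, by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # negative Hessian (information matrix)
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            r = y - p
            w = p * (1.0 - p)
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

true_b0, true_b1 = -0.5, 1.0
xs, ys = simulate_latent(20000, true_b0, true_b1)
b0_hat, b1_hat = fit_logit(xs, ys)
print(f"true:      b0 = {true_b0}, b1 = {true_b1}")
print(f"estimated: b0 = {b0_hat:.3f}, b1 = {b1_hat:.3f}")
```

        The fitted coefficients should land close to the values used to generate the latent outcome, even though the latent values themselves were never given to the fitter.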

        Alternatively, you can develop the logistic regression model without referring to an underlying tolerance distribution; in that case, the exponentiated values of the coefficients may be interpreted as odds ratios. Some folks find it difficult to think in terms of odds, so for this reason (and others) we often compute various types of model summaries such as the APE. If that's what you're doing, then you may want to compare your two groups in terms of this particular summary statistic instead of the raw coefficients. You may do this via the -margins- command—both after -suest- and after the interaction term approach.
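
        As a rough illustration of these two summaries, the Python sketch below computes an odds ratio and an APE for a continuous covariate from given logit coefficients. The APE here is the sample average of b1 * p * (1 - p), which is the derivative of the predicted probability with respect to x. The coefficients and covariate values are invented, and the function names are hypothetical; in Stata, -margins- with the dydx() option would be the usual route to the same kind of quantity.

```python
import math

def logistic(t):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-t))

def odds_ratio(b):
    """Odds ratio for a one-unit increase in the covariate."""
    return math.exp(b)

def average_partial_effect(b0, b1, xs):
    """Average over the sample of dP(y=1|x)/dx = b1 * p * (1 - p)."""
    total = 0.0
    for x in xs:
        p = logistic(b0 + b1 * x)
        total += b1 * p * (1.0 - p)
    return total / len(xs)

# Hypothetical groups: same slope on the logit scale, different intercepts.
xs = [-1.0, 0.0, 1.0]
ape_a = average_partial_effect(0.0, 0.5, xs)   # baseline probability near 0.5
ape_b = average_partial_effect(2.0, 0.5, xs)   # baseline probability near 0.9

print(f"odds ratio in both groups: {odds_ratio(0.5):.3f}")
print(f"APE, group A: {ape_a:.4f}")
print(f"APE, group B: {ape_b:.4f}")
```

        In this made-up example both groups share the same logit coefficient, and hence the same odds ratio, yet their APEs differ because the groups sit on different parts of the probability curve. A test of equal coefficients and a test of equal APEs therefore address different hypotheses.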

        Finally, there is an issue that arises when comparing coefficients from a logit (or probit) model across groups; namely, differences in the coefficients may reflect differences in the amount of residual variation between groups (i.e., differences in the variance of the underlying distribution). This results from the fact that, unlike in a linear regression, where the residual variance is estimated, in a logistic regression fit to binary data the residual variance is not separately identified: it is absorbed into the scale of the coefficients themselves. It is possible (though probably not likely) that this is part of what is motivating your concern/question about directly comparing coefficients across groups. If so, @Richard Williams has a nice paper on this that may help (https://journals.sagepub.com/doi/10....49124109335735).
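
        The scale issue can be seen in a few lines of Python. The sketch assumes the latent model y* = b0 + b1*x + sigma*e, with e standard logistic and y = 1 when y* > 0; the parameter sigma and the function names are introduced here purely for illustration. Under that assumption P(y = 1 | x) equals the logistic CDF evaluated at (b0 + b1*x)/sigma, so a logit fit can only recover b0/sigma and b1/sigma.

```python
import math

def logistic_cdf(t):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-t))

def prob_from_latent(b0, b1, x, sigma):
    """P(y* > 0) when y* = b0 + b1*x + sigma*e and e is standard logistic."""
    return logistic_cdf((b0 + b1 * x) / sigma)

def implied_logit_coefs(b0, b1, sigma):
    """Coefficients a logit model would target: the latent ones divided by sigma."""
    return b0 / sigma, b1 / sigma

# Hypothetical groups with the SAME latent effect of x but different residual scale.
latent_b0, latent_b1 = 0.0, 1.0
coefs_a = implied_logit_coefs(latent_b0, latent_b1, sigma=1.0)
coefs_b = implied_logit_coefs(latent_b0, latent_b1, sigma=2.0)
print("group A logit coefficients:", coefs_a)
print("group B logit coefficients:", coefs_b)

# The rescaled coefficients reproduce the latent model's probabilities exactly.
x = 1.5
p_latent = prob_from_latent(latent_b0, latent_b1, x, sigma=2.0)
p_logit = logistic_cdf(coefs_b[0] + coefs_b[1] * x)
print(f"P(y=1 | x={x}) from latent model: {p_latent:.4f}, from rescaled logit: {p_logit:.4f}")
```

        Here the effect of x on the latent scale is identical in both groups, yet the logit slope in group B is half that in group A solely because its residual scale is twice as large. A significant difference in raw coefficients therefore cannot, on its own, distinguish a different effect from a different amount of residual variation.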



        • #5
          My guess is that you are aware of the problems with comparing logistic regression coefficients, as pointed out, for example, here (https://academic.oup.com/esr/article...26/1/67/540767). This paper also shows how and when coefficients can be compared (especially across nested regressions).
          Best wishes

          (Stata 16.1 MP)

