  • Interpreting odds ratio interaction of continuous variables in Conditional Logit

    Dear all,

    I am struggling with the interpretation of an interaction odds ratio in a conditional logit.
    I am familiar with Stata tip 87 by Maarten Buis, which details the advantages of using odds ratios (OR) in non-linear models and gives an interpretation example using dichotomous variables.

    At the end of it, Maarten states: "However, the basic argument still holds when using continuous variables and when controlling variables are added. Moreover, the argument is not limited to results obtained from logit. It applies to all forms of multiplicative effects, and so, for example, to odds ratios from other models"

    I use odds ratios from a conditional logit (clogit in Stata, which is equivalent to xtlogit with the fe option once the data are xtset on the group variable).
    Let me give you the following example:

    Code:
    webuse restaurant, clear
    * Model 1: main effects only
    clogit chosen cost rating distance, group(family_id)
    outreg, or
    * Model 2: add the cost x income interaction
    clogit chosen cost c.cost#c.income rating distance, group(family_id)
    outreg, or merge
    which gives:
    Code:
                             ---------------------------------------
                                                 chosen    chosen  
                             ---------------------------------------
                              cost               0.857     0.764   
                                                (8.88)**  (8.83)**
                              rating             2.380     2.516   
                                                (8.84)**  (9.21)**
                              distance           0.918     0.914   
                                                 (1.95)   (2.02)*  
                              c.cost#c.income              1.002   
                                                          (5.23)**
                              N                  2,100     2,100   
                             ---------------------------------------
                                       * p<0.05; ** p<0.01

    Four things differ from Maarten Buis' example:
    - Control variables are added (rating, distance), but, as said earlier, this shouldn't be a problem.
    - All variables (including the interacted ones) are continuous.
    - No baseline can be added (because of the fixed effects of the conditional logit, see the code below).
    - Similarly, one of the interacted variables (income) cannot be added on its own to the model (no within-group variance).

    Code:
    webuse restaurant, clear
    * Adding a baseline (a constant, which has no within-group variance)
    gen byte baseline = 1
    clogit chosen cost c.cost#c.income rating distance baseline, group(family_id)
    outreg, or merge
    * Adding the main effect of income as well (## includes both main effects)
    clogit chosen c.cost##c.income rating distance, group(family_id)
    outreg, or merge
    Using the second model (the interacted one), I interpret cost's odds ratio as follows: "the odds of choosing a restaurant decrease by 23.6% (1 - 0.764) when the cost per person increases by 1".

    Yet I struggle to interpret the interaction term, whose odds ratio is above 1 and significant. The intuition would be that higher-income families go to costlier restaurants more often, but by how much?

    Any help would be appreciated,
    Thanks
    Charlie

  • #2
    You need to add something to your interpretation of cost: "the odds of choosing a restaurant decrease by 23.6% (1 - 0.764) when the cost per person increases by 1, if the household income is zero". To get a more meaningful effect, you may want to create a new variable that is household income centered at, say, 10 (the lowest observed value), the mean, or some other meaningful value within the range of the data, and add that instead of household income.
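
    A minimal sketch of the centering suggested above, assuming the restaurant data from post #1. The variable name income_c is made up for illustration, and centering at the sample mean is only one of the possible choices.

    ```stata
    webuse restaurant, clear

    * center income at its sample mean (hypothetical variable name income_c)
    summarize income, meanonly
    generate double income_c = income - r(mean)

    * the odds ratio of cost now refers to a household with mean income
    clogit chosen cost c.cost#c.income_c rating distance, group(family_id) or
    ```

    Replacing r(mean) with 10 would instead make the odds ratio of cost refer to the lowest observed income.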

    The interaction would be interpreted as follows: if household income increases by 1000 dollars (I presume household income is annual and in 1000s of dollars, but I don't know), then the odds ratio of cost increases by 0.2% (the negative effect of cost becomes weaker). That does not sound like a lot, but the range of household income is 10-126 (1000s? of dollars? per year?). Going from the lowest to the highest income would lead to an increase of the odds ratio of cost of (1.002^116 - 1)*100% = 26%. This is a computation where rounding errors become big very quickly, so it would be better to compute this as di (exp(_b[c.cost#c.income]*116)-1)*100, which results in a 31% increase in the odds ratio of cost.
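
    The two computations above can be sketched as follows, again assuming the restaurant data and the interaction model from post #1. The lincom line is an addition that is not part of the original answer: it should report the same ratio as the second display line, together with a confidence interval.

    ```stata
    * from the rounded odds ratio in the table: about 26
    display (1.002^116 - 1)*100

    * from the stored coefficient, which avoids the rounding error
    webuse restaurant, clear
    clogit chosen cost c.cost#c.income rating distance, group(family_id)
    display (exp(_b[c.cost#c.income]*116) - 1)*100

    * the same change expressed as a ratio of odds ratios, with a confidence interval
    lincom 116*c.cost#c.income, or
    ```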

    The trick with the variable baseline is no longer needed: if it is possible, Stata will include the baseline odds in the output. In clogit the baseline cannot be estimated, so we will have to do without it. Interpreting the baseline odds is a nice way to remind yourself and your audience what an odds is, but it is not necessary for interpreting the interaction effect.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    • #3
      Dear Maarten,

      Thank you for your fast and very useful answer.

      As for the units in the dataset, I don't know: this is an illustrative example I found in Cameron and Trivedi's Microeconometrics Using Stata, since my real data are confidential (and therefore on an offline server). But your answer was clear enough with the assumption you made.

      I would like to make sure of a few points:
      1- Specifying "if the household income is zero" is needed because the effect of income itself isn't reported. Let's assume income varies within groups (several observations over time) and we include it in the model: would the cost variable then be interpreted at the mean value of income? (This could happen in my real data.)
      Or, to put it differently: should odds ratios be interpreted at the mean of the control variables, and at zero for the remaining (non-included) group-level characteristics, such that I could say "the odds of choosing a restaurant decrease by 23.6% (1 - 0.764) when the cost per person increases by 1, if the household income is zero and the distance is 5.013 miles (mean)"?

      2- In di (exp(_b[c.cost#c.income]*116)-1)*100, the 116 value comes from the range of income (126-10), and the 1 comes from the definition of the odds ratio (it doesn't vary with income values, right?). This is a really nice trick to show the real impact of a change in one variable, thanks!

      3- Would it be correct to say that the baseline effect is captured by the fixed effects, so we can estimate the interaction effects without bias?

      4- One of my rapporteurs asked me to report marginal effects. Would interpreting these odds ratios and their interaction be enough? I argued that marginal effects are not well suited to interactions in non-linear models (citing your Stata tip and Ai & Norton (2003)). Do you advise me to report marginal effects? If yes, how would you do that? My attempts:

      Code:
      margins c.cost##c.income, expression(exp(predict(xb))) continuous
      margins cost##income, expression(exp(predict(xb))) conti
      returns respectively:
      only factor variables and their interactions are allowed
      cost: factor variables may not contain noninteger values

      So I think I'll stick to the interpretation of interaction terms odds ratio.

      Many thanks anyway, you've already answered the core of the question very well.
      Best,
      Charlie

      • #4
        1. This has nothing to do with the fact that the main effect of income is removed from the model (or actually, implicitly included in the model via the fixed effects). This is a characteristic of interaction terms: if you interact two variables x1 and x2, then the main effect of x1 is the effect of x1 when x2 is 0, and vice versa. You can make that more meaningful by centering x1 and x2 at meaningful values within the range of the data. The main effects are still the effects when the other variable is 0, but 0 now means something different and more meaningful. So if you want the main effect of x1 to be the effect of x1 when x2 is at its mean, you have to create a new variable x2c, which is x2 - mean(x2). That way x2c is 0 when x2 is at its mean.

        2. Correct

        3. Correct

        4. The interpretation in terms of odds ratios is a complete and accurate description of the model. In that sense it is enough. You also need to take into account whether your audience can understand your interpretation. In some disciplines odds ratios are common, so most of your audience will be sufficiently trained to understand the concept of an odds and the concept of a ratio. In other disciplines that is less the case. On a practical note: you can't get the marginal effect for the interaction term directly in Stata. You can do some trickery, but I am no fan of marginal effects anyhow, so I don't do that a lot and thus don't know a lot about it. The fact that you use a fixed-effects logit instead of a logit also results in some (manageable) complications, but, as before, you need to ask a marginal effects aficionado.
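
        Point 1 can also be checked without creating a centered variable. This is only a sketch, assuming the interaction model from post #1, in which the odds ratio of cost at a given income equals exp(_b[cost] + income*_b[c.cost#c.income]). The income values 10 and 126 are the ends of the range mentioned in post #2.

        ```stata
        webuse restaurant, clear
        clogit chosen cost c.cost#c.income rating distance, group(family_id)

        * odds ratio of cost when income is 0 (the reported "main effect")
        lincom cost, or

        * odds ratio of cost at the lowest and at the highest observed income
        lincom cost + 10*c.cost#c.income, or
        lincom cost + 126*c.cost#c.income, or
        ```

        The first line should reproduce the 0.764 from the table in post #1; the other two show how far the odds ratio of cost moves over the observed range of income.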
        ---------------------------------
        Maarten L. Buis
        ---------------------------------

        • #5
          OK, thank you again, Maarten.

          Thanks for the reminder in 1); I don't know why I was so confused, I used to know that...

          Great for the 2nd point: I'm going to use this range-variation illustration, it is very nice.

          I'll stick with the odds ratios but, thanks to you, give a proper and clear interpretation of them and of their interaction.
          My audience is certainly more familiar with marginal effects, but I'm sure they will understand if it is explained clearly enough. To do so, I first needed to understand it clearly myself.

          Thanks again,
          Best,
          Charlie
