
  • Interpret coefficients measured in percent on binary dependent variable in panel

    Hello Statalist members,

    Reading the forum has helped me a lot in the last few weeks, but I cannot find an answer to what I thought was a simple question.


    I’m running xtreg, re on a panel with a binary dependent variable. The dummy is 1 for default in the financial crisis and 0 for survival. My two explanatory variables are both capital ratios measured in percent.

    Now, how do I interpret the coefficients? For example with a coefficient like -0.20.


    Kind regards
    Fabian

    Last edited by Fabian Hagel; 05 Feb 2020, 09:30.

  • #2
    Since you are running a linear regression, the coefficients are marginal effects. So if your outcome variable is called Y and your variable X has a coefficient of -0.20 this means that the expected rate of change of Y per unit change in X is -0.20. In particular, if X is also a dichotomous variable, the expected difference between Y when X = 1 and Y when X = 0 is -0.20.
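To illustrate the last point in #2, here is a minimal sketch on hypothetical toy data (not from this thread). It uses a plain pooled OLS slope rather than xtreg, re, so it only shows the general idea: with a 0/1 outcome and a 0/1 regressor, the linear regression coefficient equals the difference in the proportion of Y = 1 between the two groups.

```python
# Hypothetical toy data, assumed for illustration only:
# y is a 0/1 outcome, x is a 0/1 regressor.
x = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]


def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


slope = ols_slope(x, y)

# Share of y = 1 within each group of x.
p1 = mean([b for a, b in zip(x, y) if a == 1])
p0 = mean([b for a, b in zip(x, y) if a == 0])
diff_in_proportions = p1 - p0

print(round(slope, 4), round(diff_in_proportions, 4))
```

Both printed numbers agree: the slope is the expected difference in Y between X = 1 and X = 0, expressed on the probability scale.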



    • #3
      Hello Clyde,

      thank you for the fast reply. So I guess "marginal effect" was the keyword missing in my search.

      So the correct interpretation would be: A 1 percentage point increase in X reduces the probability that y=1 by 20 percentage points?



      • #4
        The use of "1 percentage point increase in X" implies that X is, itself, a percentage. If that's correct, then your interpretation in #3 is, indeed, correct, with one reservation. The verb "reduces" implies a causal relationship. If this is observational data, you should be very cautious in speaking causally about your results.



        • #5
          Alright, I'm aware of the reservation. Thank you a lot!

          Kind regards
          Fabian



          • #6
            Hi!

            I have a similar issue: I have a linear regression model with a dummy dependent variable (y). The independent variable is a scale from 1 to 4 (x).
            The model returns the coefficient .27. It seems like the same as the above example, except X is not in percentages.

            Would it be correct to say: A 1 point increase in X increases the likelihood of Y by 27%?

            Thanks for all the great help at the forum!



            • #7
              No. A 1 point increase in X is associated with a 27 percentage point increase in the probability of Y.

              Your original phrasing "increases" is wrong because it implies causality. "27%" is wrong because that implies a multiplicative change, whereas the model is showing you an additive change. And "likelihood" is wrong because it is the probability of Y that is being modeled. The term "likelihood" refers to something else that is usually not even relevant in ordinary least squares linear regression.
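The difference between "27 percentage points" and "27%" can be made concrete with a small sketch. The baseline probability of 0.40 is a hypothetical value chosen only for illustration; the coefficient 0.27 is the one from #6.

```python
coef = 0.27      # coefficient reported in #6
baseline = 0.40  # hypothetical baseline probability, assumed for illustration

# Additive change: what the linear model actually implies.
additive = baseline + coef

# Multiplicative change: what "increases by 27%" would mean instead.
multiplicative = baseline * (1 + coef)

# The additive change re-expressed as a relative change from this baseline.
relative_change_pct = 100 * coef / baseline

print(round(additive, 3), round(multiplicative, 3), round(relative_change_pct, 1))
```

Under this assumed baseline, a 27 percentage point increase takes the probability from 0.40 to 0.67, which is a much larger move than a 27% relative increase (0.40 to about 0.51).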



              • #8
                So close but still so far away! Thank you very much – that helped a lot!
                My understanding was sort of right, but I failed to capture the important nuances in the choice of words.

                Thanks again!



                • #9
                  Hello, I have a similar regression coefficient interpretation question: I'm running a DID event study type model, where the treatment indicator is a binary variable equal to 1 for the treatment group and 0 for the control group. Some of my outcomes are indicator variables too, whereas some are logged values (income and wages). How would I interpret the coefficients?

                  For example, if the model returns a coefficient of -0.3 when log of wages is the outcome variable, can I say that the treated individuals experience 30% lower wages post-treatment, compared to those in the control group? Alternatively, if the model returns a coefficient of 0.01 when the dependent variable is a binary variable (=1 if recipient of welfare support), can I say that individuals in the treatment group have a 1 percentage point higher usage of welfare support post-treatment, compared to those in the control group?

                  Thank you.



                  • #10
                    "For example, if the model returns a coefficient of -0.3 when log of wages is the outcome variable, can I say that the treated individuals experience 30% lower wages post-treatment, compared to those in the control group?"
                    No, you can't say that because 0.3 is too large in magnitude to apply the approximation involved in that rule of thumb.
                    With a coefficient of -0.3, it follows that the expected value of log wages is 0.3 less in the treatment group than in the untreated group. In equations, this is:
                    Code:
                    log(wages) | treated = log(wages) | untreated - 0.3  in expectation.
                    
                    Exponentiating both sides:
                    exp(log(wages)|treated) = exp(log(wages)|untreated - 0.3)
                    
                    Exponential of a difference is the quotient of the exponentials:
                    exp(log(wages)|treated) = exp(log(wages)|untreated) / exp(0.3)
                    
                    Now, exp(.3) = 1.35 (to 2 decimal places).  And exp(log(anything)) == anything.
                    
                    So:
                    wages | treated = wages | untreated / 1.35 = 0.74 * wages | untreated.
                    
                    Or, expressed as a percentage difference,
                    wages|treated are 26% lower than wages|untreated.  (100%-74% = 26%)
                    That rule of thumb about multiplying the coefficient by 100 to get the percent change only works well when the coefficient is of small magnitude. It's a decent approximation for coefficients of magnitude < 0.1. It is not great between .1 and .2, and really not usable at all for anything bigger.
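The derivation above amounts to the exact formula 100 * (exp(b) - 1) for the percent change implied by a coefficient b on a logged outcome. A short sketch comparing it with the "multiply by 100" rule of thumb, using a few illustrative coefficient values (only -0.3 comes from this thread):

```python
import math


def exact_pct_change(b):
    """Exact percent change in the outcome implied by a coefficient b on log(outcome)."""
    return 100 * (math.exp(b) - 1)


def rule_of_thumb_pct_change(b):
    """Approximation: read the coefficient directly as a proportional change."""
    return 100 * b


# Illustrative coefficients; -0.3 is the value discussed in #9 and #10.
for b in (0.05, 0.1, 0.2, -0.3):
    approx = rule_of_thumb_pct_change(b)
    exact = exact_pct_change(b)
    print(f"b = {b:+.2f}  approx = {approx:+.1f}%  exact = {exact:+.1f}%")
```

For b = -0.3 the exact figure is about -25.9%, matching the 26% worked out above, while the gap between the two columns grows as the magnitude of b increases.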

                    "Alternatively, if the model returns a coefficient of 0.01 when the dependent variable is a binary variable (=1 if recipient of welfare support), can I say that individuals in the treatment group have a 1 percentage point higher usage of welfare support post-treatment, compared to those in the control group?"
                    Yes, that's correct.



                    • #11
                      Hi Clyde,

                      Thank you very much for the detailed explanation, that really helped.

                      If I were to interpret these effects relative to the sample mean, would I just take the ratio of the coefficient to the pre-treatment sample mean? For example, if the pre-treatment mean of log wages is 10.5, could I say that wages declined by (0.3/10.5)*100=2.9% relative to the pre-treatment mean value for the treatment group?

                      Thanks,
                      Ashani



                      • #12
                        "If I were to interpret these effects relative to the sample mean, would I just take the ratio of the coefficient to the pre-treatment sample mean? For example, if the pre-treatment mean of log wages is 10.5, could I say that wages declined by (0.3/10.5)*100=2.9% relative to the pre-treatment mean value for the treatment group?"
                        No. All you could say along these lines is that for those earning the mean log wage, their log wages declined by 2.9% relative to pre-treatment. The calculation you are showing applies to the mean log wages, not to mean wages, and it applies only to those at the mean log wages pre-treatment, not to others earning more or less.
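A small numeric sketch of why the 2.9% figure describes log wages rather than wages, using the numbers from #11 (mean log wage 10.5, coefficient -0.3). Note that exp(10.5) is the wage of someone exactly at the mean log wage, not the mean wage itself.

```python
import math

mean_log_wage_pre = 10.5  # pre-treatment mean of log wages, from #11
coef = -0.3               # coefficient on the treatment indicator, from #9

# Percent change in the LOG of wages for someone at the mean log wage.
pct_change_in_log_wage = 100 * coef / mean_log_wage_pre

# Percent change in the WAGE itself for that same person.
wage_pre = math.exp(mean_log_wage_pre)
wage_post = math.exp(mean_log_wage_pre + coef)
pct_change_in_wage = 100 * (wage_post / wage_pre - 1)

print(round(pct_change_in_log_wage, 1), round(pct_change_in_wage, 1))
```

The first number is the roughly 2.9% decline in log wages; the second is the roughly 26% decline in wages, which does not depend on the starting wage level.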



                        • #13
                          Got it, thank you so much!



                          • #14
                            Hi, I have a pretty similar question to those posted above, but I am still struggling slightly with the correct interpretation. I have a binary dependent variable for whether an individual becomes a child bride (where 0 means they don't and 1 means they do). If I have a coefficient of 0.512 on one of my explanatory variables (whether or not there is a death of a household member, also a binary variable taking value 1 if there is), would I be able to interpret this as "experiencing a death in the household is associated with a 51.2 percentage point increase in the chance that an individual becomes a child bride"?
                            Many thanks in advance for the help!



                            • #15
                              If the regression you carried out is a linear regression, then, yes that would be the interpretation. But not if it was a logistic or Poisson regression.
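To see why the percentage point reading does not carry over to logistic regression, here is a minimal sketch that treats 0.512 as if it were a logit coefficient (a hypothetical scenario, since #14 does not say which model was used). The baseline probabilities are also assumed values chosen for illustration. In a logit model the coefficient shifts the log odds, so the implied change in probability depends on where you start.

```python
import math


def prob_after_logit_shift(p0, b):
    """Probability after adding b to the log odds of a baseline probability p0."""
    log_odds = math.log(p0 / (1 - p0)) + b
    return 1 / (1 + math.exp(-log_odds))


b = 0.512  # treated here as a logit coefficient, purely for illustration

changes_pp = {}
for p0 in (0.05, 0.2, 0.5):  # hypothetical baseline probabilities
    p1 = prob_after_logit_shift(p0, b)
    changes_pp[p0] = 100 * (p1 - p0)
    print(f"baseline = {p0:.2f}  new = {p1:.3f}  change = {changes_pp[p0]:.1f} pp")
```

None of the implied changes is 51.2 percentage points, and each baseline gives a different answer, which is why a logit coefficient cannot be read off as a single percentage point effect.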
