How to interpret Oaxaca-Blinder decomposition results

Matthew Williams

Join Date: Feb 2021
Posts: 195

11 Feb 2021, 19:28

Dear All,

I am new here and would like your advice on how to interpret Oaxaca-Blinder decomposition results. Specifically, I want to explain differences in depression between two countries (A and B). For this I use the -oaxaca- command (introduced by Ben Jann in 2008) with the options -logit- and -weight(1)- specified. The command is as follows:

Code:

oaxaca depression age gender area work edu ses_cat mstt club fsupp console alone, by(country) logit weight(1)

The results are

Code:

Blinder-Oaxaca decomposition                    Number of obs     =      6,265
                                                  Model           =      logit
Group 1: country = 0                              N of obs 1      =       3779
Group 2: country = 1                              N of obs 2      =       2486

------------------------------------------------------------------------------
  depression |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
overall      |
     group_1 |   .5654935   .0080592    70.17   0.000     .5496977    .5812893
     group_2 |   .4485117   .0099587    45.04   0.000     .4289929    .4680304
  difference |   .1169819   .0128112     9.13   0.000     .0918723    .1420914
   explained |  -.0618174   .0173442    -3.56   0.000    -.0958113   -.0278235
 unexplained |   .1787992   .0209686     8.53   0.000     .1377015    .2198969
-------------+----------------------------------------------------------------
explained    |
         age |   .0014983   .0009054     1.65   0.098    -.0002762    .0032727
      gender |  -.0026961   .0012159    -2.22   0.027    -.0050793   -.0003129
        area |   .0047169   .0015552     3.03   0.002     .0016689     .007765
        work |   .0041793   .0019572     2.14   0.033     .0003432    .0080154
         edu |   -.003192    .001866    -1.71   0.087    -.0068493    .0004653
     ses_cat |   .0011321   .0029647     0.38   0.703    -.0046785    .0069428
        mstt |    .000751   .0031287     0.24   0.810    -.0053811    .0068831
        club |  -.0078419    .005699    -1.38   0.169    -.0190117    .0033279
       fsupp |  -.0046566    .002206    -2.11   0.035    -.0089803    -.000333
     console |  -.0555361   .0156262    -3.55   0.000     -.086163   -.0249093
       alone |  -.0001722   .0009044    -0.19   0.849    -.0019447    .0016003
-------------+----------------------------------------------------------------
unexplained  |
         age |   -.305768    .127458    -2.40   0.016    -.5555811   -.0559548
      gender |   .0122104   .0183583     0.67   0.506    -.0237711    .0481919
        area |   .0277098   .0086721     3.20   0.001     .0107129    .0447068
        work |    .005129   .0119323     0.43   0.667    -.0182578    .0285159
         edu |  -.0006374    .012872    -0.05   0.961    -.0258661    .0245912
     ses_cat |  -.0805741   .0331369    -2.43   0.015    -.1455212    -.015627
        mstt |  -.0290486   .0123182    -2.36   0.018    -.0531918   -.0049053
        club |   .0113543   .0112411     1.01   0.312    -.0106778    .0333863
       fsupp |   .0216929    .027068     0.80   0.423    -.0313594    .0747453
     console |   .1185258   .1612595     0.74   0.462    -.1975371    .4345887
       alone |  -.0141596   .0051225    -2.76   0.006    -.0241995   -.0041197
       _cons |   .4123646   .2173369     1.90   0.058    -.0136078    .8383371
------------------------------------------------------------------------------

I got a negative value for the explained component and a positive value for the unexplained component, and I do not know how to interpret these results. I also have two additional questions:
1) Since I used the -logit- option, do I need to use the -eform- option to report the results?
2) How should I interpret significant coefficients in the unexplained component?

Thank you.

FernandoRios

Join Date: Apr 2014

Posts: 2469
#2

12 Feb 2021, 09:32

Hi Matthew
A couple of pointers.
1. Even if you are using "probit" or "logit", there is no need to use the "eform" option. The results can be read as how much differences in coefficients or in average characteristics contribute to the observed difference in the average probability of depression.
2. Regarding interpretation: based on the linear decomposition (OB implements a local linear approximation), people in country 0 have a higher incidence of depression than people in country 1.
This gap seems to be explained by differences in "the influence of factors" that affect depression (the betas). However, based on characteristics alone, people in country 0 have characteristics that should have made them less likely to be depressed.
For the rest of your variables, the interpretation will depend on the area of research (I have not worked in this particular area) and on how the variables are defined. Other than that, I could only give you a description of what is positive or negative and significant.
HTH
Rhency Legaspi

Join Date: Mar 2021

Posts: 6
#3

12 Mar 2021, 02:33

Hi, may I ask how to interpret the coef. signs (+/-) of the overall and detailed decompositions above? Thank you!
Anja Siegert

Join Date: Jun 2024

Posts: 15
#4

19 Jul 2024, 02:13

Hello everyone, I have a question on this topic. How can I interpret the significance of the effects? If a factor is not significant, e.g. in the unexplained part, does it not contribute significantly to discrimination?
George Ford

Join Date: Aug 2014
Posts: 3152

19 Jul 2024, 09:47

Fernando's interpretation is correct.

The overall difference in means (group 1 minus group 2) is positive (0.117): people in country 0 (group 1) are more depressed on average than those in country 1 (group 2).

But, based on the regressors ("endowments"), you'd expect country 0 to have lower depression than country 1 (-0.0618).

Thus, the higher level of depression in country 0 is determined not by the X's, but by something else that is unexplained (0.179).

0.117 = 0.179 - 0.0618.
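
The same arithmetic at full precision, together with each component's share of the gap, can be sketched with a few display statements. The numbers are copied from the "overall" panel of the output in the first post; the scalar names are made up for illustration.

```stata
* Values copied from the "overall" panel of the posted output
scalar gap    =  .1169819   // difference (group_1 - group_2)
scalar expl   = -.0618174   // explained
scalar unexpl =  .1787992   // unexplained

* The two components should add up to the gap, up to rounding
display "explained + unexplained" _col(30) scalar(expl) + scalar(unexpl)

* Each component as a percentage of the gap
display "explained share (%)"   _col(30) %6.1f 100*scalar(expl)/scalar(gap)
display "unexplained share (%)" _col(30) %6.1f 100*scalar(unexpl)/scalar(gap)
```

Because the explained part has the opposite sign to the gap, its share is negative and the unexplained share exceeds 100 percent.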

Here's a breakdown of what's going on:

group_1 is the predicted mean for group 1 based on its Xs.
group_2 is the predicted mean for group 2 based on its Xs.
difference is group_1 minus group_2.
explained is the coefficients from the reference model (of your choosing; pooled in the first example below) multiplied by the differences in the mean Xs between the two groups.
unexplained is whatever remains of the difference; with a pooled reference model it is the group dummy's coefficient times the difference in the dummy's group means.
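
In symbols, the twofold decomposition for the linear case can be sketched as follows. The notation is an assumption made for illustration and does not appear in the output: \bar{X}_g are the mean regressors (including the constant) of group g, \hat{\beta}_g are the group-specific coefficients, and \hat{\beta}^{*} are the reference coefficients. With the logit option the group means are mean predicted probabilities, so read this as the linear analogue only.

```latex
\begin{align*}
\bar{Y}_1 - \bar{Y}_2
  &= \underbrace{(\bar{X}_1 - \bar{X}_2)'\hat{\beta}^{*}}_{\text{explained}}
   + \underbrace{\bar{X}_1'(\hat{\beta}_1 - \hat{\beta}^{*})
   + \bar{X}_2'(\hat{\beta}^{*} - \hat{\beta}_2)}_{\text{unexplained}} \\
\bar{Y}_1 - \bar{Y}_2
  &= (\bar{X}_1 - \bar{X}_2)'\hat{\beta}_1
   + \bar{X}_2'(\hat{\beta}_1 - \hat{\beta}_2)
   \qquad \text{when } \hat{\beta}^{*} = \hat{\beta}_1 \text{ (weight(1))}
\end{align*}
```

With weight(1) the group 1 coefficients price the difference in characteristics, and the unexplained part is the coefficient gap evaluated at the group 2 means.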

Code:

clear all
sysuse auto, clear

* Pooled decomposition (the pooled model includes the group dummy)
oaxaca mpg weight displacement, by(foreign) pooled

* Reference model; e(b) is ordered: weight, displacement, foreign, _cons
reg mpg weight displacement foreign
matrix P = e(b)

* Group means of the regressors: Ax = domestic (group 1), Bx = foreign (group 2)
tabstat weight displacement foreign, by(foreign) save
matrix Ax = r(Stat1)
matrix Bx = r(Stat2)

di "group_1" _col(20) P[1,1]*Ax[1,1] + P[1,2]*Ax[1,2] + P[1,3]*Ax[1,3] + P[1,4]
di "group_2" _col(20) P[1,1]*Bx[1,1] + P[1,2]*Bx[1,2] + P[1,3]*Bx[1,3] + P[1,4]
di "difference" _col(20) (P[1,1]*Ax[1,1] + P[1,2]*Ax[1,2] + P[1,3]*Ax[1,3] + P[1,4]) - (P[1,1]*Bx[1,1] + P[1,2]*Bx[1,2] + P[1,3]*Bx[1,3] + P[1,4])
di "explained" _col(20) P[1,1]*(Ax[1,1]-Bx[1,1]) + P[1,2]*(Ax[1,2]-Bx[1,2])
di "unexplained" _col(20) P[1,3]*(Ax[1,3] - Bx[1,3])

** FOR WEIGHT(1): the group 1 coefficients are the reference
oaxaca mpg weight displacement, by(foreign) weight(1) noisily
* P = group 2 (foreign) coefficients; Z = group 1 (domestic) coefficients
qui reg mpg weight displacement if foreign
matrix P = e(b)
qui reg mpg weight displacement if ~foreign
matrix Z = e(b)

di "group_1" _col(20) Z[1,1]*Ax[1,1] + Z[1,2]*Ax[1,2] + Z[1,3]
di "group_2" _col(20) P[1,1]*Bx[1,1] + P[1,2]*Bx[1,2] + P[1,3]
di "difference" _col(20) (Z[1,1]*Ax[1,1] + Z[1,2]*Ax[1,2] + Z[1,3]) - (P[1,1]*Bx[1,1] + P[1,2]*Bx[1,2] + P[1,3])
di "explained" _col(20) Z[1,1]*(Ax[1,1]-Bx[1,1]) + Z[1,2]*(Ax[1,2]-Bx[1,2])
* Unexplained = difference - explained: coefficient gap (group 1 minus group 2) at the group 2 means
di "unexplained" _col(20) (Z[1,1] - P[1,1])*Bx[1,1] + (Z[1,2] - P[1,2])*Bx[1,2] + (Z[1,3] - P[1,3])
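
As a cross-check on the hand calculations, the overall components can also be read back from the results that oaxaca stores. This is a minimal sketch, assuming the user-written oaxaca command is installed (ssc install oaxaca) and that the overall estimates sit in e(b) under the equation name overall, as the layout of the output table suggests.

```stata
sysuse auto, clear
oaxaca mpg weight displacement, by(foreign) weight(1)

* Assumption: the overall results are addressable as _b[overall:<name>]
di "difference" _col(30) _b[overall:difference]
di "explained + unexplained" _col(30) _b[overall:explained] + _b[overall:unexplained]
```

The two displayed numbers should agree up to rounding, since the explained and unexplained parts sum to the difference by construction.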
