Oaxaca decomposition (nldecompose): how to deal with categorical Y's

Giorgio Piccitto

Join Date: Oct 2016
Posts: 238

20 Feb 2020, 05:10

Dear all,

I am new to the Oaxaca-Blinder decomposition, and I'd like your opinion on how to set it up and interpret the results.

In my sample, some respondents answered the survey by phone and others face to face (F2F), and I would like to explore whether the mode of response leads to differences in the outcome.

As a preliminary step, I ran a simple regression to check whether there is actually something there:

Code:

.  reg  dep4 edu aage male hsize acountry  f2f

      Source |       SS       df       MS              Number of obs =    2879
-------------+------------------------------           F(  6,  2872) =   51.04
       Model |  115.128287     6  19.1880478           Prob > F      =  0.0000
    Residual |  1079.74424  2872  .375955515           R-squared     =  0.0964
-------------+------------------------------           Adj R-squared =  0.0945
       Total |  1194.87253  2878  .415174609           Root MSE      =  .61315

------------------------------------------------------------------------------
        dep4 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |  -.0154029   .0175565    -0.88   0.380    -.0498275    .0190217
        aage |   -.007566   .0012684    -5.97   0.000     -.010053   -.0050789
        male |  -.2096275   .0230887    -9.08   0.000    -.2548996   -.1643554
       hsize |  -.0274292    .008292    -3.31   0.001    -.0436881   -.0111702
    acountry |   .0264477   .0022047    12.00   0.000     .0221247    .0307706
         f2f |  -.1819788   .0302105    -6.02   0.000    -.2412153   -.1227423
       _cons |   1.483471   .0734162    20.21   0.000     1.339517    1.627425
------------------------------------------------------------------------------

The coefficient on f2f (which indicates the mode of response) is significant, so I believe it is worth going on with a decomposition.

Code:


. nldecompose, by(f2f): reg  dep4   male aage  hsize acountry  edu

                                                   Number of obs (A) =    1293
                                                   Number of obs (B) =    1387

------------------------------------------------------------------------------
      Results |      Coef.  Percentage
--------------+---------------------------------------------------------------
 Omega = 1    |
         Char |   .3214277   188.1257%
         Coef |  -.1505698  -88.12575%
--------------+---------------------------------------------------------------
 Omega = 0    |
         Char |  -.0200212  -11.71803%
         Coef |   .1908791    111.718%
--------------+---------------------------------------------------------------
          Raw |   .1708579        100%
------------------------------------------------------------------------------

These are the results of the decomposition, which puzzle me a bit. What do the percentages over 100% mean? How should I interpret these results?

Here is an example of my data:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(acountry aage dep4) double f2f float(edu male hsize)
14 23 1 0 2 1 2
14 47 1 0 1 1 4
14 48 1 0 3 0 4
14 30 1 1 2 0 2
14 49 1 0 3 0 2
14 29 1 0 2 1 1
14 23 1 0 3 1 2
14 49 1 0 2 0 4
14 22 4 0 3 0 1
14 30 1 0 2 0 1
end
label values acountry acountry
label def acountry 14 "Germany", modify
label values aage NoLabel
label values dep4 Tfreq3
label def Tfreq3 1 "Seldom or Never", modify
label def Tfreq3 4 "Most or all of the time", modify
label values f2f f2f_lab
label def f2f_lab 0 "0. only processed in CAWI phase", modify
label def f2f_lab 1 "1. processed by an interviewer", modify

Your help would be absolutely appreciated.

Thanks in advance, best, G.

Giorgio Piccitto

Join Date: Oct 2016

Posts: 238
#2

21 Feb 2020, 03:11

Does no one have any tips?

Thanks, G
Comment
Sven-Kristjan Bormann

Join Date: Jul 2018

Posts: 310
#3

21 Feb 2020, 05:12

Why did you use nldecompose in the first place rather than the standard Oaxaca decomposition? You did not tell us what your dependent variable is.
My guess is also that not many people use nldecompose in their daily work; most rely on -oaxaca- or -oaxaca_rif- for decompositions. That is probably why you are not getting a fast response here.
You should probably also read the article that introduces this command. You can find the link to the article at the bottom of the help file or at https://www.stata-journal.com/articl...article=st0152
After reading the article, the negative percentages should make more sense to you. The article also explains the meaning of Omega.
Negative signs in outcome decomposition methods usually mean that one group has an advantage over the other: for example, characteristics that lower the observed outcome gap, or the same characteristics being valued more highly for one group than for the other.
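
For orientation, here is a sketch of the usual twofold decomposition in the linear case (which is what applies when the model is fit with reg). The notation is an assumption made for illustration and is not copied from the nldecompose article: A and B are the two groups, \bar{X} are mean characteristics, \beta are group-specific coefficients, and \Omega is the weighting matrix that defines the reference coefficients \beta^*. Which group nldecompose treats as A, and which group's coefficients it uses when Omega = 1, should be checked in the help file.

```latex
\begin{aligned}
\beta^{*} &= \Omega\,\beta_A + (I-\Omega)\,\beta_B \\
\underbrace{\bar{Y}_A-\bar{Y}_B}_{\text{Raw}}
  &= \underbrace{(\bar{X}_A-\bar{X}_B)'\beta^{*}}_{\text{Char}}
   + \underbrace{\bar{X}_A'(\beta_A-\beta^{*})+\bar{X}_B'(\beta^{*}-\beta_B)}_{\text{Coef}}
\end{aligned}
```

With \Omega = I the reference coefficients are \beta_A, and with \Omega = 0 they are \beta_B. Nothing forces Char and Coef to have the same sign, so either one can be negative or larger than the raw gap.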
Comment
Giorgio Piccitto

Join Date: Oct 2016

Posts: 238
#4

21 Feb 2020, 05:27

Dear Sven-Kristjan,

Thanks a lot for your answer. I had actually already gone through the paper you mention, but I still do not understand why I obtain some percentages above 100% (while it is quite clear why I obtain negative values).

Do you have any explanation for that?

Thanks, G
Comment
Sven-Kristjan Bormann

Join Date: Jul 2018

Posts: 310
#5

21 Feb 2020, 08:18

Percentages above 100% indicate, for example, that if Omega is 1 (weighting by the coefficients of the base group) and the coefficients were the same in both groups while the characteristics differed, then the outcome difference would be 88% larger than the one observed. The Char and Coef components always sum to the raw gap, so when one of them is negative the other has to exceed 100%.
The substantive reason why this happens depends on your dataset and on other background knowledge about the groups being compared.
The decomposition results indicate only that there are different components which increase and decrease the outcome gap at the same time. Furthermore, the weighting of these components matters for the interpretation, as indicated by the difference between Omega=1 and Omega=0. So you might want to run the decomposition with the option "omega(neumark)" to get a somewhat more detailed analysis of the outcome gap.
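
To make the arithmetic concrete, here is a small check of the numbers reported in post #1. It is written in Python rather than Stata purely as a calculator; nothing is re-estimated, the values are copied from the nldecompose output above, and the names `raw`, `results` and `shares` are made up for this illustration.

```python
# Values copied from the nldecompose output in post #1 (not re-estimated).
raw = 0.1708579
results = {
    "omega=1": {"char": 0.3214277, "coef": -0.1505698},
    "omega=0": {"char": -0.0200212, "coef": 0.1908791},
}


def shares(parts, raw_gap):
    """Express each component as a percentage of the raw gap."""
    return {name: 100 * value / raw_gap for name, value in parts.items()}


for label, parts in results.items():
    total = sum(parts.values())   # Char + Coef should reproduce the raw gap
    pct = shares(parts, raw)      # the shares should add up to 100
    print(label, f"char+coef={total:.7f}",
          {name: round(value, 2) for name, value in pct.items()})
```

Under both weightings the two components add up to the raw gap, so the percentages add up to 100. A share above 100% therefore only says that the other component has the opposite sign and offsets part of it.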