
  • Oaxaca decomposition (nldecompose): how to deal with categorical Y's

    Dear all,

    I am new to the Oaxaca-Blinder decomposition, and I'd like your opinion on how to set it up and interpret the results.

    In my sample, some respondents answered the survey by phone and others face to face (F2F), and I would like to explore whether the mode of response leads to differences in the outcome.

    I first ran a simple regression to check whether there is actually something there:
    Code:
    .  reg  dep4 edu aage male hsize acountry  f2f
    
          Source |       SS       df       MS              Number of obs =    2879
    -------------+------------------------------           F(  6,  2872) =   51.04
           Model |  115.128287     6  19.1880478           Prob > F      =  0.0000
        Residual |  1079.74424  2872  .375955515           R-squared     =  0.0964
    -------------+------------------------------           Adj R-squared =  0.0945
           Total |  1194.87253  2878  .415174609           Root MSE      =  .61315
    
    ------------------------------------------------------------------------------
            dep4 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             edu |  -.0154029   .0175565    -0.88   0.380    -.0498275    .0190217
            aage |   -.007566   .0012684    -5.97   0.000     -.010053   -.0050789
            male |  -.2096275   .0230887    -9.08   0.000    -.2548996   -.1643554
           hsize |  -.0274292    .008292    -3.31   0.001    -.0436881   -.0111702
        acountry |   .0264477   .0022047    12.00   0.000     .0221247    .0307706
             f2f |  -.1819788   .0302105    -6.02   0.000    -.2412153   -.1227423
           _cons |   1.483471   .0734162    20.21   0.000     1.339517    1.627425
    ------------------------------------------------------------------------------
    The coefficient on f2f (which indicates the mode of response) is significant, so I believe it is worth going on with a decomposition.

    Code:
    
    . nldecompose, by(f2f): reg  dep4   male aage  hsize acountry  edu
    
                                                       Number of obs (A) =    1293
                                                       Number of obs (B) =    1387
    
    ------------------------------------------------------------------------------
          Results |      Coef.  Percentage
    --------------+---------------------------------------------------------------
     Omega = 1    |
             Char |   .3214277   188.1257%
             Coef |  -.1505698  -88.12575%
    --------------+---------------------------------------------------------------
     Omega = 0    |
             Char |  -.0200212  -11.71803%
             Coef |   .1908791    111.718%
    --------------+---------------------------------------------------------------
              Raw |   .1708579        100%
    ------------------------------------------------------------------------------
    These are the results of the decomposition, which puzzle me a bit. What do the percentages over 100% mean? How should I interpret these results?

    Here is an example of my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(acountry aage dep4) double f2f float(edu male hsize)
    14 23 1 0 2 1 2
    14 47 1 0 1 1 4
    14 48 1 0 3 0 4
    14 30 1 1 2 0 2
    14 49 1 0 3 0 2
    14 29 1 0 2 1 1
    14 23 1 0 3 1 2
    14 49 1 0 2 0 4
    14 22 4 0 3 0 1
    14 30 1 0 2 0 1
    end
    label values acountry acountry
    label def acountry 14 "Germany", modify
    label values aage NoLabel
    label values dep4 Tfreq3
    label def Tfreq3 1 "Seldom or Never", modify
    label def Tfreq3 4 "Most or all of the time", modify
    label values f2f f2f_lab
    label def f2f_lab 0 "0. only processed in CAWI phase", modify
    label def f2f_lab 1 "1. processed by an interviewer", modify
    Your help would be absolutely appreciated.

    Thanks in advance, best, G.


  • #2
    Does anyone have any tips?

    Thanks, G



    • #3
      Why did you use nldecompose in the first place rather than the standard Oaxaca decomposition? You did not tell us what your dependent variable is.
      My guess is that not many people use nldecompose in their daily work and that most rely on -oaxaca- or -oaxaca_rif- for decompositions, which is why you are not getting a fast response here.
      You should probably also read the article that introduces this command. You can find the link to the article at the bottom of the help file or at https://www.stata-journal.com/articl...article=st0152
      After reading the article, the negative percentages should make more sense to you. The article also explains the meaning of Omega.
      Negative signs in outcome decomposition methods usually mean that one group has an advantage over the other: for example, characteristics that lower the observed outcome gap, or the same characteristics being valued more highly for one group than for the other.



      • #4
        Dear Sven-Kristjan,

        thanks a lot for your answer. I had actually already gone through the paper you mention, but I still do not understand why I obtain some percentages above 100% (while it is quite clear why I obtain negative values).

        Do you have any explanation for that?

        Thanks, G



        • #5
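          The percentages above 100% indicate, for example, that if Omega is 1 (weighting by the coefficients of the base group) and the two groups had the same coefficients but kept their different characteristics, the outcome difference would be 88% larger than the one actually observed. The Coef component has the opposite sign and offsets that excess, so the two percentages still add up to 100%.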
          The causal reason why this happens depends on your dataset and on other background knowledge about the compared groups.
          The decomposition results indicate only that there are different components which increase and decrease the outcome gap at the same time. Furthermore, the weighting of these components matters for the interpretation, as indicated by the difference between Omega=1 and Omega=0. So you might want to run the decomposition with the option "omega(neumark)" to get a somewhat more detailed analysis of the outcome gap.
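
          To see where the percentages come from, here is a minimal sketch that only redoes the arithmetic on the figures copied from the nldecompose output in #1. It assumes each reported percentage is simply the component divided by the raw gap; the scalar names (raw, char1, coef1, char0, coef0) are made up for this illustration and are not results stored by nldecompose.

          ```stata
          * Figures copied by hand from the nldecompose output in #1
          * (scalar names are hypothetical, chosen for this illustration)
          scalar raw   =  .1708579
          scalar char1 =  .3214277
          scalar coef1 = -.1505698
          scalar char0 = -.0200212
          scalar coef0 =  .1908791

          * Under either weighting, Char + Coef should reproduce the raw gap
          display "Omega = 1, Char + Coef: " scalar(char1) + scalar(coef1)
          display "Omega = 0, Char + Coef: " scalar(char0) + scalar(coef0)

          * Assumed definition of the percentage column: component / raw gap
          display "Omega = 1, Char share (%): " 100 * scalar(char1) / scalar(raw)
          display "Omega = 1, Coef share (%): " 100 * scalar(coef1) / scalar(raw)
          display "Omega = 0, Char share (%): " 100 * scalar(char0) / scalar(raw)
          display "Omega = 0, Coef share (%): " 100 * scalar(coef0) / scalar(raw)
          ```

          Because the two components have opposite signs, one of them has to exceed the raw gap in absolute terms for the pair to sum to it, which is why one share lies above 100% and the other below 0%.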
