  • How to compare ordered probit results to OLS

    I'm trying to figure out whether there is an easy way in Stata to compare my ordered probit results with those from my OLS regressions. I see that other articles calculate the estimated effect of a variable, say a dummy, and compare it to OLS, and I would like to do the same thing. I use Stata 14 and I'm more or less new to practical statistics. In my dataset I use almost only factor variables and only 2 continuous ones. The DV is on an ordinal scale from 0 to 10.

    I want to do something like the comparison in Wooldridge (2002), so that I can justify using OLS or ordered probit.



    Over the weekend I have tried several different things with margins, but I don't even come close.
    Any suggestions are highly appreciated!

  • #2
    -margins- is the right tool for comparing OLS coefficients with the results of an (ordered) logit regression, since OLS coefficients are themselves the marginal effects of the variables. The same logic applies to an ordered probit fitted with -oprobit-.

    However, because logit regressions are "inherently interactive", the marginal effect of one variable always depends on the values of the other variables in the regression. Therefore, there are no "general" marginal effects in logit regressions. Three different kinds of marginal effects can be computed.

    1. Marginal effects at specific values. You analyze the marginal effect of a variable while setting the other variables to specific values. This is particularly helpful if you would like to analyze the marginal effect of a continuous variable (e.g. age) separately for men and women. Here you would set your sex variable to either 0 or 1 and compute the marginal effects of age.

    2. Marginal effects at means. Since it can be very inconvenient to set all variables to specific values, Stata can also set all variables to their respective means when calculating the marginal effects. One disadvantage of this approach is that it may calculate marginal effects for individuals that cannot exist. Suppose the mean of your sex variable is 0.6: Stata would then calculate the marginal effects for a person who is 40% male and 60% female.

    3. Average marginal effects. An approach that deals with this problem is to calculate average marginal effects. Here the marginal effects are computed for each observation, and then the average of these effects is calculated. To compare the marginal effects to the OLS regression, I would prefer this measure, although it is typically very similar to the marginal effects at means.

    To do this calculation in Stata you have to run the (ordered) logit regression directly before the -margins- command. Since you have an ordered model, you also have to tell Stata for which outcome level you want to compute the marginal effects. The marginal effects differ for each outcome because they show the marginal change in the probability that outcome x will be realized. Thus, the marginal effect of a variable may be positive for one outcome and negative for another.

    Let's say you would like to compute the marginal effects for the first outcome (e.g. "strongly disagree").

    This command gives you what I described in 1.:
    ologit ....
    margins, dydx(*) at(sex=(0 1)) predict(outcome(1)) post

    Here are the marginal effects at means:
    ologit ....
    margins, dydx(*) atmeans predict(outcome(1)) post

    and finally the average marginal effects:
    ologit ....
    margins, dydx(*) predict(outcome(1)) post

    You can omit the -post- option, but it is necessary if you want to create result tables (e.g. using the command -est sto-).
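
    As a runnable illustration of the three variants above, here is a minimal sketch using Stata's example dataset fullauto (loaded with -webuse-, which needs an internet connection). The dataset, the regressors, and the choice of -oprobit- are my assumptions for illustration only; they are not the original poster's model. Note that outcome(#1) asks for the first category of the dependent variable whatever its code is, whereas outcome(1) asks for the category whose value equals 1. With a DV coded 0 to 10 these are different categories.

    ```stata
    * Illustrative sketch only: example data and variables are assumptions
    webuse fullauto, clear
    oprobit rep77 i.foreign length mpg

    * 3. Average marginal effects on Pr(first category)
    margins, dydx(*) predict(outcome(#1))

    * 2. Marginal effects at means
    margins, dydx(*) atmeans predict(outcome(#1))

    * 1. Effects of the continuous regressors with foreign fixed at 0 and at 1
    margins, dydx(length mpg) at(foreign=(0 1)) predict(outcome(#1))
    ```

    Declaring the dummy as a factor variable (i.foreign) makes -margins- report a discrete change from 0 to 1 for it rather than a derivative.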

    I hope this was helpful.

    Best regards,
    Sebastian
    Last edited by Sebastian Geiger; 20 May 2016, 05:30.



    • #3
      It should also be straightforward to implement the procedure that Wooldridge describes, as long as the number of categories is not too large. The intuition is that in a linear regression with an intercept and a single binary (0/1) regressor, the coefficient is simply the difference between the mean of the outcome at regressor==1 and the mean at regressor==0.

      Code:
      . sysuse auto
      (1978 Automobile Data)
      
      . reg mpg foreign
      
            Source |       SS           df       MS      Number of obs   =        74
      -------------+----------------------------------   F(1, 72)        =     13.18
             Model |  378.153515         1  378.153515   Prob > F        =    0.0005
          Residual |  2065.30594        72  28.6848048   R-squared       =    0.1548
      -------------+----------------------------------   Adj R-squared   =    0.1430
             Total |  2443.45946        73  33.4720474   Root MSE        =    5.3558
      
      ------------------------------------------------------------------------------
               mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           foreign |   4.945804   1.362162     3.63   0.001     2.230384    7.661225
             _cons |   19.82692   .7427186    26.70   0.000     18.34634    21.30751
      ------------------------------------------------------------------------------
      
      . sum mpg if foreign==1
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
               mpg |         22    24.77273    6.611187         14         41
      
      . sum mpg if foreign==0
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
               mpg |         52    19.82692    4.743297         12         34
      
      . scalar difference= 24.77273- 19.82692
      
      . di difference
      4.94581
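
      The same check can be done without retyping the means by hand. The following is a small sketch that picks up the stored results r(mean) and _b[foreign]; the scalar names mean1 and mean0 are arbitrary.

      ```stata
      * Sketch: compare the difference in group means with the OLS coefficient
      sysuse auto, clear
      quietly summarize mpg if foreign == 1
      scalar mean1 = r(mean)
      quietly summarize mpg if foreign == 0
      scalar mean0 = r(mean)
      quietly regress mpg foreign
      display "difference in means: " mean1 - mean0
      display "OLS coefficient:     " _b[foreign]
      ```
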
      The Wooldridge datasets are publicly available and can be downloaded from the following link:

      http://fmwww.bc.edu/ec-p/data/wooldridge2k/

      The relevant dataset here is pension.dta. The important thing to keep in mind is that, given the model y = xB + e with three categories of the outcome (0, 1, 2), you have two cutoffs (denoted c1 and c2). With Normprob denoting the standard normal cumulative distribution function, the response probabilities in ordered probit are defined as:


      Prob(y=0|x) = Normprob(c1-xB)
      Prob(y=1|x) = Normprob(c2-xB) - Normprob(c1-xB)
      Prob(y=2|x) = 1 - Normprob(c2-xB)

      My copy of the book (Econometric Analysis of Cross Section and Panel Data, 2nd Edition; Wooldridge 2012) reports that E(pctstck|x) is approximately 40 for choice=1 and approximately 28.1 for choice=0, a difference of 11.9 (I assume that the dataset differs slightly between the two editions). Because pctstck takes the values 0, 50, and 100, the expected value is E(pctstck|x) = 0*Prob(y=0|x) + 50*Prob(y=1|x) + 100*Prob(y=2|x). I derive these values below:


      Code:
      . clear
      
      . use "C:\Users\Dr\Downloads\pension.dta" 
      
      . regress pctstck choice age educ married black female finc25 finc35 finc75 finc50 finc100 finc101 wealth89 
      > prftshr
      
            Source |       SS           df       MS      Number of obs   =       194
      -------------+----------------------------------   F(14, 179)      =      1.42
             Model |  30402.0516        14  2171.57511   Prob > F        =    0.1486
          Residual |  274134.031       179  1531.47503   R-squared       =    0.0998
      -------------+----------------------------------   Adj R-squared   =    0.0294
             Total |  304536.082       193  1577.90716   Root MSE        =    39.134
      
      ------------------------------------------------------------------------------
           pctstck |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            choice |   12.04773   6.298171     1.91   0.057    -.3804884    24.47594
               age |  -1.625967   .7748246    -2.10   0.037    -3.154932   -.0970012
              educ |   .7538685   1.207392     0.62   0.533    -1.628684    3.136422
           married |   3.303436   7.997618     0.41   0.680    -12.47831    19.08518
             black |   3.967391   9.782799     0.41   0.686    -15.33706    23.27184
            female |   1.302856   7.163775     0.18   0.856    -12.83346    15.43917
            finc25 |  -18.18567   14.12026    -1.29   0.199    -46.04925    9.677907
            finc35 |  -3.925374   14.48565    -0.27   0.787    -32.50999    24.65924
            finc75 |  -17.57921   16.07766    -1.09   0.276    -49.30534    14.14693
            finc50 |  -8.128784   14.34191    -0.57   0.572    -36.42976    20.17219
           finc100 |   -6.74559   15.79116    -0.43   0.670    -37.90637    24.41519
           finc101 |  -28.34407    17.9049    -1.58   0.115    -63.67591    6.987775
          wealth89 |  -.0026918   .0124603    -0.22   0.829    -.0272797    .0218961
           prftshr |   15.80791   7.332677     2.16   0.032     1.338299    30.27752
             _cons |   134.1161   55.70525     2.41   0.017      24.1926    244.0395
      ------------------------------------------------------------------------------
      
      . oprobit pctstck choice age educ married black female finc25 finc35 finc75 finc50 finc100 finc101 wealth89 
      > prftshr
      
      Iteration 0:   log likelihood = -212.37031  
      Iteration 1:   log likelihood =  -202.0094  
      Iteration 2:   log likelihood =  -201.9865  
      Iteration 3:   log likelihood =  -201.9865  
      
      Ordered probit regression                       Number of obs     =        194
                                                      LR chi2(14)       =      20.77
                                                      Prob > chi2       =     0.1077
      Log likelihood =  -201.9865                     Pseudo R2         =     0.0489
      
      ------------------------------------------------------------------------------
           pctstck |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            choice |    .371171   .1841121     2.02   0.044      .010318    .7320241
               age |  -.0500516   .0226063    -2.21   0.027    -.0943591    -.005744
              educ |   .0261382   .0352561     0.74   0.458    -.0429626    .0952389
           married |   .0935981   .2332114     0.40   0.688    -.3634878     .550684
             black |   .0933923   .2820403     0.33   0.741    -.4593965    .6461811
            female |   .0455642    .206004     0.22   0.825    -.3581963    .4493246
            finc25 |  -.5784299    .423162    -1.37   0.172    -1.407812    .2509524
            finc35 |  -.1346721   .4305242    -0.31   0.754    -.9784841    .7091399
            finc75 |  -.5662312   .4780035    -1.18   0.236    -1.503101    .3706385
            finc50 |  -.2620401   .4265936    -0.61   0.539    -1.098148    .5740681
           finc100 |  -.2278963   .4685942    -0.49   0.627    -1.146324    .6905316
           finc101 |  -.8641109   .5291111    -1.63   0.102     -1.90115    .1729279
          wealth89 |  -.0000956   .0003737    -0.26   0.798    -.0008279    .0006368
           prftshr |   .4817182   .2161233     2.23   0.026     .0581243     .905312
      -------------+----------------------------------------------------------------
             /cut1 |  -3.087373   1.623765                     -6.269894    .0951479
             /cut2 |  -2.053553   1.618611                     -5.225972    1.118865
      ------------------------------------------------------------------------------
      
      *\\ First look at the averages of the variables and pick a reference profile with the continuous variables near their means: a person who is 60 years old, has 13.5 years of education, is single, non-black, male, has an income between 50k and 75k (finc75), wealth of 200k, and does not have a profit sharing plan.
      
      . sum choice age educ married black female finc25 finc35 finc75 finc50 finc100 finc101 wealth89 prftshr
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
            choice |        226    .6150442     .487665          0          1
               age |        226    60.70354    4.287002         53         73
              educ |        219    13.51598    2.554627          8         18
           married |        226    .7345133    .4425723          0          1
             black |        226     .119469    .3250596          0          1
      -------------+---------------------------------------------------------
            female |        226    .6017699      .49062          0          1
            finc25 |        216    .2083333    .4070598          0          1
            finc35 |        216    .1851852      .38935          0          1
            finc75 |        216        .125    .3314871          0          1
            finc50 |        216    .2453704    .4313061          0          1
      -------------+---------------------------------------------------------
           finc100 |        216    .1203704      .32615          0          1
           finc101 |        216    .0648148    .2467707          0          1
          wealth89 |        226    197.9057    242.0919   -579.997   1484.997
           prftshr |        206    .2087379    .4073967          0          1
      
      
      
      *\\ choice at 1
      
      . scalar xb1= (.371171*1)+ (-.0500516 *60)+(.0261382*13.5)+( -.5662312*1)+( -.0000956*200)
      
      .  scalar at10= normprob(-3.087373-(`=xb1'))
      
      .  scalar at11= normprob(-2.053553-(`=xb1')) - (`=at10')
      
      .  scalar at12= 1 - (normprob(-2.053553-(`=xb1')))
      
      . scalar at1= (0*(`=at10'))+ (50*(`=at11'))+ (100*(`=at12'))
      
      *\\ choice at 0
      
      . scalar xb0= (-.0500516 *60)+(.0261382*13.5)+( -.5662312*1)+( -.0000956*200)
      
      .  scalar at00= normprob(-3.087373-(`=xb0'))
      
      .  scalar at01= normprob(-2.053553-(`=xb0')) - (`=at00')
      
      .  scalar at02= 1 - (normprob(-2.053553-(`=xb0')))
      
      . scalar at0= (0*(`=at00'))+ (50*(`=at01'))+ (100*(`=at02'))
      
      
      . di at1
      39.84707
      
      . di at0
      27.984315
      
      . scalar diff = (`=at1') - (`=at0')
      
      . di diff
      11.862755
      
      .
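
      The hand calculation above can probably be reproduced with -margins- and its expression() option, which also gives standard errors. This is only a sketch under some assumptions: pension.dta sits in the current working directory, pctstck is coded 0, 50, and 100 (as in the calculation above), and choice is entered as a factor variable (i.choice) so that dydx() reports the discrete change from 0 to 1. The local macro names profile and ey are arbitrary. Run it from a do-file, since the local macros must be defined in the same run.

      ```stata
      * Sketch: E(pctstck|x) from an ordered probit via margins
      use pension.dta, clear
      oprobit pctstck i.choice age educ married black female finc25 finc35 finc75 finc50 finc100 finc101 wealth89 prftshr

      * Reference profile used in the hand calculation above
      local profile age=60 educ=13.5 married=0 black=0 female=0 finc25=0 finc35=0 finc50=0 finc75=1 finc100=0 finc101=0 wealth89=200 prftshr=0

      * E(pctstck|x) = 0*P(0) + 50*P(50) + 100*P(100); the zero term is dropped
      local ey 50*predict(outcome(50)) + 100*predict(outcome(100))

      * Expected value at choice = 0 and choice = 1 for the reference profile
      margins, at(choice=(0 1) `profile') expression(`ey')

      * Difference between the two, comparable to the OLS coefficient on choice
      margins, dydx(choice) at(`profile') expression(`ey')

      * Same difference averaged over the estimation sample
      margins, dydx(choice) expression(`ey')
      ```

      The second -margins- call should land close to the 11.86 computed by hand; small differences can arise because the hand calculation uses rounded coefficients. The last call is the average-marginal-effect version, which is the most natural number to set beside the OLS coefficient of about 12.05.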


