  • How does "prchange" calculate marginal effects?

    Dear Statalisters:

    I'm not sure if this is a Stata question or a calculus question, but how does Stata calculate the marginal effects ("MargEfct", last column) in "prchange"?

    Here is some output from a logit model for the probability of having lived abroad, with one continuous regressor (Inc1K, income in thousands per month) and one binary regressor (FamAbroad, whether or not the respondent has family members abroad):


    LivedAbroad Coef. Std. Err. z P>z [95% Conf. Interval]

    FamAbroad 1.391555 .1624504 8.57 0.000 1.073158 1.709952
    Inc1K .0021335 .0187338 0.11 0.909 -.0345841 .0388511
    _cons -2.777322 .1520807 -18.26 0.000 -3.075394 -2.479249


    And here is the output from "prchange":

    logit: Changes in Probabilities for LivedAbroad

    min->max 0->1 -+1/2 -+sd/2 MargEfct
    FamAbroad 0.1424 0.1424 0.1418 0.0693 0.1375
    Inc1K 0.0043 0.0002 0.0002 0.0008 0.0002

    0 1
    Pr(y|x) 0.8889 0.1111

    FamAbroad Inc1K
    x= .495948 3.73314
    sd_x= .500119 3.57373

    The question is, how does "prchange" calculate the last column, described in the help file as the "partial derivative of the predicted probability or predicted rate with respect to the independent variables"? I've tried a couple of possibilities to calculate this by hand:

    1) f(x+1) - f(x)

    This seems to work for the continuous regressor, Inc1K:

    . display [1 / (1+exp(-(1.391555*.495948 + 4.37314*.0021335 - 2.777322)))] - [1 / (1+exp(-(1.391555*.495948 + 3.37314*.0021335 - 2.777322)))]
    .0002108

    but not for the binary regressor:

    . display [1 / (1+exp(-(1.391555*1.495948 + 3.37314*.0021335 - 2.777322)))] - [1 / (1+exp(-(1.391555*.495948 + 3.37314*.0021335 - 2.777322)))]
    .22332479

    2) The derivative of the inverse logit, logit^-1(x) = exp(x) / (1+exp(x)), that is:

    exp(x) / (exp(x)+1) - exp(2x) / (exp(x)+1)^2


    . display exp(.0021355*3.37314) / (exp(3.37314*.0021335) + 1) - exp(2*.0021355*3.37314) / (exp(.0021355*3.37314) +1)^2
    .24999846

    All to no avail.

    Somebody, throw me a bone here, please!

    Thanks,
    David
    Web site:
    http://investigadores.cide.edu/crow/

    Las Américas y el Mundo:
    http://lasamericasyelmundo.cide.edu/

    ==========================================
    David Crow
    Associate Professor, División de Estudios Internacionales
    Centro de Investigación y Docencia Económicas (CIDE)
    ==========================================

  • #2
    First a few posting tips. (1) Say where user-written routines come from -- I assume you are using Long & Freese's prchange routine that is part of their spost9 package which Long keeps on his own site; use -findit spost9-. There are often old and outdated versions of Long and Freese programs on different sites, so with their stuff it is especially important to tell where you got it from (or better yet, get the latest version from Long's site).

    (2) Your post is hard to read. Learn to use the fancy new advanced text features; in particular, code. For example,

    Code:
    . sysuse auto
    (1978 Automobile Data)
    
    . logit foreign weight mpg, nolog
    
    Logistic regression                               Number of obs   =         74
                                                      LR chi2(2)      =      35.72
                                                      Prob > chi2     =     0.0000
    Log likelihood = -27.175156                       Pseudo R2       =     0.3966
    
    ------------------------------------------------------------------------------
         foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |  -.0039067   .0010116    -3.86   0.000    -.0058894    -.001924
             mpg |  -.1685869   .0919175    -1.83   0.067    -.3487418     .011568
           _cons |   13.70837   4.518709     3.03   0.002     4.851859    22.56487
    ------------------------------------------------------------------------------
    Now, getting back to what you really want to know, which is the answer to your question... I think prchange behaves the same way as the old mfx command does. Continuing the above,

    Code:
    . prchange
    
    logit: Changes in Probabilities for foreign
    
            min->max      0->1     -+1/2    -+sd/2  MargEfct
    weight   -0.9622   -0.0000   -0.0005   -0.4208   -0.0005
       mpg   -0.4656   -0.0201   -0.0224   -0.1303   -0.0224
    
             Domestic   Foreign
    Pr(y|x)    0.8427    0.1573
    
            weight      mpg
       x=  3019.46  21.2973
    sd_x=  777.194   5.7855
    
    . mfx
    
    Marginal effects after logit
          y  = Pr(foreign) (predict)
             =  .15733364
    ------------------------------------------------------------------------------
    variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
    ---------+--------------------------------------------------------------------
      weight |  -.0005179      .00014   -3.73   0.000   -.00079 -.000246   3019.46
         mpg |  -.0223512       .0127   -1.76   0.079   -.04725  .002548   21.2973
    ------------------------------------------------------------------------------
    Since you may not know how the old mfx command did things either, I'll see if I can explain it later (if somebody doesn't beat me to it). You can also look at this handout for do-it-yourself examples:

    http://www3.nd.edu/~rwilliam/stats3/Margins02.pdf

    Having said all that, unless you are trapped with a version of Stata older than 11 (the release that introduced it), you probably want to use the margins command. Also, Long and Freese's spost13 commands, currently in beta, will be much better than the spost9 commands once they are officially released.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 18.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      OK, first off, there are different formulas for binary and continuous independent variables. For binary,

      Marginal Effect Xk = Pr(Y = 1|X, Xk = 1) - Pr(Y = 1|X, Xk = 0)

      For all the other Xs besides the dichotomous explanatory variable being examined, the default practice of prchange and mfx is to set them equal to the means. So, in your case, the calculation should be

      Code:
      . display [1 / (1+exp(-(1.391555*1 + 3.73314*.0021335 - 2.777322)))] - [1 / (1+exp(-(1.391555*0 + 3.73314*.0
      > 021335 - 2.777322)))]
      .1423595
      That is what prchange reported as the change from 0 to 1. I think the MargEfct reported by prchange treats every variable as continuous, but usually you do not treat dichotomies as continuous when computing marginal effects.

      If you want to try this with your own problem, type

      logit ...[whatever]
      prchange
      mfx
      mfx, nodiscrete

      For the binary variable, the first mfx should give you the same value prchange gave you for the 0->1 change. The second (mfx, nodiscrete) should give you the value prchange reported as MargEfct. For example,

      Code:
      . webuse lbw, clear
      (Hosmer & Lemeshow data)
      
      . logit low smoke age, nolog
      
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(2)      =       7.40
                                                        Prob > chi2     =     0.0248
      Log likelihood = -113.63815                       Pseudo R2       =     0.0315
      
      ------------------------------------------------------------------------------
               low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             smoke |   .6918486   .3218061     2.15   0.032     .0611202    1.322577
               age |  -.0497792    .031972    -1.56   0.119    -.1124431    .0128846
             _cons |   .0609051   .7573199     0.08   0.936    -1.423415    1.545225
      ------------------------------------------------------------------------------
      
      . prchange
      
      logit: Changes in Probabilities for low
      
             min->max      0->1     -+1/2    -+sd/2  MargEfct
      smoke    0.1498    0.1498    0.1458    0.0716    0.1466
        age   -0.2805   -0.0122   -0.0105   -0.0558   -0.0105
      
                    0       1
      Pr(y|x)  0.6953  0.3047
      
               smoke      age
         x=  .391534  23.2381
      sd_x=   .48939  5.29868
      
      . mfx
      
      Marginal effects after logit
            y  = Pr(low) (predict)
               =  .30470601
      ------------------------------------------------------------------------------
      variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
      ---------+--------------------------------------------------------------------
         smoke*|    .149832      .07018    2.14   0.033   .012288  .287376   .391534
           age |  -.0105462      .00673   -1.57   0.117  -.023746  .002653   23.2381
      ------------------------------------------------------------------------------
      (*) dy/dx is for discrete change of dummy variable from 0 to 1
      
      . mfx, nodiscrete
      
      Marginal effects after logit
            y  = Pr(low) (predict)
               =  .30470601
      ------------------------------------------------------------------------------
      variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
      ---------+--------------------------------------------------------------------
         smoke |   .1465752      .06772    2.16   0.030   .013847  .279303   .391534
           age |  -.0105462      .00673   -1.57   0.117  -.023746  .002653   23.2381
      ------------------------------------------------------------------------------
      
      .
      Continuous marginal effects are a little trickier. You basically upped the value of the continuous variable by 1, and in this case it worked well. But it won't always. If I remain ambitious I will take a crack at it, but if not, the formulas are explained in the handout I posted earlier.
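
      For readers on Stata 11 or later, the two mfx calls above can be sketched with margins. This is a minimal sketch, not a statement of how prchange or mfx work internally; it assumes that factor-variable notation (i.smoke) is what makes margins report a discrete change, and that atmeans mirrors the mfx default of holding the other regressors at their means. The two results should agree with the smoke rows of mfx and mfx, nodiscrete respectively.

      Code:
      webuse lbw, clear
      * i.smoke flags smoke as a 0/1 indicator, so dydx() is a discrete change
      logit low i.smoke age, nolog
      * 0->1 change for smoke with age held at its mean (compare: mfx)
      margins, dydx(smoke) atmeans
      * Without the i. prefix, margins treats smoke as continuous
      logit low smoke age, nolog
      * Slope of Pr(low) at the means (compare: mfx, nodiscrete)
      margins, dydx(smoke) atmeans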


      • #4
        My most heartfelt thanks for your answer, your posting tips, and for wading through the post to get to the answer! The document you linked is very helpful, and I'll be sure to make my subsequent posts more legible.

        Re: margins, it's a wonderful command. I continue to like some of the old Long and Freese spost_ado routines, though, because they do some things more quickly and easily than in margins--including the "prchange" output, which would require several lines of "margins," I think. I'm not even sure, for example, one would get +/- .5 sd in margins. I tried this:

        logit y x
        sum x
        local sd = r(sd)
        local mean = r(mean)
        local msdneg = `mean' - .5*`sd'
        local msdpos = `mean' + .5*`sd'
        margins, dydx(x) at(x=(`msdneg' `msdpos'))

        Does "margins" even work with local variables?

        At any rate, thanks again for the help.

        Best of regards,
        David



        • #5
          Not to abuse your kindness, but I'm noticing in the handout you linked that you said you believe "mfx" (and, presumably, "prchange" and "margins") may compute dy/dx numerically, plugging in smaller and smaller values until changes in the slope (tangent line to the sigmoidal curve) become trivial.

          Is there a closed form solution for this?



          • #6
            I'm not sure what you mean by "closed form" but take a look at the user-written -margeff-. The help says "margeff compute analytically estimates partial effects after estimation." I think that means that it uses formulas rather than brute force techniques, which means it can run more quickly -- provided you are using a technique it has formulas for.

            If you look at http://www.stata.com/bookstore/micro...ata/index.html section 10.6 discusses marginal effects, and shows some code for do it yourself marginal effects. My handout also more or less shows how to do it.
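
            For the logit model specifically, the derivative of the predicted probability with respect to x_k has a simple formula: b_k * p * (1 - p), where p is the predicted probability at the chosen values of the regressors. The sketch below plugs in the coefficients and means from post #1. The scalar names are made up for illustration, and whether prchange itself uses this formula or a numerical derivative is not something this thread establishes. The two results come out at about .1375 and .0002, the MargEfct values prchange printed.

            Code:
            * Coefficients and means copied from the output in post #1
            scalar b_fam = 1.391555
            scalar b_inc = .0021335
            scalar xb_mean = b_fam*.495948 + b_inc*3.73314 - 2.777322
            * Predicted probability with both regressors at their means
            scalar p_mean = invlogit(xb_mean)
            * Logit derivative: dPr/dx_k = b_k * p * (1 - p)
            display "MargEfct FamAbroad = " b_fam*p_mean*(1 - p_mean)
            display "MargEfct Inc1K     = " b_inc*p_mean*(1 - p_mean)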



            • #7
              To answer the question about whether margins works with local variables: sure. I am not sure that your code is correct, but it does work with locals.

              Code:
              webuse lbw, clear
              local y low
              local x age
              logit `y' `x'
              sum `x'
              local sd = r(sd)
              local mean = r(mean)
              local msdneg = `mean' - .5*`sd'
              local msdpos = `mean' + .5*`sd'
              margins, dydx(`x') at(`x'=(`msdneg' `msdpos'))
              In spost13, the mchange command will do the sorts of things prchange did. I give some examples here (but the command may have changed since I wrote this handout).

              http://www3.nd.edu/~rwilliam/stats3/Margins04.pdf

              In general spost13 is better than spost9 because the programs use the margins command. This makes the commands both more flexible and more powerful.
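
              One caveat on the margins call above: it reports dy/dx at two values of x, whereas the -+sd/2 column of prchange is the change in the predicted probability as x moves from half a standard deviation below its mean to half a standard deviation above. A minimal by-hand sketch of that quantity follows, assuming the other regressor is held at its estimation-sample mean (the prchange default); the scalar names are illustrative only. For age in the lbw model from post #3, the result should be close to the -0.0558 that prchange printed.

              Code:
              webuse lbw, clear
              logit low smoke age, nolog
              * Mean of the other regressor over the estimation sample
              summarize smoke if e(sample), meanonly
              scalar m_smoke = r(mean)
              * Endpoints: mean of age minus and plus half a standard deviation
              summarize age if e(sample)
              scalar x_lo = r(mean) - .5*r(sd)
              scalar x_hi = r(mean) + .5*r(sd)
              * Predicted probabilities at the two endpoints
              scalar p_lo = invlogit(_b[_cons] + _b[smoke]*m_smoke + _b[age]*x_lo)
              scalar p_hi = invlogit(_b[_cons] + _b[smoke]*m_smoke + _b[age]*x_hi)
              display "-+sd/2 change for age = " p_hi - p_lo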


              • #8
                Incidentally, here is code that approximates the reported marginal effect for the continuous variable (prchange reports it as .0002). I added .01 to the mean in the c2 calculations.

                Code:
                . scalar logoddsc1 = 1.391555 * .495948 + .0021335 * 3.73314 - 2.777322
                
                . scalar oddsc1 = exp(logoddsc1)
                
                . scalar pc1 = oddsc1 / (1 + oddsc1)
                
                . scalar logoddsc2 = 1.391555 * .495948 + .0021335 * 3.74314 - 2.777322
                
                . scalar oddsc2 = exp(logoddsc2)
                
                . scalar pc2 = oddsc2 / (1 + oddsc2)
                
                . di "Marginal effect = " (pc2 - pc1)/.01
                Marginal effect = .00021075
                For the dichotomous variable prchange said the marginal effect was .1375. Again, prchange is treating the variable as though it were continuous in that column. To approximate (note that I add .001 to the mean in the 2nd set of calculations)

                Code:
                . scalar logoddsc1 = 1.391555 * .495948 + .0021335 * 3.73314 - 2.777322
                
                . scalar oddsc1 = exp(logoddsc1)
                
                . scalar pc1 = oddsc1 / (1 + oddsc1)
                
                . scalar logoddsc2 = 1.391555 * .496948 + .0021335 * 3.73314 - 2.777322
                
                . scalar oddsc2 = exp(logoddsc2)
                
                . scalar pc2 = oddsc2 / (1 + oddsc2)
                
                . di "Marginal effect = " (pc2 - pc1)/.001
                Marginal effect = .13753578
                I'm sure Stata is doing this better than I am, but this is the general idea. Add a small number to x, then divide the change in y by the change in x (hence the terminology dy/dx).
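
                To see how the size of the added number matters, the same calculation can be repeated in a loop with smaller and smaller steps. This is only a sketch of the idea using the coefficients and means from post #1, with made-up scalar names; it is not a description of the algorithm mfx or prchange actually use. For FamAbroad the printed values should settle near the .1375 that prchange reports as the step shrinks.

                Code:
                * Log odds and probability with both regressors at their means
                scalar xb0 = 1.391555*.495948 + .0021335*3.73314 - 2.777322
                scalar p0 = invlogit(xb0)
                foreach h in .1 .01 .001 .0001 {
                    * Add h to the mean of FamAbroad and recompute the probability
                    scalar p1 = invlogit(xb0 + 1.391555*`h')
                    * Change in y divided by change in x
                    display "h = `h'   dy/dx approx. = " (p1 - p0)/`h'
                }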