  • coefplot: point estimates + significance levels

    Dear Stata users,

    I am using -coefplot- to visualize average marginal effects. I am looking for a way to specify -mlabel- so that it plots point estimates together with significance levels. Using the Graph Editor, I added the significance levels to the point estimates manually. In the end, the graph should look like this:
    [Image: 0.png]




    I managed to create a version with point estimates (see last line):

    Code:
    coefplot bigmodel_SocAct, baselevels keep(z_v2x_civlib:0.v2x_regime_01 GDPcivlib:0.v2x_regime_01 z_v2x_clpol:0.v2x_regime_01 GDPclpol:0.v2x_regime_01 z_v2x_diagacc:0.v2x_regime_01 GDPdiag:0.v2x_regime_01 ///
    z_v2x_civlib:1.v2x_regime_01 GDPcivlib:1.v2x_regime_01 z_v2x_clpol:1.v2x_regime_01 GDPclpol:1.v2x_regime_01 z_v2x_diagacc:1.v2x_regime_01 GDPdiag:1.v2x_regime_01) ///
    order(z_v2x_civlib:0.v2x_regime_01 GDPcivlib:0.v2x_regime_01 z_v2x_clpol:0.v2x_regime_01 GDPclpol:0.v2x_regime_01 z_v2x_diagacc:0.v2x_regime_01 GDPdiag:0.v2x_regime_01 ///
     z_v2x_civlib:1.v2x_regime_01 GDPcivlib:1.v2x_regime_01 z_v2x_clpol:1.v2x_regime_01 GDPclpol:1.v2x_regime_01 z_v2x_diagacc:1.v2x_regime_01 GDPdiag:1.v2x_regime_01) ///
    xtitle("Average marginal effects (AME)") xline(0) ///
    format(%9.2f) mlabposition(12) mlabgap(*2) mlabel
    [Image: 1.png]




    I also managed to create another version with significance levels (see the last two lines of code):

    Code:
    coefplot bigmodel_SocAct, baselevels keep(z_v2x_civlib:0.v2x_regime_01 GDPcivlib:0.v2x_regime_01 z_v2x_clpol:0.v2x_regime_01 GDPclpol:0.v2x_regime_01 z_v2x_diagacc:0.v2x_regime_01 GDPdiag:0.v2x_regime_01 ///
    z_v2x_civlib:1.v2x_regime_01 GDPcivlib:1.v2x_regime_01 z_v2x_clpol:1.v2x_regime_01 GDPclpol:1.v2x_regime_01 z_v2x_diagacc:1.v2x_regime_01 GDPdiag:1.v2x_regime_01) ///
    order(z_v2x_civlib:0.v2x_regime_01 GDPcivlib:0.v2x_regime_01 z_v2x_clpol:0.v2x_regime_01 GDPclpol:0.v2x_regime_01 z_v2x_diagacc:0.v2x_regime_01 GDPdiag:0.v2x_regime_01 ///
     z_v2x_civlib:1.v2x_regime_01 GDPcivlib:1.v2x_regime_01 z_v2x_clpol:1.v2x_regime_01 GDPclpol:1.v2x_regime_01 z_v2x_diagacc:1.v2x_regime_01 GDPdiag:1.v2x_regime_01) ///
    xline(0) xtitle("Average marginal effects (AME)") ///
    format(%9.2f) mlabposition(12) mlabgap(*2) ///
    mlabel(cond(@pval<.001, "***", cond(@pval<.01, "**", cond(@pval<.05, "*",""))))
    [Image: 2.png]




    The manual also shows how to display p-values (see the last line):

    Code:
    coefplot bigmodel_SocAct, baselevels keep(z_v2x_civlib:0.v2x_regime_01 GDPcivlib:0.v2x_regime_01 z_v2x_clpol:0.v2x_regime_01 GDPclpol:0.v2x_regime_01 z_v2x_diagacc:0.v2x_regime_01 GDPdiag:0.v2x_regime_01 ///
    z_v2x_civlib:1.v2x_regime_01 GDPcivlib:1.v2x_regime_01 z_v2x_clpol:1.v2x_regime_01 GDPclpol:1.v2x_regime_01 z_v2x_diagacc:1.v2x_regime_01 GDPdiag:1.v2x_regime_01) ///
    order(z_v2x_civlib:0.v2x_regime_01 GDPcivlib:0.v2x_regime_01 z_v2x_clpol:0.v2x_regime_01 GDPclpol:0.v2x_regime_01 z_v2x_diagacc:0.v2x_regime_01 GDPdiag:0.v2x_regime_01 ///
     z_v2x_civlib:1.v2x_regime_01 GDPcivlib:1.v2x_regime_01 z_v2x_clpol:1.v2x_regime_01 GDPclpol:1.v2x_regime_01 z_v2x_diagacc:1.v2x_regime_01 GDPdiag:1.v2x_regime_01) ///
    xline(0) xtitle("Average marginal effects (AME)") ///
    format(%9.2f) mlabposition(12) mlabgap(*2) mlabel("p = " + string(@pval,"%9.3f"))
    [Image: 3.png]




    Can you please help me to combine point estimates and significance levels as depicted in the first graph?

    Thanks a lot!

    All the best,
    Pavel
    Last edited by Pavel Satra; 29 May 2020, 11:10.

  • #2
    Update: The point estimates stem from (multiple) margins commands such as:
    Code:
    eststo modelDIAGmp: margins, dydx(c.z_v2x_diagacc) over(i.v2x_regime_01) at((mean) GDPdiag regionMY) post
    ereturn list
    [Image: 4.png]
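A minimal sketch of how posted margins results reach coefplot, under stated assumptions: the auto dataset and the variables price, mpg, weight, and foreign are stand-ins for the poster's data, and the name ame_mpg is hypothetical. It uses official -estimates store- instead of -eststo-, and it assumes coefplot is installed (ssc install coefplot). Listing e(b) shows the coefficient names that keep() and order() have to match.

```stata
* Stand-in data and model; not the poster's data
sysuse auto, clear
regress price c.mpg##i.foreign weight

* Average marginal effect of mpg within each level of foreign.
* The post option makes the margins results the active estimation results.
margins, dydx(mpg) over(foreign) post
estimates store ame_mpg

* Inspect the coefficient names that coefplot will see
matrix list e(b)

* Plot the stored marginal effects
coefplot ame_mpg, xline(0) xtitle("Average marginal effects (AME)")
```

Repeating the margins and estimates store steps for each variable of interest gives one stored result per marginal effect.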



    • #3
      Hi Pavel,

      I recently encountered a similar issue and found a solution that I hope might help!

      In addition to @pval, coefplot stores several other values (for a full list type help coefplot##tempvar into your command window). One of the available options is @b, which stores the value of point estimates. This can be used in the mlabel option to display point estimates with significance levels, as shown in the following code:

      mlabel(cond(@pval<.001, string(@b, "%9.2fc") + "***", cond(@pval<.01, string(@b, "%9.2fc") + "**", cond(@pval<.05, string(@b, "%9.2fc") + "*", string(@b, "%9.2fc")))))

      The first cond() says that if the p-value is <.001, the label should be the point estimate (formatted with two decimals) followed by three asterisks. The second says that if the p-value is <.01, the label should be the point estimate with two asterisks, and the third says that if the p-value is <.05, the label should be the point estimate with one asterisk. The last argument of the innermost cond() covers the remaining case: if the p-value is >=.05, only the point estimate is displayed.
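The same label can probably be written more compactly by formatting the point estimate once and letting the nested cond() choose only the asterisks. The sketch below is an illustration, not the original poster's model: it assumes coefplot is installed (ssc install coefplot) and uses the auto dataset as stand-in data. Run it from a do-file, since /// line continuation does not work in the Command window.

```stata
* Stand-in data and model
sysuse auto, clear
regress price mpg weight foreign

* Label = formatted estimate + stars; cond() returns "" when p >= .05
coefplot, drop(_cons) xline(0) mlabposition(12) mlabgap(*2) ///
    mlabel(string(@b, "%9.2f") + cond(@pval<.001, "***", ///
        cond(@pval<.01, "**", cond(@pval<.05, "*", ""))))
```

Because the format appears only once, changing the number of decimals means editing a single place.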

      I'm not sure if it will work exactly the same way given your estimates came from multiple margins commands and the ones I was using were just regression coefficients, but hopefully this helps point you in the right direction. Using the different stored values within coefplot should give a ton of different ways you could customize point labels to contain exactly what you're looking for.

      All the best,
      Claire
      Last edited by Claire Buehler; 25 Jan 2022, 15:00.



      • #4
        Hi Claire,

        Your code solved a problem I ran into recently, but there is one remaining issue that you may know how to fix.

        I want to plot odds ratios in the figure, but when I used "cond(@pval<.01, string(@b, "%9.2fc") + "**"" as you recommended, I found that the standard error comes from the non-odds-ratio results, so the label looks like: oddsratio value(non-odds ratio Std)***

        Do you have any idea why this happens?

        Thank you so much!



        • #5
          Jane Quan -

          Apparently you begin by fitting a logistic regression, and are then applying coefplot to those results.

          You will want to see the Stata FAQ that discusses the calculation of standard errors and confidence intervals for odds ratios (and similar transformations in other models) found at

          https://www.stata.com/support/faqs/s...cs/delta-rule/

          to better understand what you are doing. In particular, calculating confidence intervals for an odds ratio using +/- its standard error is usually inappropriate, since in fact the distribution of the estimated odds ratio is not symmetric. Instead, along with exponentiating the coefficient estimate to get the odds ratio, Stata exponentiates the endpoints of the confidence interval for the coefficient to get the confidence interval for the odds ratio. Note in the example below that the confidence interval for the odds ratio is highly asymmetric around the estimate.

          So displaying the standard error of the odds ratio is unlikely to be informative and is inconsistent with the results provided by the logistic command.
          Code:
          . set seed 42
          
          . set obs 100
          Number of observations (_N) was 0, now 100.
          
          . generate x = runiform(0,1)
          
          . generate y = runiform(0,1)<x
          
          . logistic y x
          
          Logistic regression                                     Number of obs =    100
                                                                  LR chi2(1)    =  32.21
                                                                  Prob > chi2   = 0.0000
          Log likelihood = -53.132066                             Pseudo R2     = 0.2326
          
          ------------------------------------------------------------------------------
                     y | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                     x |   222.3546   257.0673     4.67   0.000     23.06535    2143.543
                 _cons |   .0572468   .0375137    -4.37   0.000     .0158475    .2067957
          ------------------------------------------------------------------------------
          Note: _cons estimates baseline odds.
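The numbers in the table above can be rebuilt by hand from the log-odds coefficient, which may make the asymmetry easier to see. This is a sketch that reuses the simulated data from the example and fits the model with -logit- so that _b[x] and _se[x] are on the coefficient scale. The display labels are only illustrative.

```stata
* Same simulated data as in the example above
clear
set seed 42
set obs 100
generate x = runiform(0,1)
generate y = runiform(0,1) < x

* logit reports the coefficient on the log-odds scale
logit y x

* Odds ratio: exponentiate the coefficient
display "Odds ratio      = " exp(_b[x])

* Confidence interval: exponentiate the endpoints of the coefficient's interval
display "Lower 95% limit = " exp(_b[x] - invnormal(0.975)*_se[x])
display "Upper 95% limit = " exp(_b[x] + invnormal(0.975)*_se[x])

* Delta-method standard error of the odds ratio:
* odds ratio times the standard error of the coefficient
display "Std. err. of OR = " exp(_b[x])*_se[x]
```

The interval is symmetric around the coefficient on the log-odds scale, so it cannot be symmetric around the odds ratio after exponentiation.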
          Last edited by William Lisowski; 29 Jan 2022, 10:24.



          • #6
            Hi William,

            Thank you so much for your detailed explanation and the referential link!

            Best
