  • How to calculate p value?

    I've run xtreg and I'm wondering how I can get the p-values for the coefficients. Which command do I have to use? I've tried this:
    2*(normal(-(_b[weight]/_se[weight])))
    and got the error that 2 is not a valid command name!

    I appreciate any help.
    Last edited by homa haddad; 17 Sep 2018, 11:43.

  • #2
    The p-values are shown in the output.

    If you want to grab them and use them somehow, they can be found in the returned matrix r(table). Its use has been discussed many times on this list, so a search should turn up helpful information.
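
    As a minimal sketch of that approach (the grunfeld dataset and the matrix name T are only illustrative, and r(table) is assumed to be available, which needs a reasonably recent Stata version), the p-value row can be pulled out right after the estimation command:

    Code:
    webuse grunfeld, clear
    xtreg invest mvalue kstock
    * r(table) is left behind by the estimation command; copy it before running anything else
    matrix T = r(table)
    matrix list T
    * pick the "pvalue" row and the column named after the coefficient
    display T[rownumb(T, "pvalue"), colnumb(T, "mvalue")]

    Copying r(table) into a named matrix first is a cautious choice, since r() results are overwritten by the next r-class command.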



    • #3
      For t-statistics

      Code:
      display (2 * ttail(e(df_r), abs(_b[weight]/_se[weight])))
      Your code is the right one for z-statistics. Put display in front of it, or store the result. See

      Code:
      help scalar
      help local macro
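
      A short sketch of the "store it" route, assuming the grunfeld example data and the illustrative names p_mvalue and p_kstock (neither comes from the original question):

      Code:
      webuse grunfeld, clear
      xtreg invest mvalue kstock
      * store one p-value in a scalar and one in a local macro
      scalar p_mvalue = 2*normal(-abs(_b[mvalue]/_se[mvalue]))
      local p_kstock = 2*normal(-abs(_b[kstock]/_se[kstock]))
      display %9.4f p_mvalue
      display %9.4f `p_kstock'

      A scalar keeps full precision and persists until dropped, whereas a local macro exists only within the do-file or interactive session that defined it.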



      • #4
        Andrew Musau Thanks for your answer. I used the command and got this:
        display 2*(normal(-(_b[loggdpimp]/_se[loggdpimp])))
        3.293e-25

        But now I have no idea how to get the p-value from this number. Do you have any recommendation for me?



        • #5
          Do you just want to view the number? That number already is the p-value. The e-25 means it is multiplied by 10^-25, so there are 24 zeros after the decimal point before the first nonzero digit. It is easier to read the number rounded to a specified number of decimal places, so see -help format-. Displaying 4 decimal places:

          Code:
          . di %9.4f 3.293e-25
           0.0000
          Or simply specify

          Code:
          di %9.4f 2*(normal(-(_b[loggdpimp]/_se[loggdpimp])))
          If I have misread your question, explain in more detail what you need to do with the p-value. An example

          Code:
          . webuse grunfeld
          
          . xtreg invest mvalue kstock
          
          Random-effects GLS regression                   Number of obs     =        200
          Group variable: company                         Number of groups  =         10
          
          R-sq:                                           Obs per group:
               within  = 0.7668                                         min =         20
               between = 0.8196                                         avg =       20.0
               overall = 0.8061                                         max =         20
          
                                                          Wald chi2(2)      =     657.67
          corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
          
          ------------------------------------------------------------------------------
                invest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                mvalue |   .1097811   .0104927    10.46   0.000     .0892159    .1303464
                kstock |    .308113   .0171805    17.93   0.000     .2744399    .3417861
                 _cons |  -57.83441   28.89893    -2.00   0.045    -114.4753   -1.193537
          -------------+----------------------------------------------------------------
               sigma_u |   84.20095
               sigma_e |  52.767964
                   rho |  .71800838   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          
          
          . di %9.4f 2*(normal(-(_b[mvalue]/_se[mvalue])))
             0.0000



          • #6
            Andrew Musau Thanks a lot! What does the %9.4f say? I need the p values of each coefficient in order to compare them with the coefficients of another regression. The coefficients of interest are loggdpimp, loggdpexp, and logdist.
            Code:
            . xtreg logimport loggdpimp loggdpexp logdist year_*, re
            note: year_29 omitted because of collinearity
            
            Random-effects GLS regression                   Number of obs     =      9,512
            Group variable: country1                        Number of groups  =        328
            
            R-sq:                                           Obs per group:
                 within  = 0.2327                                         min =         29
                 between = 0.5423                                         avg =       29.0
                 overall = 0.4213                                         max =         29
            
                                                            Wald chi2(31)     =    3165.02
            corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
            
            ------------------------------------------------------------------------------
               logimport |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
               loggdpimp |   .8967918   .0864551    10.37   0.000     .7273428    1.066241
               loggdpexp |   1.192923   .0849413    14.04   0.000     1.026442    1.359405
                 logdist |  -2.860353   .2783224   -10.28   0.000    -3.405855   -2.314851
                  year_1 |   -2.03513   .3579769    -5.69   0.000    -2.736751   -1.333508
                  year_2 |  -1.407431   .3582913    -3.93   0.000    -2.109669   -.7051927
                  year_3 |  -1.154016    .354551    -3.25   0.001    -1.848923   -.4591087
                  year_4 |  -1.440313   .3310608    -4.35   0.000    -2.089181   -.7914462
                  year_5 |   2.399481   .3857393     6.22   0.000     1.643446    3.155516
                  year_6 |   2.242283   .3772554     5.94   0.000     1.502876     2.98169
                  year_7 |   2.856418   .3722422     7.67   0.000     2.126837       3.586
                  year_8 |   2.516117   .3581524     7.03   0.000     1.814151    3.218083
                  year_9 |   2.360187   .3491341     6.76   0.000     1.675897    3.044477
                 year_10 |   2.895523   .3497545     8.28   0.000     2.210017    3.581029
                 year_11 |   3.095475   .3508217     8.82   0.000     2.407877    3.783073
                 year_12 |    2.95214   .3494291     8.45   0.000     2.267272    3.637009
                 year_13 |   3.461128   .3497141     9.90   0.000     2.775701    4.146555
                 year_14 |   3.590034   .3450769    10.40   0.000     2.913696    4.266372
                 year_15 |   3.737177   .3430976    10.89   0.000     3.064718    4.409636
                 year_16 |   3.480264   .3349951    10.39   0.000     2.823686    4.136842
                 year_17 |    3.55967   .3268843    10.89   0.000     2.918989    4.200352
                 year_18 |   3.443133   .3213533    10.71   0.000     2.813292    4.072974
                 year_19 |    3.32201   .3170271    10.48   0.000     2.700648    3.943371
                 year_20 |   3.273718   .3127528    10.47   0.000     2.660734    3.886703
                 year_21 |   2.900323   .3112535     9.32   0.000     2.290277    3.510369
                 year_22 |   2.841956   .3113724     9.13   0.000     2.231678    3.452235
                 year_23 |   2.769338   .3109284     8.91   0.000      2.15993    3.378746
                 year_24 |   2.258962   .3115308     7.25   0.000     1.648373    2.869551
                 year_25 |   1.268525   .3118063     4.07   0.000     .6573959    1.879654
                 year_26 |   1.800793   .3111129     5.79   0.000     1.191023    2.410563
                 year_27 |   1.822619   .3110312     5.86   0.000     1.213009    2.432229
                 year_28 |   .4513327   .3109708     1.45   0.147     -.158159    1.060824
                 year_29 |          0  (omitted)
                   _cons |  -18.16559   4.054175    -4.48   0.000    -26.11162   -10.21955
            -------------+----------------------------------------------------------------
                 sigma_u |  3.6674275
                 sigma_e |   3.970777
                     rho |  .46034778   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------



            • #7
              [QUOTE]What does the %9.4f say?[/QUOTE]

              This is explained under -help format-


              The %f format

              In %w.df, w is the total output width, including sign and decimal point, and d is the number of digits to appear to the right of the decimal point. The result is right-justified.

              The number 5.139 in %12.2f format displays as

              ----+----1--
                      5.14
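
              To see how the width and the number of decimals interact, a few display calls can be compared. The %10.3e line uses the scientific %e format, which is an addition here rather than something quoted from the help text above:

              Code:
              * fixed format: total width 12, 2 decimals, right-justified
              display %12.2f 5.139
              * fixed format with 4 decimals: a very small p-value is rounded to zero
              display %9.4f 3.293e-25
              * scientific format keeps the leading digits of a very small number
              display %10.3e 3.293e-25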


              [QUOTE]I need the p values of each coefficient in order to compare them with the coefficients of another regression. The coefficients of interest are loggdpimp, loggdpexp, and logdist.[/QUOTE]

              With common variables across models, you can use esttab (Stata Journal, Ben Jann) to combine estimates, although typically, people are mostly interested in whether a variable is significant and not the specific p-value. Here is an example which you can adapt.

              Code:
              *TYPE findit esttab AND CLICK LINK TO INSTALL
              webuse grunfeld
              xtreg invest mvalue kstock
              * two-sided p-value from the z-statistic; abs() keeps it valid for negative coefficients
              foreach var in mvalue kstock {
                  estadd scalar p_`var' = 2*normal(-abs(_b[`var']/_se[`var']))
              }
              est sto model1
              xtreg invest mvalue kstock i.year
              foreach var in mvalue kstock {
                  estadd scalar p_`var' = 2*normal(-abs(_b[`var']/_se[`var']))
              }
              est sto model2
              
              esttab model1 model2, drop(*year) s(p_mvalue p_kstock, fmt(%9.4f))
              Resulting in

              Code:
              . esttab model1 model2, drop(*year) s(p_mvalue p_kstock, fmt(%9.4f))
              
              --------------------------------------------
                                    (1)             (2)  
                                 invest          invest  
              --------------------------------------------
              mvalue              0.110***        0.114***
                                (10.46)          (9.68)  
              
              kstock              0.308***        0.354***
                                (17.93)         (15.68)  
              
              _cons              -57.83*         -29.83  
                                (-2.00)         (-0.92)  
              --------------------------------------------
              p_mvalue           0.0000          0.0000  
              p_kstock           0.0000          0.0000  
              --------------------------------------------
              t statistics in parentheses
              * p<0.05, ** p<0.01, *** p<0.001
              Or, without specifying a display format:

              Code:
              . esttab model1 model2, drop(*year) s(p_mvalue p_kstock)
              
              --------------------------------------------
                                    (1)             (2)   
                                 invest          invest   
              --------------------------------------------
              mvalue              0.110***        0.114***
                                (10.46)          (9.68)   
              
              kstock              0.308***        0.354***
                                (17.93)         (15.68)   
              
              _cons              -57.83*         -29.83   
                                (-2.00)         (-0.92)   
              --------------------------------------------
              p_mvalue         1.28e-25        3.80e-22   
              p_kstock         6.41e-72        1.99e-55   
              --------------------------------------------
              t statistics in parentheses
              * p<0.05, ** p<0.01, *** p<0.001

              Of course you could store these p-values directly into the dataset using the generate command after the relevant regression, e.g.,

              Code:
              *PVALUES MODEL1
              xtreg y loggdpimp ...

              * loop over regressors only: the dependent variable has no coefficient in e(b)
              foreach var in loggdpimp loggdpexp logdist {
                  * abs() keeps the two-sided p-value valid for negative coefficients
                  gen p_`var'1 = 2*normal(-abs(_b[`var']/_se[`var']))
              }
              *PVALUES MODEL2

              xtreg y loggdpimp ...

              foreach var in loggdpimp loggdpexp logdist {
                  gen p_`var'2 = 2*normal(-abs(_b[`var']/_se[`var']))
              }

              *COMPARE PAIRED PVALUES
              foreach var in loggdpimp loggdpexp logdist {
                  compare p_`var'1 p_`var'2
              }
              Last edited by Andrew Musau; 17 Sep 2018, 14:15.



              • #8
                Andrew Musau Dear Andrew, thousands of thanks for your help! I ran the command and got this result:

                Code:
                . esttab model1, drop(year_*) s(p_loggdpimp p_loggdpexp p_logdist, fmt(%9.4f))
                
                ----------------------------
                                      (1)
                                logimport
                ----------------------------
                loggdpimp           0.897***
                                  (10.37)

                loggdpexp           1.193***
                                  (14.04)

                logdist            -2.860***
                                 (-10.28)

                _cons              -18.17***
                                  (-4.48)
                ----------------------------
                p_loggdpimp        0.0000
                p_loggdpexp        0.0000
                p_logdist          2.0000
                ----------------------------
                t statistics in parentheses
                * p<0.05, ** p<0.01, *** p<0.001
                As you can see, the p-values are equal to the results from my xtreg regression. I'm now a little bit confused. Does it mean that the coefficients from xtreg are actually p-values?! It doesn't make sense to me.



                • #9
                  So, let us consider what the regression output in Stata includes


                  Code:
                  . webuse grunfeld
                  
                  . xtreg invest mvalue kstock
                  
                  Random-effects GLS regression                   Number of obs     =        200
                  Group variable: company                         Number of groups  =         10
                  
                  R-sq:                                           Obs per group:
                       within  = 0.7668                                         min =         20
                       between = 0.8196                                         avg =       20.0
                       overall = 0.8061                                         max =         20
                  
                                                                  Wald chi2(2)      =     657.67
                  corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                  
                  ------------------------------------------------------------------------------
                        invest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                        mvalue |   .1097811   .0104927    10.46   0.000     .0892159    .1303464
                        kstock |    .308113   .0171805    17.93   0.000     .2744399    .3417861
                         _cons |  -57.83441   28.89893    -2.00   0.045    -114.4753   -1.193537
                  -------------+----------------------------------------------------------------
                       sigma_u |   84.20095
                       sigma_e |  52.767964
                           rho |  .71800838   (fraction of variance due to u_i)
                  ------------------------------------------------------------------------------
                  Reading the table by column: Coef. holds the coefficients, Std. Err. the standard errors, z the z-statistics, and P>|z| the p-values. So yes, the p-values that you calculate are already displayed in the regression table.

                  To my point in #7, when presenting the results, most people are interested in whether the coefficient of a variable is significant and not necessarily in the actual p-value. The conventional levels of significance are 0.001, 0.01 and 0.05 (sometimes 0.1). In the output of esttab, the number of stars indicates the level of significance (by default 3 stars for 0.001, 2 stars for 0.01 and 1 star for 0.05, but you can change the defaults). The sizes of the z-statistics will tell you "how significant" one coefficient is relative to another. Therefore, including the actual p-values in your presentation of the results is not necessary. If your goal is for people to compare your coefficients and levels of significance across models, presenting the output of esttab with the defaults is sufficient, e.g.,

                  Code:
                  eststo: qui reg invest mvalue kstock
                  eststo: qui reg invest mvalue kstock i.year
                  esttab, s(N r2) drop(*year)
                  Code:
                  . esttab, s(N r2) drop(*year)
                  
                  --------------------------------------------
                                        (1)             (2)   
                                     invest          invest   
                  --------------------------------------------
                  mvalue              0.116***        0.117***
                                    (19.80)         (18.45)   
                  
                  kstock              0.231***        0.220***
                                     (9.05)          (6.80)   
                  
                  _cons              -42.71***       -23.57   
                                    (-4.49)         (-0.75)   
                  --------------------------------------------
                  N                     200             200   
                  r2                  0.812           0.817   
                  --------------------------------------------
                  t statistics in parentheses
                  * p<0.05, ** p<0.01, *** p<0.001
                  So here we see that the coefficients of mvalue and kstock are "more significant" in the first model relative to the second (19.80> 18.45 and 9.05> 6.80, respectively), but all the coefficients are significant at the 0.001 level (all have 3 stars). I do not need the specific p-values to make this comparison.
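
                  If the p-values are wanted in the table after all, a sketch along these lines may help. It assumes estout is installed from SSC, and the star thresholds shown are just one alternative to the defaults, not a recommendation:

                  Code:
                  webuse grunfeld, clear
                  eststo clear
                  eststo: qui reg invest mvalue kstock
                  eststo: qui reg invest mvalue kstock i.year
                  * p(4) prints p-values with 4 decimals in parentheses instead of t statistics
                  * star() redefines the thresholds attached to each star symbol
                  esttab, p(4) drop(*year) star(* 0.10 ** 0.05 *** 0.01)

                  Because esttab computes these p-values itself from the stored estimates, no estadd loop is needed for this layout.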



                  • #10
                    Andrew Musau Thank you sooooo much! I finally understood it. Best, Homa



                    • #11
                      Originally posted by homa haddad
                      2*(normal(-(_b[weight]/_se[weight]))) .
                      To make it generalizable to the case where the coefficient may have either sign:
                      Code:
                      di 2*(normal(-abs(_b[TAS20_total_score]/_se[TAS20_total_score])))
                      (https://www.google.com/url?sa=t&rct=...F&opi=89978449), or
                      Code:
                       di 2*(1-(normal(abs(_b[TAS20_total_score]/_se[TAS20_total_score]))))
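
                      A sketch with a value typed in by hand shows why the sign matters. The z-statistic of -10.28 is the one reported for logdist in #6; everything else is plain arithmetic:

                      Code:
                      * without abs(): normal(10.28) is essentially 1, so this displays 2.0000, not a valid p-value
                      display %9.4f 2*normal(-(-10.28))
                      * with abs(): twice the lower-tail area, a proper two-sided p-value
                      display %9.4f 2*normal(-abs(-10.28))
                      * equivalent form using the upper tail
                      display %9.4f 2*(1 - normal(abs(-10.28)))

                      The first line reproduces the 2.0000 shown for p_logdist in #8. For very large |z| the normal(-abs()) form is probably the safer of the two corrected versions, since 1 - normal() can lose precision when normal() is numerically indistinguishable from 1.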



                      • #12
                        Hi Andrew and others! I am trying to run DOLS and want to find the effect for each cross-section, with the p-value computed automatically for each one. I am currently using the following commands, but they give results for the entire model, not for the individual cross-sections.

                        *PVALUES MODEL1
                        xtcointreg modprice1 perish_days stringency perish_STR, xtrend(1) est(dols) dic(aic) full

                        foreach var in perish_days stringency perish_STR{
                        gen p_`var'1= 2*(normal(-(_b[`var']/_se[`var'])))
                        }
                        est sto model1
                        esttab model1 , s(p_perish_days p_perish_STR, fmt(%9.4f))

                        Any suggestions regarding this would be helpful.



                        • #13
                          Your question refers to a community-contributed command (xtcointreg from SSC), which I do not use. Start a new thread that indicates the name of the command so that those who use it may help you. If you do not get a satisfactory response, contact the authors of the command.



                          • #14
                            I would highly appreciate if you can help with two things
                            1) How to store the estimation results for each cross-section?
                            2) How to create loop for each cross-section?

                            Any suggestion in this regard would be helpful.
