How to calculate p value?

homa haddad

Join Date: Apr 2018

Posts: 28
#1

How to calculate p value?

17 Sep 2018, 11:35

I've run xtreg and I'm wondering how I can get the p values for the coefficients? Which command do I have to use? I've tried this:
2*(normal(-(_b[weight]/_se[weight]))) and got the error that 2 is not a valid command name!

I appreciate any help.

Last edited by homa haddad; 17 Sep 2018, 11:43.

Rich Goldstein

Join Date: Mar 2014

Posts: 4412
#2

17 Sep 2018, 11:45

The p values are shown in the output.

If you want to grab them and use them somehow, they can be found in the returned matrix r(table). Its use has been discussed many times on this list, so a search should give you helpful information.
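A minimal sketch of pulling a p-value out of r(table), assuming the grunfeld example data that appears later in this thread. The matrix name T is only an illustrative choice, and the row is looked up by name rather than by position.

```stata
* Sketch only: grunfeld data and the matrix name T are illustrative assumptions
webuse grunfeld, clear
quietly xtreg invest mvalue kstock

* r(table) is overwritten by the next r-class command, so copy it first
matrix T = r(table)
matrix list T

* look up the p-value row and the mvalue column by name
display "p-value for mvalue: " T[rownumb(T, "pvalue"), colnumb(T, "mvalue")]
```

Copying r(table) into a named matrix right after estimation keeps the values available even after other commands have run.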
Andrew Musau

Join Date: Oct 2014

Posts: 9957
#3

17 Sep 2018, 11:45

For t-statistics

Code:

display (2 * ttail(e(df_r), abs(_b[weight]/_se[weight])))

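For z-statistics, use your own code. An expression is not a command on its own, which is why Stata complained that 2 is not a valid command name. Add display before the code, or store the result in a scalar or local macro. See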

Code:

help scalar
help local
help macro
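A short sketch of both suggestions, assuming the auto dataset shipped with Stata and a plain regress fit (neither is from the original question; any fitted model with a weight coefficient would do). The names p_weight are illustrative.

```stata
* Sketch only: auto data and regress are assumptions for illustration
sysuse auto, clear
quietly regress price weight

* display evaluates an expression and prints the result
display 2*ttail(e(df_r), abs(_b[weight]/_se[weight]))   // t-based p-value
display 2*normal(-abs(_b[weight]/_se[weight]))          // z-based p-value

* the same value kept for later use, once as a scalar and once as a local macro
scalar p_weight = 2*normal(-abs(_b[weight]/_se[weight]))
local  p_weight = 2*normal(-abs(_b[weight]/_se[weight]))
display %9.4f scalar(p_weight)
display %9.4f `p_weight'
```

A scalar keeps full precision and survives until it is dropped, while a local macro disappears when the do-file or program ends.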
homa haddad

Join Date: Apr 2018

Posts: 28
#4

17 Sep 2018, 12:04

Andrew Musau Thanks for your answer. I used the command and got this:
display 2*(normal(-(_b[loggdpimp]/_se[loggdpimp])))
3.293e-25

But the new problem is that I have no idea how I get the p value having this new number! Do you have any recommendation for me?

Andrew Musau

Join Date: Oct 2014
Posts: 9957

17 Sep 2018, 12:21

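Do you just want to view the number? The number displayed is the p-value itself, written in scientific notation: 3.293e-25 means 3.293 times 10^-25, that is, 24 zeros after the decimal point before the first nonzero digit. It is easier to read when rounded to a specified number of decimal places, so see -help format-. Displaying 4 decimal places: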

Code:

. di %9.4f 3.293e-25
 0.0000

Or simply specify

Code:

di %9.4f 2*(normal(-(_b[loggdpimp]/_se[loggdpimp])))
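To make the point about scientific notation concrete, here is a small sketch that takes the value reported in #4 and shows it in two formats. The scalar name p is an illustrative choice.

```stata
* Sketch only: the scalar name p is an illustrative assumption
scalar p = 3.293e-25

* the same p-value in scientific notation and rounded to 4 decimal places
display %10.3e scalar(p)
display %9.4f  scalar(p)

* a logical expression returns 1 when true: the p-value is far below 0.001
display scalar(p) < 0.001
```

Only the display changes between the two formats; the stored value is the same.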

If I have misread your question, explain in more detail what you need to do with the p-value. An example

Code:

. webuse grunfeld

. xtreg invest mvalue kstock

Random-effects GLS regression                   Number of obs     =        200
Group variable: company                         Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.7668                                         min =         20
     between = 0.8196                                         avg =       20.0
     overall = 0.8061                                         max =         20

                                                Wald chi2(2)      =     657.67
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      invest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1097811   .0104927    10.46   0.000     .0892159    .1303464
      kstock |    .308113   .0171805    17.93   0.000     .2744399    .3417861
       _cons |  -57.83441   28.89893    -2.00   0.045    -114.4753   -1.193537
-------------+----------------------------------------------------------------
     sigma_u |   84.20095
     sigma_e |  52.767964
         rho |  .71800838   (fraction of variance due to u_i)
------------------------------------------------------------------------------


. di %9.4f 2*(normal(-(_b[mvalue]/_se[mvalue])))
   0.0000


homa haddad

Join Date: Apr 2018
Posts: 28

17 Sep 2018, 12:43

Andrew Musau Thanks a lot! What does the %9.4f say? I need the p values of each coefficient in order to compare them with the coefficients of another regression. The coefficients of interest are loggdpimp, loggdpexp, and logdist.

Code:

. xtreg logimport loggdpimp loggdpexp logdist year_*, re
note: year_29 omitted because of collinearity

Random-effects GLS regression                   Number of obs     =      9,512
Group variable: country1                        Number of groups  =        328

R-sq:                                           Obs per group:
     within  = 0.2327                                         min =         29
     between = 0.5423                                         avg =       29.0
     overall = 0.4213                                         max =         29

                                                Wald chi2(31)     =    3165.02
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
   logimport |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   loggdpimp |   .8967918   .0864551    10.37   0.000     .7273428    1.066241
   loggdpexp |   1.192923   .0849413    14.04   0.000     1.026442    1.359405
     logdist |  -2.860353   .2783224   -10.28   0.000    -3.405855   -2.314851
      year_1 |   -2.03513   .3579769    -5.69   0.000    -2.736751   -1.333508
      year_2 |  -1.407431   .3582913    -3.93   0.000    -2.109669   -.7051927
      year_3 |  -1.154016    .354551    -3.25   0.001    -1.848923   -.4591087
      year_4 |  -1.440313   .3310608    -4.35   0.000    -2.089181   -.7914462
      year_5 |   2.399481   .3857393     6.22   0.000     1.643446    3.155516
      year_6 |   2.242283   .3772554     5.94   0.000     1.502876     2.98169
      year_7 |   2.856418   .3722422     7.67   0.000     2.126837       3.586
      year_8 |   2.516117   .3581524     7.03   0.000     1.814151    3.218083
      year_9 |   2.360187   .3491341     6.76   0.000     1.675897    3.044477
     year_10 |   2.895523   .3497545     8.28   0.000     2.210017    3.581029
     year_11 |   3.095475   .3508217     8.82   0.000     2.407877    3.783073
     year_12 |    2.95214   .3494291     8.45   0.000     2.267272    3.637009
     year_13 |   3.461128   .3497141     9.90   0.000     2.775701    4.146555
     year_14 |   3.590034   .3450769    10.40   0.000     2.913696    4.266372
     year_15 |   3.737177   .3430976    10.89   0.000     3.064718    4.409636
     year_16 |   3.480264   .3349951    10.39   0.000     2.823686    4.136842
     year_17 |    3.55967   .3268843    10.89   0.000     2.918989    4.200352
     year_18 |   3.443133   .3213533    10.71   0.000     2.813292    4.072974
     year_19 |    3.32201   .3170271    10.48   0.000     2.700648    3.943371
     year_20 |   3.273718   .3127528    10.47   0.000     2.660734    3.886703
     year_21 |   2.900323   .3112535     9.32   0.000     2.290277    3.510369
     year_22 |   2.841956   .3113724     9.13   0.000     2.231678    3.452235
     year_23 |   2.769338   .3109284     8.91   0.000      2.15993    3.378746
     year_24 |   2.258962   .3115308     7.25   0.000     1.648373    2.869551
     year_25 |   1.268525   .3118063     4.07   0.000     .6573959    1.879654
     year_26 |   1.800793   .3111129     5.79   0.000     1.191023    2.410563
     year_27 |   1.822619   .3110312     5.86   0.000     1.213009    2.432229
     year_28 |   .4513327   .3109708     1.45   0.147     -.158159    1.060824
     year_29 |          0  (omitted)
       _cons |  -18.16559   4.054175    -4.48   0.000    -26.11162   -10.21955
-------------+----------------------------------------------------------------
     sigma_u |  3.6674275
     sigma_e |   3.970777
         rho |  .46034778   (fraction of variance due to u_i)
------------------------------------------------------------------------------


Andrew Musau

Join Date: Oct 2014
Posts: 9957

17 Sep 2018, 14:00

[QUOTE]
What does the %9.4f say?
[/QUOTE]

This is explained under -help format-

The %f format

In %w.df, w is the total output width, including sign and decimal point, and d is the number of digits to appear to the right of the decimal point. The result is right-justified.

The number 5.139 in %12.2f format displays as

----+----1--
        5.14
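A quick way to see the width and decimals at work is to print the same number under a ruler with a few formats. This is only a sketch; the ruler string is typed by hand.

```stata
* Sketch only: compare how w and d in %w.df change the display
display "----+----1--"
display %12.2f  5.139     // width 12, 2 decimals, right-justified
display %-12.2f 5.139     // a leading minus sign left-justifies
display %9.4f   5.139     // width 9, 4 decimals
```

The format affects only how the number is shown, not the value Stata stores or uses in calculations.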

[QUOTE]
I need the p values of each coefficient in order to compare them with the coefficients of another regression. The coefficients of interest are loggdpimp, loggdpexp, and logdist.
[/QUOTE]

With common variables across models, you can use esttab (Stata Journal, Ben Jann) to combine estimates, although typically, people are mostly interested in whether a variable is significant and not the specific p-value. Here is an example which you can adapt.

Code:

*TYPE findit esttab AND CLICK LINK TO INSTALL
webuse grunfeld
xtreg invest mvalue kstock
foreach var in mvalue kstock{
            estadd scalar p_`var' = 2*(normal(-(_b[`var']/_se[`var'])))
}
est sto model1
xtreg invest mvalue kstock i.year
foreach var in mvalue kstock{
             estadd scalar p_`var' = 2*(normal(-(_b[`var']/_se[`var'])))
}
est sto model2

esttab model1 model2, drop(*year) s(p_mvalue p_kstock, fmt(%9.4f))

Resulting in

Code:

. esttab model1 model2, drop(*year) s(p_mvalue p_kstock, fmt(%9.4f))

--------------------------------------------
                      (1)             (2)  
                   invest          invest  
--------------------------------------------
mvalue              0.110***        0.114***
                  (10.46)          (9.68)  

kstock              0.308***        0.354***
                  (17.93)         (15.68)  

_cons              -57.83*         -29.83  
                  (-2.00)         (-0.92)  
--------------------------------------------
p_mvalue           0.0000          0.0000  
p_kstock           0.0000          0.0000  
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

or without specifying a display format

Code:

. esttab model1 model2, drop(*year) s(p_mvalue p_kstock)

--------------------------------------------
                      (1)             (2)   
                   invest          invest   
--------------------------------------------
mvalue              0.110***        0.114***
                  (10.46)          (9.68)   

kstock              0.308***        0.354***
                  (17.93)         (15.68)   

_cons              -57.83*         -29.83   
                  (-2.00)         (-0.92)   
--------------------------------------------
p_mvalue         1.28e-25        3.80e-22   
p_kstock         6.41e-72        1.99e-55   
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Of course you could store these p-values directly into the dataset using the generate command after the relevant regression, e.g.,

Code:

*PVALUES MODEL1
xtreg y loggdpimp ...

* loop over the regressors only: _b[] does not exist for the dependent variable
foreach var in loggdpimp loggdpexp logdist{
          * abs() keeps the p-value between 0 and 1 when the coefficient is negative
          gen p_`var'1= 2*(normal(-abs(_b[`var']/_se[`var'])))
}
*PVALUES MODEL2

xtreg y loggdpimp ...

foreach var in loggdpimp loggdpexp logdist{
          gen p_`var'2= 2*(normal(-abs(_b[`var']/_se[`var'])))
}

*COMPARE PAIRED PVALUES
foreach var in loggdpimp loggdpexp logdist{
       compare  p_`var'1  p_`var'2
}

Last edited by Andrew Musau; 17 Sep 2018, 14:15.


homa haddad

Join Date: Apr 2018
Posts: 28

18 Sep 2018, 05:55

Andrew Musau Dear Andrew, thousands of thanks for your help! I ran the command and got this result:

Code:

. esttab model1, drop(year_*) s(p_loggdpimp p_loggdpexp p_logdist, fmt(%9.4f))

----------------------------
                      (1)
                logimport
----------------------------
loggdpimp           0.897***
                  (10.37)

loggdpexp           1.193***
                  (14.04)

logdist            -2.860***
                 (-10.28)

_cons              -18.17***
                  (-4.48)
----------------------------
p_loggdpimp        0.0000
p_loggdpexp        0.0000
p_logdist          2.0000
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

As you can see, the p values are equal to the results from my xtreg regression. I'm now a little bit confused. Does it mean, that the coefficient from xtreg are actually p values?! It doesn't make sense to me.


Andrew Musau

Join Date: Oct 2014
Posts: 9957

18 Sep 2018, 07:54

So, let us consider what the regression output in Stata includes

Code:

. webuse grunfeld

. xtreg invest mvalue kstock

Random-effects GLS regression                   Number of obs     =        200
Group variable: company                         Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.7668                                         min =         20
     between = 0.8196                                         avg =       20.0
     overall = 0.8061                                         max =         20

                                                Wald chi2(2)      =     657.67
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      invest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1097811   .0104927    10.46   0.000     .0892159    .1303464
      kstock |    .308113   .0171805    17.93   0.000     .2744399    .3417861
       _cons |  -57.83441   28.89893    -2.00   0.045    -114.4753   -1.193537
-------------+----------------------------------------------------------------
     sigma_u |   84.20095
     sigma_e |  52.767964
         rho |  .71800838   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Reading the columns of the table from left to right, we have the coefficients (Coef.), the standard errors (Std. Err.), the z-statistics (z) and the p-values (P>|z|). So yes, the p-values that you calculate are already displayed in the regression table; the coefficients themselves are not p-values.

To my point in #7, when presenting the results, most people are interested in whether the coefficient of a variable is significant and not necessarily in the actual p-value. The conventional levels of significance are 0.001, 0.01 and 0.05 (sometimes 0.1). Therefore, in the output of esttab, the number of stars is what indicates the level of significance (usually 3 stars for 0.001, 2 stars for 0.01 and 1 star for 0.05, but you can change the defaults). The absolute sizes of the z-statistics will tell you "how significant" one coefficient is relative to another. Therefore, including the actual p-values in your presentation of the results is not necessary. So if your goal is for people to compare your coefficients and levels of significance across models, just presenting the output of esttab with the defaults is sufficient, e.g.,

Code:

eststo: qui reg invest mvalue kstock
eststo: qui reg invest mvalue kstock i.year
esttab, s(N r2) drop(*year)

Code:

. esttab, s(N r2) drop(*year)

--------------------------------------------
                      (1)             (2)   
                   invest          invest   
--------------------------------------------
mvalue              0.116***        0.117***
                  (19.80)         (18.45)   

kstock              0.231***        0.220***
                   (9.05)          (6.80)   

_cons              -42.71***       -23.57   
                  (-4.49)         (-0.75)   
--------------------------------------------
N                     200             200   
r2                  0.812           0.817   
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

So here we see that the coefficients of mvalue and kstock are "more significant" in the first model relative to the second (19.80 > 18.45 and 9.05 > 6.80, respectively), but both are significant at the 0.001 level in both models (3 stars each). I do not need the specific p-values to make this comparison.
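If the p-values are wanted in the table after all, esttab can print them in parentheses in place of the t statistics through its p option. The sketch below assumes the estout package is installed from SSC and reuses the grunfeld example; the names m1 and m2 are illustrative.

```stata
* Sketch only: requires estout (ssc install estout); m1 and m2 are illustrative names
webuse grunfeld, clear
eststo clear
eststo m1: quietly regress invest mvalue kstock
eststo m2: quietly regress invest mvalue kstock i.year

* p replaces the t statistics in parentheses with p-values
esttab m1 m2, p drop(*year) s(N r2)
```

This avoids computing the p-values by hand, so no formula has to be adjusted for the sign of a coefficient.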


homa haddad

Join Date: Apr 2018

Posts: 28
#10

18 Sep 2018, 11:59

Andrew Musau Thank you sooooo much! I finally understood it. Best, Homa
Federico Tedeschi

Join Date: Mar 2015

Posts: 137
#11

04 Jul 2023, 04:24

Originally posted by homa haddad

2*(normal(-(_b[weight]/_se[weight]))) .

To make it generalizable to the case where the coefficient may have either sign:

Code:

di 2*(normal(-abs(_b[TAS20_total_score]/_se[TAS20_total_score])))

(https://www.google.com/url?sa=t&rct=...F&opi=89978449), or

Code:

di 2*(1-(normal(abs(_b[TAS20_total_score]/_se[TAS20_total_score]))))
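The effect of abs() can be checked with the z-statistic reported for logdist earlier in the thread (-10.28). This is a sketch with the value typed in by hand; the scalar name z is illustrative.

```stata
* Sketch only: z is typed in from the logdist row of the xtreg table above
scalar z = -10.28

* without abs(), a negative z gives a "p-value" above 1 (shown as 2.0000 with %9.4f)
display %9.4f 2*normal(-scalar(z))

* with abs(), the result is a valid two-sided p-value
display 2*normal(-abs(scalar(z)))

* algebraically equivalent, but for a |z| this large 1 - normal() is expected to
* round to 0 in double precision, so the normal(-abs()) form is the safer one
display 2*(1 - normal(abs(scalar(z))))
```

This is where the value 2.0000 for p_logdist in the earlier esttab output comes from: logdist has a negative coefficient.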
Mukta Mukherjee

Join Date: Jan 2023

Posts: 6
#12

29 Jan 2024, 08:28

Hi Andrew and others! I am trying to run DOLS and want to find the effect for each cross-section, with the p-value computed automatically for each one. I am currently using the following commands, but they give results for the entire model, not for the individual cross-sections.

*PVALUES MODEL1
xtcointreg modprice1 perish_days stringency perish_STR, xtrend(1) est(dols) dic(aic) full

foreach var in perish_days stringency perish_STR{
gen p_`var'1= 2*(normal(-(_b[`var']/_se[`var'])))
}
est sto model1
esttab model1 , s(p_perish_days p_perish_STR, fmt(%9.4f))

Any suggestions regarding this would be helpful.
Andrew Musau

Join Date: Oct 2014

Posts: 9957
#13

29 Jan 2024, 13:43

Your question refers to a community-contributed command (xtcointreg from SSC), which I do not use. Start a new thread that indicates the name of the command so that those who use it may help you. If you do not get a satisfactory response, contact the authors of the command.
Mukta Mukherjee

Join Date: Jan 2023

Posts: 6
#14

30 Jan 2024, 18:06

I would highly appreciate it if you could help with two things:
1) How do I store the estimation results for each cross-section?
2) How do I create a loop over the cross-sections?

Any suggestion in this regard would be helpful.
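As a general pattern only, and not specific to xtcointreg, a loop over cross-sections with stored results might look like the sketch below. It assumes the grunfeld data with company as the cross-section identifier and uses regress as a stand-in estimator; the names cs_1, cs_2, ... are illustrative, and whether the same pattern suits xtcointreg is not something this thread establishes.

```stata
* Sketch only: regress stands in for the estimator; grunfeld and cs_* are assumptions
webuse grunfeld, clear
levelsof company, local(ids)

foreach i of local ids {
    * fit the model on one cross-section at a time
    quietly regress invest mvalue kstock if company == `i'
    estimates store cs_`i'

    * t-based two-sided p-value for mvalue in this cross-section
    display "company `i': p(mvalue) = " %9.4f 2*ttail(e(df_r), abs(_b[mvalue]/_se[mvalue]))
}

* list what has been stored
estimates dir
```

Each stored set can be brought back later with estimates restore, for example estimates restore cs_1.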