Creating a table of several F-test results using Stata Command

Jinny Koh

Join Date: Jul 2021
Posts: 33

Creating a table of several F-test results using Stata Command

16 Jul 2021, 09:52

Dear Statalist colleauges,

I am trying to automate a process of making tables in Stata because manual job is error-prone.

So I have several different F-Test results and want to create a table including the F-Statistics and Significance(*).

Code:

tsset date, delta(7 days)

**** RealGDP
newey RealGDP_t8  L8.RealGDP  L8.sp500Ret L8.REALPCE_monthly L8.Mom_monthly L8.PCAall  L8.entropy  L8.text_1  L8.text_2 , lag(6) force

*text
test L8.PCAall  L8.entropy  L8.text_1  L8.text_2
*nontext
test L8.sp500Ret L8.REALPCE_monthly L8.Mom_monthly
*all
test L8.sp500Ret L8.REALPCE_monthly L8.Mom_monthly L8.PCAall  L8.entropy  L8.text_1  L8.text_2

**** Dep_2
newey Dep_2_t8  L8.Dep_2  L8.basis  L8.Mom_monthly L8.VIX L8.PCAall  L8.entropy  L8.text_1  L8.text_3  , lag(6) force

*text
test L8.PCAall  L8.entropy  L8.text_1  L8.text_3
*nontext
test L8.basis  L8.Mom_monthly L8.VIX_Thu
*all
test L8.basis  L8.Mom_monthly L8.VIX L8.PCAall  L8.entropy  L8.text_1  L8.text_3

**** Dep_3
newey Dep_3_t8  L8.Dep_3 L8.Dep_2  L8.Vol MV_Sales L8.VIX L8.entropy  L8.text_2 L8.text_3 , lag(6) force

*text
test L8.entropy  L8.text_2L8.text_3
*nontext
test L8.Dep_2  L8.Vol MV_Sales L8.VIX_Thu
*all
test L8.Dep_2  L8.Vol MV_Sales L8.VIX L8.entropy  L8.text_2L8.text_3

All tests give me F-statistics and P-value.
With these 'test' results I am trying to make a table as pasted below.. I can just create manually, but really want to know if there is an efficient way of automating this process through codes.
Could you pelase help me manage this? Thanks a lot! -Jinny

Specification	RealGDP	Dep_2	Dep_3
text	11.21***	8.93***	5.40***
non-text	4.23***	8.23***	39.91***
all	4.91***	6.68***	19.93***

Last edited by Jinny Koh; 16 Jul 2021, 09:58.

Tags: None

Dick Campbell

Join Date: Apr 2014

Posts: 279
#2

16 Jul 2021, 10:44

It would be easier to give you good advice if we know what version of Stata you are using.

Richard T. Campbell
Emeritus Professor of Biostatistics and Sociology
University of Illinois at Chicago
Comment
Jinny Koh

Join Date: Jul 2021

Posts: 33
#3

16 Jul 2021, 10:55

Hi Dick Campbell ,

thank you for the reply.
I am using the most recent Stata 17 MP.
Sorry for not providing this info.

Thanks!
Jinny
Comment
Jinny Koh

Join Date: Jul 2021

Posts: 33
#4

16 Jul 2021, 13:44

Here I pasted a cleaner table for better understanding.
I am trying to formulate this kind of table via code.

The table is populated by Fstatistics and the significance(*)

Thanks much!
Attached Files
Comment
Dick Campbell

Join Date: Apr 2014

Posts: 279
#5

16 Jul 2021, 14:02

Given that you are in Stata 17 you can use the new version of table, which I have not had the time to learn. It is enormously flexible and will produce exactly the table you show including the stars. You can also use putexcel to produce a table of this kind, but with a little less flexibility. If you do this kind of work a lot, where you want to routinely produce a complex table I think the time and effort to learn the new version of table and its associated collect commands is probably worth the effort. There are numerous videos available on the State YouTube channel and a series of entries on the Stata Blog.

Richard T. Campbell
Emeritus Professor of Biostatistics and Sociology
University of Illinois at Chicago
Comment
Jinny Koh

Join Date: Jul 2021

Posts: 33
#6

16 Jul 2021, 14:55

Thank you Dick Campbell .

I tried to use the table() function. But I don't think it supports test() command.. Maybe it's just for regress() commands. Please let me know if I am understanding wrong.

Code:

. table () () (), command(test L8.PCA L8.entropy L8.text_1 L8.text_2) command test exited with an error when run using the entire sample r(111);

Thanks!
Comment
Jinny Koh

Join Date: Jul 2021

Posts: 33
#7

16 Jul 2021, 15:31

So as you can see in this result,
I am trying to create tables using F statistic and the p value in this result.
I want to make it 6.02 *** as a table slot.

If this is not feasible in stata, should I try another software like R?
Thanks!
Comment

Dick Campbell

Join Date: Apr 2014
Posts: 279

16 Jul 2021, 16:04

Here is a very rough version of how to do this using putexcel.

Code:

. use http://www.stata-press.com/data/r16/nhanes2.dta

.
. * define labels for table
. putexcel set rtc1.xlsx, replace
Note: File will be replaced when the first putexcel command is issued.

. putexcel B1 = `"BP"'
file rtc1.xlsx saved

. putexcel C1 = `"TC"'
file rtc1.xlsx saved

. putexcel A2 = `"F"'
file rtc1.xlsx saved

. putexcel A3 = `"p"'
file rtc1.xlsx saved

.
. * run regression and test commands
. reg bpsystol age weight sex

      Source |       SS           df       MS      Number of obs   =    10,351
-------------+----------------------------------   F(3, 10347)     =   1501.75
       Model |   1709209.9         3  569736.633   Prob > F        =    0.0000
    Residual |  3925460.13    10,347  379.381476   R-squared       =    0.3033
-------------+----------------------------------   Adj R-squared   =    0.3031
       Total |  5634670.03    10,350  544.412563   Root MSE        =    19.478
------------------------------------------------------------------------------
    bpsystol | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .6374325   .0111334    57.25   0.000     .6156088    .6592562
      weight |   .4170339    .013474    30.95   0.000     .3906221    .4434456
         sex |   .8244702   .4140342     1.99   0.046     .0128832    1.636057
       _cons |   69.31168    1.43283    48.37   0.000     66.50305     72.1203
------------------------------------------------------------------------------

. test age weight

 ( 1)  age = 0
 ( 2)  weight = 0

       F(  2, 10347) = 2202.99
            Prob > F =    0.0000

.
. *fill in F statistics and p values for the two dep. variables
. putexcel B2 = `r(F)'
file rtc1.xlsx saved

. putexcel B3 = `r(p)'
file rtc1.xlsx saved

.
. * run second depvar
. reg tc age weight sex

      Source |       SS           df       MS      Number of obs   =    10,351
-------------+----------------------------------   F(3, 10347)     =    711.79
       Model |  4318576.56         3  1439525.52   Prob > F        =    0.0000
    Residual |  20925799.1    10,347  2022.40254   R-squared       =    0.1711
-------------+----------------------------------   Adj R-squared   =    0.1708
       Total |  25244375.7    10,350  2439.07011   Root MSE        =    44.971

------------------------------------------------------------------------------
    tcresult | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |    1.12673   .0257054    43.83   0.000     1.076342    1.177118
      weight |   .2792676   .0311095     8.98   0.000     .2182869    .3402483
         sex |    11.4645   .9559428    11.99   0.000     9.590663    13.33833
       _cons |   126.4964   3.308191    38.24   0.000     120.0117    132.9811
------------------------------------------------------------------------------

. test age weight

 ( 1)  age = 0
 ( 2)  weight = 0

       F(  2, 10347) = 1020.90
            Prob > F =    0.0000

. putexcel C2 = `r(F)'
file rtc1.xlsx saved

. putexcel C3 = `r(p)'
file rtc1.xlsx saved

Here is a copy of the resulting Excel file. You can make it look much nicer with a few easy formatting tweaks.

	BP	TC
F	2202.991	1020.902
p	0	0

However, I have read through the table manual and it looks to me like you can get it to accept the results from the test command.
See the first example under table hypothesis tests in the V17 manual. that example uses a t test but I am pretty sure you could
get it to run with test.

Richard T. Campbell
Emeritus Professor of Biostatistics and Sociology
University of Illinois at Chicago

Comment

Jinny Koh

Join Date: Jul 2021

Posts: 33
#9

16 Jul 2021, 16:18

Thank you very much Dick Campbell for your suggestion.

I managed to make one slot of the table I attahced above as follows:

Code:

. quietly: collect r(F) r(p) : test L8.PCAall_Fri L8.entropy_Fri L8.text_1 L8.text_2 . collect layout () (result) Collection: ex5 Columns: result Table 1: 1 x 2 ----------------------------- F statistic Two-sided p-value ----------------------------- 10.38506 2.92e-08 -----------------------------

But I have two questions.

1. Do you think making stars(*) here is a manual job necessarily by looking at the p-values? I don't see an option for making stars in this table except for the reg() result case.

2. Moreover, I cannot find how I can actually merge all F-stat and P-values from all different results in a single table. The collect function allows me to create just one table. But I want to have them in one table.

Thanks much again,
Jinny
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10084

#10

16 Jul 2021, 16:43

I would do something similar to Dick's suggestion, but the fact that you want significance stars makes estout from SSC more efficient. I do not have version 17 to suggest a table solution. As you are repeating the same set of commands below, you can loop, but I illustrate without looping so that you can follow the code.

Code:

webuse idle2, clear
tsset time
gen usr2= usr^2
eststo m1: newey usr idle syslcl wio , lag(3)
test idle
mat t1= r(F)
mat p1= r(p)
test syslcl
mat t2= r(F)
mat p2= r(p)
test wio
mat t3= r(F)
mat p3= r(p)
mat m1= t1,t2,t3
mat colnames m1= text notext all
mat pval1=p1,p2,p3
mat colnames pval1= text notext all
estadd matrix m= m1
estadd matrix pval= pval1
eststo m2: newey usr2 idle syslcl wio , lag(3)
test idle
mat t1= r(F)
mat p1= r(p)
test syslcl
mat t2= r(F)
mat p2= r(p)
test wio
mat t3= r(F)
mat p3= r(p)
mat m2= t1,t2,t3
mat colnames m2= text notext all
mat pval2=p1,p2,p3
mat colnames pval2= text notext all
estadd matrix m= m2
estadd matrix pval= pval2
esttab m1 m2, cells(m(star pvalue(pval))) collab(none) ///
mlab("Real GDP" "Dep_2") nonumb coeflab(notext "non-text") ///
note( "Some text here")

Res.:

Code:


. esttab m1 m2, cells(m(star pvalue(pval))) collab(none) ///
> mlab("Real GDP" "Dep_2") nonumb coeflab(notext "non-text") ///
> note( "Some text here")

--------------------------------------------
                 Real GDP           Dep_2   
--------------------------------------------
text             868.5323***     27.14372***
non-text         360.9546***     16.05246***
all              945.1146***     21.61509***
--------------------------------------------
N                      30              30   
--------------------------------------------
Some text here

Last edited by Andrew Musau; 16 Jul 2021, 16:47.

Comment

Jinny Koh

Join Date: Jul 2021

Posts: 33
#11

16 Jul 2021, 17:24

Thank you so much Andrew Musau ! This is really informative! I didn't know about these codes at all.

May I please have a follow-up?

1. How can I get rid of the row "N" for total observation? I don't really want that.

2. You said we can loop the above. But if I have different predictors(indep. variables) for every dep. variable, it would not work. Am I right?
I have 8 dependent variables, but the predictors are different everytime! I see the example you showed have same predictors. Moreover, the variables for test () would also differ each time.
Then the only way would be to enumerate for each case. Please correct me if I am thinking wrong.

Thank you so much again,
Jinny
Comment
Jinny Koh

Join Date: Jul 2021

Posts: 33
#12

16 Jul 2021, 17:45

3. In addition,
I tried to develop your codes so that I can export the table to word document, but it returns nothing. Could you please tell me what I am doing wrong here?

Code:

. esttab m1 m2 using "TITLE", rtf replace , cells(m(star pvalue(pval))) collab(none) /// > mlab("FutRet" "DSpot") nonumb coeflab(notext "non-text") /// > note( "Some text here") ///

Thank you again,
Jinny
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10084

#13

16 Jul 2021, 18:09

1. How can I get rid of the row "N" for total observation? I don't really want that.

You want the option -noobs-

2. You said we can loop the above. But if I have different predictors(indep. variables) for every dep. variable, it would not work. Am I right?
I have 8 dependent variables, co the predictors are different everytime! I see the example you showed have same predictors. Moreover, the variables for test () would also differ each time.
Then the only way would be to enumerate for each case. Please correct me if I am thinking wrong.

You can define the regression command and test variables corresponding to each iteration of the loop (see below). I have variables named varname and varname2, but the names can be completely different.

3. In addition,
I tried to develop your codes so that I can export the table to word document, but it returns nothing. Could you please tell me what I am doing wrong here?

I call the output file myfile.rtf below, which you can open in MS Word.

Code:

webuse idle2, clear
tsset time
foreach var in usr idle syslcl wio{
    gen `var'2= (`var'-0.5*`var')^2
}

*START HERE
local varlist1 usr idle syslcl wio, lag(3)
local testv1_1 idle
local testv1_2 syslcl
local testv1_3 wio

local varlist2 usr2 idle2 syslcl2 wio2, lag(2)
local testv2_1 idle2
local testv2_2 syslcl2
local testv2_3 wio2

forval i=1/2{
    capture noisily{
        eststo m`i': newey `varlist`i''
        test `testv`i'_1'
        mat t1= r(F)
        mat p1= r(p)
        test `testv`i'_2'
        mat t2= r(F)
        mat p2= r(p)
        test `testv`i'_3'
        mat t3= r(F)
        mat p3= r(p)
        mat m= t1,t2,t3
        mat colnames m= text notext all
        mat pval=p1,p2,p3
        mat colnames pval= text notext all
        estadd matrix m= m
        estadd matrix pval= pval
    }
}
esttab m* using myfile.rtf, replace cells(m(star pvalue(pval))) ///
collab(none) mlab("Real GDP" "Dep_2") nonumb noobs ///
coeflab(notext "non-text") note( "Some text here")

Res.:

Code:

. esttab m* using myfile.rtf, replace cells(m(star pvalue(pval))) ///
> collab(none) mlab("Real GDP" "Dep_2") nonumb noobs ///
> coeflab(notext "non-text") note( "Some text here")

--------------------------------------------
                 Real GDP           Dep_2  
--------------------------------------------
text             868.5323***      10.3364**
non-text         360.9546***     .0788595  
all              945.1146***     5.426446*  
--------------------------------------------
Some text here

ADDED IN EDIT: I have added -capture- to the code above in case you have an uneven number of tests across models. This will prevent the code from terminating with an error in case some models do not have some tests. However, you also need to modify the creation of the matrices in this case and define the additional tests.

Last edited by Andrew Musau; 16 Jul 2021, 18:29.

Comment

Jinny Koh

Join Date: Jul 2021

Posts: 33
#14

16 Jul 2021, 20:37

Thank you very much Andrew Musau !

This is really helpful.

One another follow-up :

the thing is that your code assumes that one 'test' performs Ftest on one variable.

However, it's not the case.
It's very irregular across dfifferent dependent variables.

So basically it is

Code:

*test1 test x1 x2 x3 .. *test2 test x4 x5 ... *test3 (all variables) test x1 x2 x3 x4 x5 ....

And moreover, the number of variables in each test is also not consistent across different models(dependent variables).
In this case, I guess the looping would be very noisy. Am I correct? because we cannot just say
test `testv`i'_1' because there is not just one variable. Do you think there is a feasible way? If not, no worries! I can just create codes for each variable as you first described! Thanks, Jinny

Last edited by Jinny Koh; 16 Jul 2021, 20:41.
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10084

#15

17 Jul 2021, 05:30

The issue is not how many variables are in the list, but the total number of tests. As long as you have 3 tests after each estimation, you can drop -capture- as there will be no errors from an uneven number of tests across models. So you can have something like:

Code:

local varlist1 y1 x1 x2 x3 x4 x5, lag(3)
local testv1_1 x1 x2
local testv1_2 x1 x3 x5
local testv1_3 x1 x2 x3 x4 x5

local varlist2 y2 x6 x7 x8 x9 x10, lag(2)
local testv2_1 x7 x8 x9 x10
local testv2_2 x6 x7 x8
local testv2_3 x6 x7 x8 x9 x10

local varlist3 y3 x11 x12 x13 x14 x15, lag(2)
local testv3_1 x11 x12
local testv3_2 x12 x13 x14
local testv3_3 x11 x12 x13 x14 x15

forval i=1/3{
    eststo m`i': newey `varlist`i''
    test `testv`i'_1'
    mat t1= r(F)
    mat p1= r(p)
    test `testv`i'_2'
    mat t2= r(F)
    mat p2= r(p)
    test `testv`i'_3'
    mat t3= r(F)
    mat p3= r(p)
    mat m= t1,t2,t3
    mat colnames m= text notext all
    mat pval=p1,p2,p3
    mat colnames pval= text notext all
    estadd matrix m= m
    estadd matrix pval= pval
}

esttab m* using myfile.rtf, replace cells(m(star pvalue(pval))) ///
collab(none) mlab("Real GDP" "Dep_2") nonumb noobs ///
coeflab(notext "non-text") note( "Some text here")

Last edited by Andrew Musau; 17 Jul 2021, 05:33.

Announcement

Creating a table of several F-test results using Stata Command

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment