Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating a table of several F-test results using Stata Command

    Dear Statalist colleauges,

    I am trying to automate a process of making tables in Stata because manual job is error-prone.

    So I have several different F-Test results and want to create a table including the F-Statistics and Significance(*).

    Code:
    tsset date, delta(7 days)
    
    **** RealGDP
    newey RealGDP_t8  L8.RealGDP  L8.sp500Ret L8.REALPCE_monthly L8.Mom_monthly L8.PCAall  L8.entropy  L8.text_1  L8.text_2 , lag(6) force
    
    *text
    test L8.PCAall  L8.entropy  L8.text_1  L8.text_2
    *nontext
    test L8.sp500Ret L8.REALPCE_monthly L8.Mom_monthly
    *all
    test L8.sp500Ret L8.REALPCE_monthly L8.Mom_monthly L8.PCAall  L8.entropy  L8.text_1  L8.text_2
    
    **** Dep_2
    newey Dep_2_t8  L8.Dep_2  L8.basis  L8.Mom_monthly L8.VIX L8.PCAall  L8.entropy  L8.text_1  L8.text_3  , lag(6) force
    
    *text
    test L8.PCAall  L8.entropy  L8.text_1  L8.text_3
    *nontext
    test L8.basis  L8.Mom_monthly L8.VIX_Thu
    *all
    test L8.basis  L8.Mom_monthly L8.VIX L8.PCAall  L8.entropy  L8.text_1  L8.text_3
    
    **** Dep_3
    newey Dep_3_t8  L8.Dep_3 L8.Dep_2  L8.Vol MV_Sales L8.VIX L8.entropy  L8.text_2 L8.text_3 , lag(6) force
    
    *text
    test L8.entropy  L8.text_2L8.text_3
    *nontext
    test L8.Dep_2  L8.Vol MV_Sales L8.VIX_Thu
    *all
    test L8.Dep_2  L8.Vol MV_Sales L8.VIX L8.entropy  L8.text_2L8.text_3
    All tests give me F-statistics and P-value.
    With these 'test' results I am trying to make a table as pasted below.. I can just create manually, but really want to know if there is an efficient way of automating this process through codes.
    Could you pelase help me manage this? Thanks a lot! -Jinny
    Specification RealGDP Dep_2 Dep_3
    text 11.21*** 8.93*** 5.40***
    non-text 4.23*** 8.23*** 39.91***
    all 4.91*** 6.68*** 19.93***
    Last edited by Jinny Koh; 16 Jul 2021, 09:58.

  • #2
    It would be easier to give you good advice if we know what version of Stata you are using.
    Richard T. Campbell
    Emeritus Professor of Biostatistics and Sociology
    University of Illinois at Chicago

    Comment


    • #3
      Hi Dick Campbell ,

      thank you for the reply.
      I am using the most recent Stata 17 MP.
      Sorry for not providing this info.

      Thanks!
      Jinny

      Comment


      • #4



        Here I pasted a cleaner table for better understanding.
        I am trying to formulate this kind of table via code.

        The table is populated by Fstatistics and the significance(*)

        Thanks much!
        Attached Files

        Comment


        • #5
          Given that you are in Stata 17 you can use the new version of table, which I have not had the time to learn. It is enormously flexible and will produce exactly the table you show including the stars. You can also use putexcel to produce a table of this kind, but with a little less flexibility. If you do this kind of work a lot, where you want to routinely produce a complex table I think the time and effort to learn the new version of table and its associated collect commands is probably worth the effort. There are numerous videos available on the State YouTube channel and a series of entries on the Stata Blog.
          Richard T. Campbell
          Emeritus Professor of Biostatistics and Sociology
          University of Illinois at Chicago

          Comment


          • #6
            Thank you Dick Campbell .

            I tried to use the table() function. But I don't think it supports test() command.. Maybe it's just for regress() commands. Please let me know if I am understanding wrong.
            Code:
            . table () () (), command(test L8.PCA L8.entropy L8.text_1 L8.text_2)
            command test exited with an error when run using the entire sample
            r(111);
            Thanks!

            Comment


            • #7
              So as you can see in this result,
              I am trying to create tables using F statistic and the p value in this result.
              I want to make it 6.02 *** as a table slot.

              If this is not feasible in stata, should I try another software like R?
              Thanks!
              Click image for larger version

Name:	Capture.JPG
Views:	1
Size:	20.5 KB
ID:	1619265

              Comment


              • #8
                Here is a very rough version of how to do this using putexcel.
                Code:
                . use http://www.stata-press.com/data/r16/nhanes2.dta
                
                .
                . * define labels for table
                . putexcel set rtc1.xlsx, replace
                Note: File will be replaced when the first putexcel command is issued.
                
                . putexcel B1 = `"BP"'
                file rtc1.xlsx saved
                
                . putexcel C1 = `"TC"'
                file rtc1.xlsx saved
                
                . putexcel A2 = `"F"'
                file rtc1.xlsx saved
                
                . putexcel A3 = `"p"'
                file rtc1.xlsx saved
                
                .
                . * run regression and test commands
                . reg bpsystol age weight sex
                
                      Source |       SS           df       MS      Number of obs   =    10,351
                -------------+----------------------------------   F(3, 10347)     =   1501.75
                       Model |   1709209.9         3  569736.633   Prob > F        =    0.0000
                    Residual |  3925460.13    10,347  379.381476   R-squared       =    0.3033
                -------------+----------------------------------   Adj R-squared   =    0.3031
                       Total |  5634670.03    10,350  544.412563   Root MSE        =    19.478
                ------------------------------------------------------------------------------
                    bpsystol | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         age |   .6374325   .0111334    57.25   0.000     .6156088    .6592562
                      weight |   .4170339    .013474    30.95   0.000     .3906221    .4434456
                         sex |   .8244702   .4140342     1.99   0.046     .0128832    1.636057
                       _cons |   69.31168    1.43283    48.37   0.000     66.50305     72.1203
                ------------------------------------------------------------------------------
                
                . test age weight
                
                 ( 1)  age = 0
                 ( 2)  weight = 0
                
                       F(  2, 10347) = 2202.99
                            Prob > F =    0.0000
                
                .
                . *fill in F statistics and p values for the two dep. variables
                . putexcel B2 = `r(F)'
                file rtc1.xlsx saved
                
                . putexcel B3 = `r(p)'
                file rtc1.xlsx saved
                
                .
                . * run second depvar
                . reg tc age weight sex
                
                      Source |       SS           df       MS      Number of obs   =    10,351
                -------------+----------------------------------   F(3, 10347)     =    711.79
                       Model |  4318576.56         3  1439525.52   Prob > F        =    0.0000
                    Residual |  20925799.1    10,347  2022.40254   R-squared       =    0.1711
                -------------+----------------------------------   Adj R-squared   =    0.1708
                       Total |  25244375.7    10,350  2439.07011   Root MSE        =    44.971
                
                ------------------------------------------------------------------------------
                    tcresult | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         age |    1.12673   .0257054    43.83   0.000     1.076342    1.177118
                      weight |   .2792676   .0311095     8.98   0.000     .2182869    .3402483
                         sex |    11.4645   .9559428    11.99   0.000     9.590663    13.33833
                       _cons |   126.4964   3.308191    38.24   0.000     120.0117    132.9811
                ------------------------------------------------------------------------------
                
                . test age weight
                
                 ( 1)  age = 0
                 ( 2)  weight = 0
                
                       F(  2, 10347) = 1020.90
                            Prob > F =    0.0000
                
                . putexcel C2 = `r(F)'
                file rtc1.xlsx saved
                
                . putexcel C3 = `r(p)'
                file rtc1.xlsx saved
                Here is a copy of the resulting Excel file. You can make it look much nicer with a few easy formatting tweaks.
                BP TC
                F 2202.991 1020.902
                p 0 0

                However, I have read through the table manual and it looks to me like you can get it to accept the results from the test command.
                See the first example under table hypothesis tests in the V17 manual. that example uses a t test but I am pretty sure you could
                get it to run with test.

                Richard T. Campbell
                Emeritus Professor of Biostatistics and Sociology
                University of Illinois at Chicago

                Comment


                • #9
                  Thank you very much Dick Campbell for your suggestion.

                  I managed to make one slot of the table I attahced above as follows:

                  Code:
                  . quietly: collect r(F) r(p) : test L8.PCAall_Fri L8.entropy_Fri L8.text_1 L8.text_2
                  
                  . collect layout () (result)
                  
                  Collection: ex5
                     Columns: result
                     Table 1: 1 x 2
                  
                  -----------------------------
                  F statistic Two-sided p-value
                  -----------------------------
                     10.38506          2.92e-08
                  -----------------------------
                  But I have two questions.

                  1. Do you think making stars(*) here is a manual job necessarily by looking at the p-values? I don't see an option for making stars in this table except for the reg() result case.

                  2. Moreover, I cannot find how I can actually merge all F-stat and P-values from all different results in a single table. The collect function allows me to create just one table. But I want to have them in one table.

                  Thanks much again,
                  Jinny

                  Comment


                  • #10
                    I would do something similar to Dick's suggestion, but the fact that you want significance stars makes estout from SSC more efficient. I do not have version 17 to suggest a table solution. As you are repeating the same set of commands below, you can loop, but I illustrate without looping so that you can follow the code.

                    Code:
                    webuse idle2, clear
                    tsset time
                    gen usr2= usr^2
                    eststo m1: newey usr idle syslcl wio , lag(3)
                    test idle
                    mat t1= r(F)
                    mat p1= r(p)
                    test syslcl
                    mat t2= r(F)
                    mat p2= r(p)
                    test wio
                    mat t3= r(F)
                    mat p3= r(p)
                    mat m1= t1,t2,t3
                    mat colnames m1= text notext all
                    mat pval1=p1,p2,p3
                    mat colnames pval1= text notext all
                    estadd matrix m= m1
                    estadd matrix pval= pval1
                    eststo m2: newey usr2 idle syslcl wio , lag(3)
                    test idle
                    mat t1= r(F)
                    mat p1= r(p)
                    test syslcl
                    mat t2= r(F)
                    mat p2= r(p)
                    test wio
                    mat t3= r(F)
                    mat p3= r(p)
                    mat m2= t1,t2,t3
                    mat colnames m2= text notext all
                    mat pval2=p1,p2,p3
                    mat colnames pval2= text notext all
                    estadd matrix m= m2
                    estadd matrix pval= pval2
                    esttab m1 m2, cells(m(star pvalue(pval))) collab(none) ///
                    mlab("Real GDP" "Dep_2") nonumb coeflab(notext "non-text") ///
                    note( "Some text here")
                    Res.:

                    Code:
                    
                    . esttab m1 m2, cells(m(star pvalue(pval))) collab(none) ///
                    > mlab("Real GDP" "Dep_2") nonumb coeflab(notext "non-text") ///
                    > note( "Some text here")
                    
                    --------------------------------------------
                                     Real GDP           Dep_2   
                    --------------------------------------------
                    text             868.5323***     27.14372***
                    non-text         360.9546***     16.05246***
                    all              945.1146***     21.61509***
                    --------------------------------------------
                    N                      30              30   
                    --------------------------------------------
                    Some text here
                    Last edited by Andrew Musau; 16 Jul 2021, 16:47.

                    Comment


                    • #11
                      Thank you so much Andrew Musau ! This is really informative! I didn't know about these codes at all.

                      May I please have a follow-up?

                      1. How can I get rid of the row "N" for total observation? I don't really want that.

                      2. You said we can loop the above. But if I have different predictors(indep. variables) for every dep. variable, it would not work. Am I right?
                      I have 8 dependent variables, but the predictors are different everytime! I see the example you showed have same predictors. Moreover, the variables for test () would also differ each time.
                      Then the only way would be to enumerate for each case. Please correct me if I am thinking wrong.

                      Thank you so much again,
                      Jinny

                      Comment


                      • #12
                        3. In addition,
                        I tried to develop your codes so that I can export the table to word document, but it returns nothing. Could you please tell me what I am doing wrong here?

                        Code:
                        . esttab m1 m2 using "TITLE", rtf replace , cells(m(star pvalue(pval))) collab(none) ///
                        > mlab("FutRet" "DSpot") nonumb coeflab(notext "non-text") ///
                        > note( "Some text here") ///
                        Thank you again,
                        Jinny

                        Comment


                        • #13
                          1. How can I get rid of the row "N" for total observation? I don't really want that.
                          You want the option -noobs-

                          2. You said we can loop the above. But if I have different predictors(indep. variables) for every dep. variable, it would not work. Am I right?
                          I have 8 dependent variables, co the predictors are different everytime! I see the example you showed have same predictors. Moreover, the variables for test () would also differ each time.
                          Then the only way would be to enumerate for each case. Please correct me if I am thinking wrong.
                          You can define the regression command and test variables corresponding to each iteration of the loop (see below). I have variables named varname and varname2, but the names can be completely different.

                          3. In addition,
                          I tried to develop your codes so that I can export the table to word document, but it returns nothing. Could you please tell me what I am doing wrong here?
                          I call the output file myfile.rtf below, which you can open in MS Word.

                          Code:
                          webuse idle2, clear
                          tsset time
                          foreach var in usr idle syslcl wio{
                              gen `var'2= (`var'-0.5*`var')^2
                          }
                          
                          *START HERE
                          local varlist1 usr idle syslcl wio, lag(3)
                          local testv1_1 idle
                          local testv1_2 syslcl
                          local testv1_3 wio
                          
                          local varlist2 usr2 idle2 syslcl2 wio2, lag(2)
                          local testv2_1 idle2
                          local testv2_2 syslcl2
                          local testv2_3 wio2
                          
                          forval i=1/2{
                              capture noisily{
                                  eststo m`i': newey `varlist`i''
                                  test `testv`i'_1'
                                  mat t1= r(F)
                                  mat p1= r(p)
                                  test `testv`i'_2'
                                  mat t2= r(F)
                                  mat p2= r(p)
                                  test `testv`i'_3'
                                  mat t3= r(F)
                                  mat p3= r(p)
                                  mat m= t1,t2,t3
                                  mat colnames m= text notext all
                                  mat pval=p1,p2,p3
                                  mat colnames pval= text notext all
                                  estadd matrix m= m
                                  estadd matrix pval= pval
                              }
                          }
                          esttab m* using myfile.rtf, replace cells(m(star pvalue(pval))) ///
                          collab(none) mlab("Real GDP" "Dep_2") nonumb noobs ///
                          coeflab(notext "non-text") note( "Some text here")
                          Res.:

                          Code:
                          . esttab m* using myfile.rtf, replace cells(m(star pvalue(pval))) ///
                          > collab(none) mlab("Real GDP" "Dep_2") nonumb noobs ///
                          > coeflab(notext "non-text") note( "Some text here")
                          
                          --------------------------------------------
                                           Real GDP           Dep_2  
                          --------------------------------------------
                          text             868.5323***      10.3364**
                          non-text         360.9546***     .0788595  
                          all              945.1146***     5.426446*  
                          --------------------------------------------
                          Some text here

                          ADDED IN EDIT: I have added -capture- to the code above in case you have an uneven number of tests across models. This will prevent the code from terminating with an error in case some models do not have some tests. However, you also need to modify the creation of the matrices in this case and define the additional tests.
                          Last edited by Andrew Musau; 16 Jul 2021, 18:29.

                          Comment


                          • #14
                            Thank you very much Andrew Musau !

                            This is really helpful.

                            One another follow-up :

                            the thing is that your code assumes that one 'test' performs Ftest on one variable.

                            However, it's not the case.
                            It's very irregular across dfifferent dependent variables.

                            So basically it is
                            Code:
                            *test1
                            test x1 x2 x3 ..
                            
                            *test2
                            test x4 x5 ...
                            
                            *test3 (all variables)
                            test x1 x2 x3 x4 x5 ....
                            And moreover, the number of variables in each test is also not consistent across different models(dependent variables).
                            In this case, I guess the looping would be very noisy. Am I correct? because we cannot just say
                            test `testv`i'_1' because there is not just one variable. Do you think there is a feasible way? If not, no worries! I can just create codes for each variable as you first described! Thanks, Jinny
                            Last edited by Jinny Koh; 16 Jul 2021, 20:41.

                            Comment


                            • #15
                              The issue is not how many variables are in the list, but the total number of tests. As long as you have 3 tests after each estimation, you can drop -capture- as there will be no errors from an uneven number of tests across models. So you can have something like:

                              Code:
                              local varlist1 y1 x1 x2 x3 x4 x5, lag(3)
                              local testv1_1 x1 x2
                              local testv1_2 x1 x3 x5
                              local testv1_3 x1 x2 x3 x4 x5
                              
                              local varlist2 y2 x6 x7 x8 x9 x10, lag(2)
                              local testv2_1 x7 x8 x9 x10
                              local testv2_2 x6 x7 x8
                              local testv2_3 x6 x7 x8 x9 x10
                              
                              local varlist3 y3 x11 x12 x13 x14 x15, lag(2)
                              local testv3_1 x11 x12
                              local testv3_2 x12 x13 x14
                              local testv3_3 x11 x12 x13 x14 x15
                              
                              forval i=1/3{
                                  eststo m`i': newey `varlist`i''
                                  test `testv`i'_1'
                                  mat t1= r(F)
                                  mat p1= r(p)
                                  test `testv`i'_2'
                                  mat t2= r(F)
                                  mat p2= r(p)
                                  test `testv`i'_3'
                                  mat t3= r(F)
                                  mat p3= r(p)
                                  mat m= t1,t2,t3
                                  mat colnames m= text notext all
                                  mat pval=p1,p2,p3
                                  mat colnames pval= text notext all
                                  estadd matrix m= m
                                  estadd matrix pval= pval
                              }
                              
                              esttab m* using myfile.rtf, replace cells(m(star pvalue(pval))) ///
                              collab(none) mlab("Real GDP" "Dep_2") nonumb noobs ///
                              coeflab(notext "non-text") note( "Some text here")
                              Last edited by Andrew Musau; 17 Jul 2021, 05:33.

                              Comment

                              Working...
                              X