  • ANOVA p-value decimal places

    Hi all.
    I would be very grateful if anyone could tell me how to increase the number of decimal places shown for an ANOVA p-value (Prob > F) in Stata. I currently get 4, i.e. 0.0000. Previous threads have mentioned using return list, but this doesn't seem to work after anova...

    Ed

  • #2
    You could do something like this:

    Code:
    clear
    sysuse auto
    anova mpg rep78
    *ereturn list
    local pModel = Ftail(e(df_m),e(df_r),e(F))
    display "Model p = " `pModel'
    Output: Model p = .00162691

    You may have to tweak the code a bit, depending on how many terms you have in your model and which p-value you want, etc. HTH.
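
    As a rough sketch of that kind of tweak for a model with more than one term: help anova lists per-term results such as e(F_#) and e(df_#), so something along these lines may work. The two-term model and the assumption that terms are numbered in the order typed are illustrative only, not something tested in this thread.

    ```stata
    sysuse auto, clear
    anova mpg rep78 foreign
    * Assumption: term 1 is rep78 and term 2 is foreign (the order typed),
    * and both terms are tested against the residual error.
    display "rep78   p = " %12.10f Ftail(e(df_1), e(df_r), e(F_1))
    display "foreign p = " %12.10f Ftail(e(df_2), e(df_r), e(F_2))
    ```

    Running ereturn list after the anova shows which of these per-term results are actually present.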
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 18.5 (Windows)



    • #3
      The output of help anova shows us that the anova command returns estimation results in e() so the equivalent of return list is ereturn list. You will have to calculate the p-value manually.
      Code:
      . sysuse auto, clear
      (1978 Automobile Data)
      
      . anova price foreign
      
                               Number of obs =         74    R-squared     =  0.0024
                               Root MSE      =    2966.38    Adj R-squared = -0.0115
      
                        Source | Partial SS         df         MS        F    Prob>F
                    -----------+----------------------------------------------------
                         Model |  1507382.7          1   1507382.7      0.17  0.6802
                               |
                       foreign |  1507382.7          1   1507382.7      0.17  0.6802
                               |
                      Residual |  6.336e+08         72   8799416.9  
                    -----------+----------------------------------------------------
                         Total |  6.351e+08         73     8699526  
      
      . display e(df_m)
      1
      
      . display e(df_r)
      72
      
      . display e(F)
      .17130484
      
      . display Ftail(e(df_m),e(df_r),e(F))
      .68018509
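
      If a fixed number of decimal places is wanted, display accepts a format before the expression. A minimal sketch using the same example; the particular formats are just one choice:

      ```stata
      sysuse auto, clear
      quietly anova price foreign
      * %12.10f shows ten decimal places; %10.3e is a scientific-notation alternative
      display %12.10f Ftail(e(df_m), e(df_r), e(F))
      display %10.3e  Ftail(e(df_m), e(df_r), e(F))
      ```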



      • #4
        I just wish to comment on two aspects of this issue, apart from the honest desire of getting a value with ultimate precision.

        With regard to quite small p-values (such as 0.0000), as seems to be the case here, I fear most journals (at least in health sciences) will simply ask you to report p < 0.001.


        Code:
        . use http://www.stata-press.com/data/r15/systolic.dta
        (Systolic Blood Pressure Data)
        
        . anova systolic drug
        
                                 Number of obs =         58    R-squared     =  0.3355
                                 Root MSE      =    10.7211    Adj R-squared =  0.2985
        
                          Source | Partial SS         df         MS        F    Prob>F
                      -----------+----------------------------------------------------
                           Model |  3133.2385          3   1044.4128      9.09  0.0001
                                 |
                            drug |  3133.2385          3   1044.4128      9.09  0.0001
                                 |
                        Residual |  6206.9167         54    114.9429  
                      -----------+----------------------------------------------------
                           Total |  9340.1552         57   163.86237  
        
        . display Ftail(e(df_m),e(df_r),e(F))
        .0000575
        Therefore, taking the example above, in spite of a more precise rendition of the p-value, the information for the journal would be the same. With good reason, I dare say.

        On the other hand, when p-values are "high", so to speak, extra precision probably wouldn't provide extra insight either:

        Code:
        . anova systolic disease
        
                                 Number of obs =         58    R-squared     =  0.0523
                                 Root MSE      =    12.6861    Adj R-squared =  0.0179
        
                          Source | Partial SS         df         MS        F    Prob>F
                      -----------+----------------------------------------------------
                           Model |  488.63938          2   244.31969      1.52  0.2282
                                 |
                         disease |  488.63938          2   244.31969      1.52  0.2282
                                 |
                        Residual |  8851.5158         55   160.93665  
                      -----------+----------------------------------------------------
                           Total |  9340.1552         57   163.86237  
        
        . display Ftail(e(df_m),e(df_r),e(F))
        .22816436
        In short, apart from a couple of fields (genetics being one of them), I'm afraid that the extra effort to provide a very precise p-value would risk giving too much weight to the p-value.
        Best regards,

        Marcos



        • #5
          Despite having provided sample code to accomplish what was desired, I did so somewhat reluctantly and on the whole agree with the analysis in post #4 by Marcos, which was better expressed than I could have done.



          • #6
            Dear all,

            thanks very much for your helpful posts.
            I should say that I'm requesting this because I'm running multiple ANOVAs, so I need to check whether the p-values reach a corrected significance level that runs to 6 decimal places.

            With this in mind, I tried the code suggested in a couple of the above messages, display Ftail(e(df_m),e(df_r),e(F)). However, I don't get a numerical output; instead, I just get a dot on the line below. Any ideas where I'm going wrong?
            Ed



            • #7
              In particular, William, could you please explain how you calculated your p-value manually?
              Many thanks.



              • #8
                Reading help anova we see the following among the much larger list of stored results:
                Code:
                Stored results
                
                    anova stores the following in e():
                
                    Scalars        
                      ...
                      e(df_m)             model degrees of freedom
                      ...
                      e(df_r)             residual degrees of freedom
                      ...
                      e(F)                F statistic
                      ...
                the values of which were displayed in my post #3.

                Reading help Ftail we see
                Code:
                    Ftail(df1,df2,f)
                       Description:  the reverse cumulative (upper tail or survivor) F distribution with df1 numerator and df2
                                     denominator degrees of freedom; 1 if f < 0
                So
                Code:
                Ftail(e(df_m),e(df_r),e(F))
                is the probability that an F-distributed variable with 1 numerator and 72 denominator degrees of freedom equals or exceeds 0.1713, which is the Prob>F of 0.6802 reported by anova in post #3.

                With regard to the question about what you're doing wrong, did you issue the command you cited immediately after doing the anova? Subsequent commands may replace the contents of e(). If it happens again, issue the command
                Code:
                ereturn list
                to see what is in e().
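
                Because later commands can overwrite e(), one option is to copy the p-value into a scalar immediately after each anova. The sketch below does that in a loop, which may also suit the multiple-ANOVA situation described in post #6. The list of outcomes, the grouping variable, and the Bonferroni-style threshold are assumptions for illustration only; substitute your own.

                ```stata
                sysuse auto, clear
                local outcomes price mpg weight        // hypothetical list of outcomes
                local ntests : word count `outcomes'
                scalar alpha = 0.05 / `ntests'         // assumed corrected threshold
                foreach y of local outcomes {
                    quietly anova `y' foreign
                    * save the p-value before any other command replaces e()
                    scalar pval = Ftail(e(df_m), e(df_r), e(F))
                    display "`y'" _col(12) "p = " %12.10f scalar(pval) ///
                        "   below threshold (1 = yes): " (scalar(pval) < scalar(alpha))
                }
                ```

                If the display still shows a dot, at least one of e(df_m), e(df_r), or e(F) is missing at that point, and ereturn list will show which.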

                If that doesn't help, you should review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

                Section 12.1 is particularly pertinent:

                12.1 What to say about your commands and your problem

                Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!
                Read about using CODE blocks to copy commands and output from your Results window and paste them into a CODE block in a Statalist post, as I did in post #3. It will be important to see your anova, ereturn, and display commands and their output.
                Last edited by William Lisowski; 02 Jul 2017, 10:24.



                • #9
                  Thanks! Really helpful.
