Simple Ttest question about Stata wording

Sarah Quartz

Join Date: Apr 2018

Posts: 3
#1

14 Apr 2018, 06:59

What does this mean in Stata: Pr(T > t) = 1.0000 while Pr(T < t) = 0.0000 and Pr(|T| > |t|) = 0.0000?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#2

14 Apr 2018, 07:04

Sarah:
the first two results refer to one-tailed t tests (right and left tail, respectively); the third result refers to a two-tailed t test.

Kind regards,
Carlo
(Stata 19.0)
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

14 Apr 2018, 07:11

Welcome to Statalist.

Suppose T is a random variable with a t distribution with some number of degrees of freedom, and the observed t is drawn from that distribution. Then suppose we observe t=-5.

Code:

Pr(T>t) = Pr(T>-5) = 1
Pr(T<t) = Pr(T<-5) = 0
Pr(|T|>|t|) = Pr(|T|>5) = Pr(T<-5)+Pr(T>+5) = 0+0
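
A minimal Stata sketch of the same arithmetic, assuming 98 degrees of freedom (an arbitrary choice for illustration; the scalar names are made up as well). It relies on ttail(df, t), which returns the upper-tail probability Pr(T > t), and on the symmetry of the t distribution for the lower tail.

```stata
* Assumed for illustration only: 98 degrees of freedom, observed t = -5
scalar dof   = 98
scalar t_obs = -5

scalar p_upper = ttail(dof, t_obs)          // Pr(T > t)
scalar p_lower = ttail(dof, -t_obs)         // Pr(T < t), by symmetry
scalar p_two   = 2*ttail(dof, abs(t_obs))   // Pr(|T| > |t|)

* Rounded to 4 decimal places, as in the ttest output
display %6.4f p_upper "  " %6.4f p_lower "  " %6.4f p_two

* The same small probabilities without the rounding
display %12.4e p_lower "  " %12.4e p_two
```

The lower-tail and two-sided probabilities are tiny but not exactly zero; only the four-decimal display makes them appear as 0.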

Added in edit: crossed with Carlo's answer, which probably better understood what your question meant.

In the future, it would be helpful to show, in your question, the Stata output to which you refer. Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

Nick Cox

Join Date: Mar 2014
Posts: 35449
#4

14 Apr 2018, 07:53

If one P-value is (reported as) 1.0000 and the other as 0.0000, then:

(*) You are just testing a massive difference. Perhaps you should check whether your test was needed (are you comparing mice and elephants?), or whether there was some other silly error.

(**) Stata is nevertheless just rounding to 4 dp.

Here I set up two groups whose distributions are in practice (although not in principle) disjoint:

Code:

. clear

. set obs 100
number of observations (_N) was 0, now 100

. set seed 2803

. gen group = _n > 50

. gen y = cond(group == 1, rnormal(100, 1), rnormal(200, 1))

.
. ttest y, by(group)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |      50    199.7888    .1167141    .8252933    199.5542    200.0233
       1 |      50    100.1669    .1508038    1.066344    99.86383    100.4699
---------+--------------------------------------------------------------------
combined |     100    149.9778    5.007086    50.07086    140.0427     159.913
---------+--------------------------------------------------------------------
    diff |            99.62188    .1906934                99.24345    100.0003
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t = 522.4192
Ho: diff = 0                                     degrees of freedom =       98

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

. ret li

scalars:
              r(level) =  95
                 r(sd) =  50.07086364561855
               r(sd_2) =  1.066343638889303
               r(sd_1) =  .8252933085635524
                 r(se) =  .1906933560121829
                r(p_u) =  6.3215874520e-171
                r(p_l) =  1
                  r(p) =  1.2643174904e-170
                  r(t) =  522.4192300351496
               r(df_t) =  98
               r(mu_2) =  100.1668844604492
                r(N_2) =  50
               r(mu_1) =  199.7887606811524
                r(N_1) =  50
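
The rounding can also be sidestepped by hand. A short sketch, using the auto dataset shipped with Stata rather than the simulated data above (a substitution made only to keep the example self-contained; the scalar names are arbitrary): it displays r(p) in exponential format and recomputes it from r(t) and r(df_t) with ttail().

```stata
sysuse auto, clear
ttest mpg, by(foreign)

* Copy the stored results before another command can overwrite r()
scalar p_two = r(p)
scalar t_obs = r(t)
scalar dof   = r(df_t)

* Two-sided p-value shown with more digits than the 4 dp in the table
display %12.4e scalar(p_two)

* The same value recomputed from the t statistic and its degrees of freedom
display %12.4e 2*ttail(scalar(dof), abs(scalar(t_obs)))
```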


Sarah Quartz

Join Date: Apr 2018
Posts: 3
#5

14 Apr 2018, 08:43

Thank you for the example and your answer! So if my test gives me the result of Pr(T > t) = 1.0000, with a t value of -4.8 and a means difference of -0.08, do I reject the null hypothesis or not?


Nick Cox

Join Date: Mar 2014

Posts: 35449
#6

14 Apr 2018, 08:46

You typically should focus on

Pr(|T| > |t|) = 0.0000

and yes, that means rejecting the null hypothesis of equal means. This is, or should be, covered in most introductory statistics texts.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#7

14 Apr 2018, 10:01

Let me expand on Nick's answer.

There are three common tests of hypothesis about the difference in means. All three have the same null hypothesis: the difference is zero. The three alternatives are that the difference is less than zero, the difference is not equal to zero, and the difference is greater than zero.

Before doing a t-test you should decide what alternative is appropriate in your situation. That often is that the difference is not zero, but that need not be the case. For each alternative, the p-value is calculated using a different formula.

Rather than require you to tell it what alternative you have chosen, the ttest command reports the p-values for all three alternatives. For the alternative that the difference is not zero (Ha: diff != 0), the p-value is calculated as Pr(|T| > |t|). Look at the ttest output in post #4 to see all three alternatives, the formulas used to calculate the p-values corresponding to those alternatives, and the calculated p-values themselves.

So your question

if my test gives me the result of Pr(T > t) = 1.0000, with a t value of -4.8 and a means difference of -0.08, do I reject the null hypothesis or not

cannot be answered without more information. It is incompletely stated because (a) you do not tell us what alternative hypothesis you want for your t-test and (b) you do not tell us the level of significance that you want for the test.

The formula you have given is the one for the alternative that the difference between the means is greater than zero. If you have reason to believe that the difference between the means can only be positive, this would be appropriate. In this case, the level of significance does not matter, because the calculated p-value is 1, which is not less than, say, .05 corresponding to a 5% level of significance, nor less than any commonly-used level of significance. So you would never reject the null hypothesis in favor of the chosen alternative. This should not surprise you: you are testing whether the difference in means is positive, yet the observed difference in the averages was negative, which does not support the idea that the difference of the means is positive.

But if the difference between the means could go either way, you should be looking at Ha: diff != 0 and Pr(|T| > |t|) and the calculated p-value, comparing the calculated p-value to the value corresponding to your desired level of significance. This is the most commonly used alternative hypothesis, which was the thrust of Nick's response in post #6.
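
As a rough sketch of that decision rule in Stata: pick the alternative and the significance level first, then compare the matching stored p-value with that level. The auto dataset, the 5% level, and the local and scalar names are all assumptions for illustration; r(p_l), r(p), and r(p_u) are the stored results that ttest leaves behind.

```stata
sysuse auto, clear
ttest mpg, by(foreign)

local alpha = 0.05        // assumed significance level
local alt   "two"         // "lower", "two", or "upper"; decide before testing

* Pick the p-value that matches the chosen alternative
if "`alt'" == "lower" {
    scalar p_use = r(p_l)     // Ha: diff < 0, Pr(T < t)
}
else if "`alt'" == "two" {
    scalar p_use = r(p)       // Ha: diff != 0, Pr(|T| > |t|)
}
else {
    scalar p_use = r(p_u)     // Ha: diff > 0, Pr(T > t)
}

* Compare the chosen p-value with the significance level
if scalar(p_use) < `alpha' {
    display "Reject Ho at the `alpha' level (p = " %6.4f scalar(p_use) ")"
}
else {
    display "Do not reject Ho at the `alpha' level (p = " %6.4f scalar(p_use) ")"
}
```

Setting alt to "upper" when the observed difference is negative yields a p-value close to 1, so Ho is not rejected in favor of that alternative, which is the situation described above.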

As Nick suggests, understanding the results of a t-test is a matter of introductory statistics more than of Stata.

Last edited by William Lisowski; 14 Apr 2018, 10:12.
