
  • Independent Group T Test when more than two groups are there

    Hi All,

    I want to compare the mean of a variable across more than two groups. Take the following example:

    Code:
     
     use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear
    I can get a two-sample t test with equal variances as follows:

    Code:
     
     ttest write, by(female)
    I get the following output:

    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
        male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
      female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
    ---------+--------------------------------------------------------------------
    combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
    ---------+--------------------------------------------------------------------
        diff |           -4.869947    1.304191               -7.441835   -2.298059
    ------------------------------------------------------------------------------
        diff = mean(male) - mean(female)                              t =  -3.7341
    Ho: diff = 0                                     degrees of freedom =      198

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.0001         Pr(|T| > |t|) = 0.0002          Pr(T > t) = 0.9999

    But the same dataset also has a grouping variable ses (socioeconomic status).

    Of course, if I write the following command

    Code:
     
     ttest write, by(ses)
    it won't work, because ses has more than two categories (low, middle, high), and Stata returns this message:

    . ttest write, by(ses)
    more than 2 groups found, only 2 allowed

    .

    What I want to know is the name of the test or command that produces output like the above for more than two groups, i.e. a t test for ses in the given example.

    Regards and Stay Blessed

    Muhammad Mubeen

  • #2
    Analysis of variance (which you can also approach via regression)
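
    A minimal sketch of the two routes mentioned above, using the hsb2 example from the first post. It assumes Stata 11 or later for the factor-variable notation i.ses. The joint F test on the ses indicators should reproduce the F statistic from oneway, since both fit the same one-way model.

    ```stata
    * Load the example data used in this thread
    use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear

    * Route 1: one-way ANOVA, with group means and standard deviations tabulated
    oneway write ses, tabulate

    * Route 2: the same model as a regression on indicator variables for ses
    regress write i.ses

    * Joint test that all ses coefficients are zero (same null as the ANOVA F test)
    testparm i.ses
    ```

    In the regression output, each ses coefficient is the difference between that group's mean and the mean of the base category, which is already close to the "diff" row that ttest reports.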



    • #3
      I tried ANOVA with the following command:

      oneway write ses

                              Analysis of Variance
          Source              SS         df      MS            F     Prob > F
      ------------------------------------------------------------------------
      Between groups      858.715441      2    429.35772      4.97     0.0078
       Within groups      17020.1596    197    86.396749
      ------------------------------------------------------------------------
          Total            17878.875    199    89.843593


      but the output is not what I need. I want output in the same format that ttest gives for a two-group comparison of means.

      See the following example:

      . ttest write, by(female)

      Two-sample t test with equal variances
      ------------------------------------------------------------------------------
         Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
      ---------+--------------------------------------------------------------------
          male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
        female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
      ---------+--------------------------------------------------------------------
      combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
      ---------+--------------------------------------------------------------------
          diff |           -4.869947    1.304191               -7.441835   -2.298059
      ------------------------------------------------------------------------------
          diff = mean(male) - mean(female)                              t =  -3.7341
      Ho: diff = 0                                     degrees of freedom =      198

          Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
       Pr(T < t) = 0.0001         Pr(|T| > |t|) = 0.0002          Pr(T > t) = 0.9999

      But when I apply the following:

      . ttest write, by(ses)
      more than 2 groups found, only 2 allowed
      r(420);

      I have tried several ANOVA commands, but the output of each is completely different from this.



      • #4
        As Nick suggests, you can perform an ANOVA test in this case. You can use the oneway command with the bonferroni option, which will give you a comparison matrix of each category on the grouping variable:
        Code:
        use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear
        oneway write ses, bonferroni
        The output looks like this:
        Code:
        . oneway write ses, bonferroni
        
                                Analysis of Variance
            Source              SS         df      MS            F     Prob > F
        ------------------------------------------------------------------------
        Between groups      858.715441      2    429.35772      4.97     0.0078
         Within groups      17020.1596    197    86.396749
        ------------------------------------------------------------------------
            Total            17878.875    199    89.843593
        
        Bartlett's test for equal variances:  chi2(2) =   0.1462  Prob>chi2 = 0.930
        
                              Comparison of writing score by ses
                                        (Bonferroni)
        Row Mean-|
        Col Mean |        low     middle
        ---------+----------------------
          middle |    1.30929
                 |      1.000
                 |
            high |    5.29677    3.98748
                 |      0.012      0.032
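        The p-value of 0.0078 for the F test means that the null hypothesis of equal mean writing scores across the three ses groups is rejected at the 1% level: at least one group mean differs from the others. The Bonferroni matrix in the output shows which pairs differ. Each cell gives the difference in means (row mean minus column mean) with its Bonferroni-adjusted p-value underneath.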
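
        If the aim is ttest-like output (difference, standard error, t, p-value and confidence interval) for every pair of groups, one option worth trying is -pwmean-. This is a sketch, assuming Stata 12 or later. Like oneway, it uses the pooled within-group variance from all three groups, so the standard errors will not exactly match those from separate two-sample t tests.

        ```stata
        * Load the example data used in this thread
        use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear

        * Per-group N, mean, SD and standard error of the mean, for reference
        tabstat write, by(ses) statistics(n mean sd semean)

        * All pairwise differences in mean write across ses,
        * with Bonferroni-adjusted p-values and confidence intervals
        pwmean write, over(ses) mcompare(bonferroni) effects
        ```

        The contrasts reported should agree with the differences in the oneway comparison matrix (1.30929, 5.29677, 3.98748), but with a standard error and interval attached to each.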



        • #5
          Muhammad:
          a more time-consuming way to get what you are after is to perform a series of -ttest- commands, dividing the arbitrary threshold of p<0.05 by the number of comparisons you are going to make (3 in your case) beforehand. Hence, in order to reject the null with the overall probability of Type I error set at 0.05, the resulting p-value should be less than (0.05/3)=.0166667:
          Code:
          . use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear
          . ttest write if ses!=1, by(ses) unequal
          
          Two-sample t test with unequal variances
          ------------------------------------------------------------------------------
             Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
          ---------+--------------------------------------------------------------------
            middle |      95    51.92632    .9342604    9.106044    50.07132    53.78131
              high |      58    55.91379     1.23991    9.442874    53.43092    58.39667
          ---------+--------------------------------------------------------------------
          combined |     153    53.43791    .7604806    9.406626    51.93543    54.94039
          ---------+--------------------------------------------------------------------
              diff |           -3.987477    1.552488               -7.062047   -.9129079
          ------------------------------------------------------------------------------
              diff = mean(middle) - mean(high)                              t =  -2.5684
          Ho: diff = 0                     Satterthwaite's degrees of freedom =   117.19
          
              Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
           Pr(T < t) = 0.0057         Pr(|T| > |t|) = 0.0115          Pr(T > t) = 0.9943
          
          . ttest write if ses!=2, by(ses) unequal
          
          Two-sample t test with unequal variances
          ------------------------------------------------------------------------------
             Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
          ---------+--------------------------------------------------------------------
               low |      47    50.61702    1.384316    9.490391    47.83054     53.4035
              high |      58    55.91379     1.23991    9.442874    53.43092    58.39667
          ---------+--------------------------------------------------------------------
          combined |     105    53.54286     .954748    9.783255    51.64956    55.43616
          ---------+--------------------------------------------------------------------
              diff |           -5.296772    1.858415               -8.984579   -1.608965
          ------------------------------------------------------------------------------
              diff = mean(low) - mean(high)                                 t =  -2.8502
          Ho: diff = 0                     Satterthwaite's degrees of freedom =  98.3367
          
              Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
           Pr(T < t) = 0.0027         Pr(|T| > |t|) = 0.0053          Pr(T > t) = 0.9973
          
          . ttest write if ses!=3, by(ses) unequal
          
          Two-sample t test with unequal variances
          ------------------------------------------------------------------------------
             Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
          ---------+--------------------------------------------------------------------
               low |      47    50.61702    1.384316    9.490391    47.83054     53.4035
            middle |      95    51.92632    .9342604    9.106044    50.07132    53.78131
          ---------+--------------------------------------------------------------------
          combined |     142    51.49296    .7738965    9.222041    49.96302     53.0229
          ---------+--------------------------------------------------------------------
              diff |           -1.309295    1.670082               -4.627988    2.009399
          ------------------------------------------------------------------------------
              diff = mean(low) - mean(middle)                               t =  -0.7840
          Ho: diff = 0                     Satterthwaite's degrees of freedom =  88.4657
          
              Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
           Pr(T < t) = 0.2176         Pr(|T| > |t|) = 0.4352          Pr(T > t) = 0.7824
          
          . oneway write ses, bonf
          
                                  Analysis of Variance
              Source              SS         df      MS            F     Prob > F
          ------------------------------------------------------------------------
          Between groups      858.715441      2    429.35772      4.97     0.0078
           Within groups      17020.1596    197    86.396749
          ------------------------------------------------------------------------
              Total            17878.875    199    89.843593
          
          Bartlett's test for equal variances:  chi2(2) =   0.1462  Prob>chi2 = 0.930
          
                                Comparison of writing score by ses
                                          (Bonferroni)
          Row Mean-|
          Col Mean |        low     middle
          ---------+----------------------
            middle |    1.30929
                   |      1.000
                   |
              high |    5.29677    3.98748
                   |      0.012      0.032
          That said, I would strongly support previous comments in favour of -oneway- and, even more, in favour of -regression-.
          Last edited by Carlo Lazzaro; 07 Nov 2016, 08:08.
          Kind regards,
          Carlo
          (StataNow 18.5)
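
          The three comparisons above can also be run in a loop, with each two-sided p-value checked against the Bonferroni threshold 0.05/3. This is only a sketch: it assumes ses is coded 1 = low, 2 = middle, 3 = high (consistent with the output above), and the local macro names alpha, pair, a and b are made up for illustration.

          ```stata
          * Load the example data used in this thread
          use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear

          * Bonferroni threshold for three pairwise comparisons
          local alpha = 0.05/3
          display "Bonferroni threshold = " %9.7f `alpha'

          * Welch (unequal-variance) t test for each pair of ses categories
          foreach pair in "1 2" "1 3" "2 3" {
              local a : word 1 of `pair'
              local b : word 2 of `pair'
              quietly ttest write if inlist(ses, `a', `b'), by(ses) unequal
              display "ses `a' vs `b': t = " %7.4f r(t) "   two-sided p = " %6.4f r(p)
              display "   below 0.05/3? " cond(r(p) < `alpha', "yes", "no")
          }
          ```

          Dropping -quietly- prints the full ttest table for each pair, which is the format asked for in the original question.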



          • #6
            Note that in this instance, and many others, anova can't tell the whole story as it takes no account of the fact that ses is ordinal.

            For those curious, a graph tells more than some of the inferential stuff:

            Code:
            use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear
            capture ssc install stripplot 
            stripplot write, over(ses) cumul cumprob box centre vertical refline yla(, ang(h)) xla(, noticks) xsc(titlegap(*5))
            The boxes show medians and quartiles as customary. The added lines are the means.

            [Figure ses_write.png: strip plot of write by ses, with boxes and mean reference lines]



            • #7
              Thanks Nick: very enlightening.
              Kind regards,
              Carlo
              (StataNow 18.5)



              • #8
                Thank you All,

                Carlo's approach in particular gives me the output I need.

                However, I have some more small queries.

                Code:
                oneway write ses, bonferroni

                The output of this command states that it assumes equal variances among the groups. Which test should we use if the variances are unequal across groups?

                Moreover, this is just an example. In my actual research, where I have to apply the same technique, I created the ordinal groups myself by categorizing a variable that is actually continuous, ranging from 0 to 1000.
                To do the same in the given example, the following code will give you an idea:



                Code:
                generate groups = recode(id, 25, 60, 100, 150, 200)


                Now my hypothesis is that mean(group 25) < mean(group 60) < mean(group 100) < mean(group 150) < mean(group 200).






                I am ready to write the following time-consuming code:

                Code:
                ttest write if groups!=100 | groups!=150 | groups !=200, by(groups) unequal  // for diff of mean between 25 and 60 group
                ttest write if groups!=25 | groups!=150 | groups !=200, by(groups) unequal  // for diff of mean between 60 and 100 group
                ttest write if groups!=25 | groups!=60 | groups !=200, by(groups) unequal  // for diff of mean between 100 and 150 group
                ttest write if groups!=25 | groups!=60 | groups !=100, by(groups) unequal  // for diff of mean between 150 and 200 group
                The above code does not work either: Stata reports that there are more than two groups (I know I am making some mistake in the if qualifier).
                See the following:

                ttest write if groups!=100 | groups!=150 | groups !=200, by(groups) unequal
                more than 2 groups found, only 2 allowed


                But I still feel I may be missing something: a test for my hypothesis above, i.e. mean(group1) < mean(group2) < mean(group3) < mean(group4) < mean(group5), when the variances are unequal. Isn't there a specific test for this type of hypothesis that gives output similar to the two-group ttest above?


                In my actual research:
                Code:
                generate group = recode(varcont, 0, 10, 25, 100, 500, 1000)
                * varcont is my continuous variable, which I am converting to ordinal to develop my hypothesis.



                Regards and Stay Blessed
                Muhammad Mubeen



                • #9
                  Consider a Jonckheere-Terpstra test. It's not for means, but it may help.
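
                  A hedged sketch of how this could be run on the thread's example. It assumes Stata 17 or later, where (as far as I know) the official -nptrend- command takes a group() option and a jterpstra option; in older releases -nptrend- has a different syntax and offers only the Cuzick test, so check -help nptrend- in your version first.

                  ```stata
                  * Load the example data used in this thread
                  use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear

                  * Jonckheere-Terpstra test for an ordered trend in write across ses
                  * ASSUMPTION: Stata 17+ syntax; verify with -help nptrend-
                  nptrend write, group(ses) jterpstra
                  ```

                  The test is rank-based, so it addresses an ordered alternative across the groups rather than an ordering of the means specifically.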



                  • #10
                    Muhammad:
                    - -anova- is, in general, quite robust to departures from the equal-variance prerequisite;
                    - I find your query a bit too sparse. Anyway, if your depvar is continuous and you want to stick with -ttest-, you may want to -label- the -groups- variable first, and then try, for each of the planned comparisons, what follows (changing the group values when necessary):
                    Code:
                    ttest write if groups==25 | groups==60, by(groups) unequal
                    However, as Nick pointed out, if Group has an ordinal flavour, something in your result will remain unsaid.

                    PS: crossed in the cyber-space with Nick's reply, who tackled the issue from a very different point.
                    Kind regards,
                    Carlo
                    (StataNow 18.5)
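
                    On the unequal-variance question, a possible sketch using the thread's example: -robvar- reports Levene-type tests of equal variances across groups, and a regression with heteroskedasticity-robust standard errors followed by -pwcompare- gives pairwise differences that do not rely on a common variance. This swaps in a robust-regression approach that nobody in the thread proposed explicitly; it assumes Stata 12 or later for -pwcompare-.

                    ```stata
                    * Load the example data used in this thread
                    use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear

                    * Levene and Brown-Forsythe tests of equal variances across ses
                    robvar write, by(ses)

                    * One-way model with heteroskedasticity-robust standard errors
                    regress write i.ses, vce(robust)

                    * Pairwise differences with Bonferroni-adjusted tests and intervals
                    pwcompare ses, mcompare(bonferroni) effects
                    ```

                    With these data Bartlett's test (p = 0.930) gives no sign of unequal variances, so the robust and pooled results should be close.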



                    • #11
                      Thank You Again,

                      After comparing your code with what I had written, I found my mistake, and the following code worked:

                      Code:
                      ttest write if groups==25 | groups==60, by(groups) unequal // Difference between group of 25 and group of 60
                      ttest write if groups==60 | groups==100, by(groups) unequal // Difference between group of 60 and group of 100
                      and so on


                      The Jonckheere-Terpstra test was also helpful, at least as something to consider in the analysis.

                      Regards and Stay Blessed
                      Mubeen
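
                      For later readers, a sketch of why the earlier != version failed and how the adjacent-group comparisons can be looped. A condition such as groups!=100 | groups!=150 | groups!=200 is true for every observation, because no value can equal all three numbers at once, so nothing is excluded and ttest still sees five groups. The local macro names cuts, lo and hi are made up for illustration.

                      ```stata
                      * Load the example data and build the five groups as in post #8
                      use "http://www.ats.ucla.edu/stat/stata/notes/hsb2", clear
                      generate groups = recode(id, 25, 60, 100, 150, 200)

                      * The != condition joined by | keeps all 200 observations
                      count if groups!=100 | groups!=150 | groups!=200

                      * Welch t test for each pair of adjacent groups
                      local cuts 25 60 100 150 200
                      forvalues i = 1/4 {
                          local lo : word `i' of `cuts'
                          local hi : word `=`i'+1' of `cuts'
                          display _newline "groups `lo' vs `hi'"
                          ttest write if inlist(groups, `lo', `hi'), by(groups) unequal
                      }
                      ```

                      For the ordered hypothesis mean(lo) < mean(hi), the relevant one-sided p-value in each table is the one under Ha: diff < 0, and with four comparisons a Bonferroni threshold of 0.05/4 would apply.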





                      • #12
                        I have auditory perceptual assessment data for three groups. Can someone help me with how to analyze it? Thank you.



                        • #13
                          Said:
                          please start a new thread following the FAQ recommendations on how to post effectively. Thanks.
                          Kind regards,
                          Carlo
                          (StataNow 18.5)
