Mean differences test

Prathvajeeth Rajmohan

Join Date: Aug 2017
Posts: 70
#1

06 Sep 2017, 14:21

Hi there, I am trying to perform a means difference test that separates the two samples by a dividend dummy, where 0 = pays no dividends and 1 = pays dividends:

Code:

. ttest mtb if inlist(year,2015,2016), by(div)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     232    2.577726    .4284058    6.525282    1.733644    3.421808
       1 |     467    1.892845    .0599667    1.295892    1.775007    2.010684
---------+--------------------------------------------------------------------
combined |     699    2.120159    .1480283    3.913662    1.829525    2.410793
---------+--------------------------------------------------------------------
    diff |             .684881     .3135083                .0693467    1.300415
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =   2.1846
Ho: diff = 0                                     degrees of freedom =      697

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9854         Pr(|T| > |t|) = 0.0293          Pr(T > t) = 0.0146

I was wondering whether Stata has an option to run this test (both the equal-variance and the unequal-variance versions) with the difference computed as mean(1) - mean(0), rather than mean(0) - mean(1) as it is now.

Thanks in advance.

Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

06 Sep 2017, 14:24

Well, the results are exactly the same either way, except for changing the sign of diff and the t-statistic.

If there is some reason why it is important to have the output displayed with the roles of 0 and 1 interchanged, you could get that by recoding your grouping variable and re-doing the t-test with that.

Prathvajeeth Rajmohan

Join Date: Aug 2017

Posts: 70
#3

07 Sep 2017, 06:57

Originally posted by Clyde Schechter

Well, the results are exactly the same either way, except for changing the sign of diff and the t-statistic.

If there is some reason why it is important to have the output displayed with the roles of 0 and 1 interchanged, you could get that by recoding your grouping variable and re-doing the t-test with that.

Brilliant, thanks so much. I figured it was that but wasn't sure whether to ask. So all that would change is:

difference = -.684881
t = -2.1846

Most importantly: will the p-values remain the same, though? Thanks again!

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17673
#4

07 Sep 2017, 07:32

Prathvajeeth:
yes, as you can see from the following example, which elaborates a bit on Clyde's helpful advice:

Code:

. use "C:\Program Files (x86)\Stata14\ado\base\a\auto.dta", clear
(1978 Automobile Data)

. gen foreign_2=0 if foreign==1
(52 missing values generated)

. replace foreign_2=1 if foreign==0
(52 real changes made)

. tab foreign

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

. tab foreign_2

  foreign_2 |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         22       29.73       29.73
          1 |         52       70.27      100.00
------------+-----------------------------------
      Total |         74      100.00

. label define foreign_2 0 "Foreign" 1 "Domestic"

. label val foreign_2 foreign_2

. ttest price, by(foreign) unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
Domestic |      52    6072.423    429.4911    3097.104    5210.184    6934.662
 Foreign |      22    6384.682    558.9942    2621.915     5222.19    7547.174
---------+--------------------------------------------------------------------
combined |      74    6165.257    342.8719    2949.496    5481.914      6848.6
---------+--------------------------------------------------------------------
    diff |           -312.2587    704.9376               -1730.856    1106.339
------------------------------------------------------------------------------
    diff = mean(Domestic) - mean(Foreign)                         t =  -0.4430
Ho: diff = 0                     Satterthwaite's degrees of freedom =  46.4471

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.3299         Pr(|T| > |t|) = 0.6599          Pr(T > t) = 0.6701

. ttest price, by(foreign_2) unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
 Foreign |      22    6384.682    558.9942    2621.915     5222.19    7547.174
Domestic |      52    6072.423    429.4911    3097.104    5210.184    6934.662
---------+--------------------------------------------------------------------
combined |      74    6165.257    342.8719    2949.496    5481.914      6848.6
---------+--------------------------------------------------------------------
    diff |            312.2587    704.9376               -1106.339    1730.856
------------------------------------------------------------------------------
    diff = mean(Foreign) - mean(Domestic)                         t =   0.4430
Ho: diff = 0                     Satterthwaite's degrees of freedom =  46.4471

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.6701         Pr(|T| > |t|) = 0.6599          Pr(T > t) = 0.3299

.
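A shorter route to the same check, sketched under assumptions: the recoding can be done in one line, and the results that -ttest- leaves in r() show directly what changes when the groups are swapped. The variable name domestic and the scalar names ending in _a are made up for this illustration; r(t), r(p), r(p_l) and r(p_u) are stored results of -ttest-.

```stata
* Sketch only: domestic and the *_a scalars are hypothetical names
sysuse auto, clear

* Flip the 0/1 coding: 1 = Domestic, 0 = Foreign
generate byte domestic = 1 - foreign

* Original coding: diff = mean(Domestic) - mean(Foreign)
quietly ttest price, by(foreign) unequal
scalar t_a  = r(t)
scalar p_a  = r(p)
scalar pl_a = r(p_l)
scalar pu_a = r(p_u)

* Flipped coding: diff = mean(Foreign) - mean(Domestic)
quietly ttest price, by(domestic) unequal
display "t:           " %8.4f t_a  "  ->  " %8.4f r(t)
display "two-sided p: " %8.4f p_a  "  ->  " %8.4f r(p)
display "Pr(T < t):   " %8.4f pl_a "  ->  " %8.4f r(p_l)
display "Pr(T > t):   " %8.4f pu_a "  ->  " %8.4f r(p_u)
```

Swapping the groups flips the sign of the difference and of t, leaves the two-sided p-value alone, and exchanges the two one-sided p-values, which matches the two listings above.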

Kind regards,
Carlo
(Stata 19.0)

Prathvajeeth Rajmohan

Join Date: Aug 2017
Posts: 70
#5

16 Sep 2017, 15:58

Originally posted by Carlo Lazzaro

Prathvajeeth:
yes, as you can see from the following example, which elaborates a bit on Clyde's helpful advice:

Hi Carlo, thanks so much. There are two versions of the test (equal and unequal variances), and I was wondering which is the right one to use for testing a difference in means.

I know there is a command called sdtest, but I was wondering whether it is accurate. Thanks.

For example here:

Code:

. sdtest tot_ass if inlist(year,2015,2016), by(FXDerivatives10)

Variance ratio test
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     247    1162.454    129.4991    2035.238    907.3859    1417.523
       1 |     481    8059.183    1179.877    25876.72    5740.821    10377.55
---------+--------------------------------------------------------------------
combined |     728    5719.222    789.8602    21311.59    4168.542    7269.901
------------------------------------------------------------------------------
    ratio = sd(0) / sd(1)                                         f =   0.0062
Ho: ratio = 1                                    degrees of freedom = 246, 480

    Ha: ratio < 1               Ha: ratio != 1                 Ha: ratio > 1
  Pr(F < f) = 0.0000         2*Pr(F < f) = 0.0000           Pr(F > f) = 1.0000

Am I right in saying that I reject the null, as the middle p-value is 0.0000, so the variances of the two groups are not the same?

Code:

           Ha: ratio != 1
      2*Pr(F < f) = 0.0000

Does the 2* before the p-value make the interpretation any different, or is it still the same? Thanks.
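On the 2* label, a small sketch may help, run on auto.dta rather than on the data in this post. Going by the label in the output, the middle p-value is the two-sided one, formed by doubling the smaller of the two tail probabilities. The check below recomputes the lower tail from the F distribution with F() and prints the doubled smaller tail next to r(p). The scalar names are hypothetical; r(F), r(df_1), r(df_2), r(p_l), r(p_u) and r(p) are stored results of -sdtest-.

```stata
* Sketch only: auto.dta stands in for the poster's data
sysuse auto, clear
quietly sdtest price, by(foreign)

* Keep the stored results in scalars (hypothetical names)
scalar f_obs  = r(F)
scalar df_num = r(df_1)
scalar df_den = r(df_2)
scalar p_lo   = r(p_l)
scalar p_hi   = r(p_u)
scalar p_mid  = r(p)

* Lower-tail probability recomputed from the F distribution
scalar p_chk = F(df_num, df_den, f_obs)

* Doubled smaller tail, to set beside the reported two-sided p-value
scalar p_dbl = 2*min(p_lo, p_hi)

display "f = " %6.4f f_obs "  df = " df_num ", " df_den
display "Pr(F < f) stored / recomputed: " %6.4f p_lo "  " %6.4f p_chk
display "Pr(F > f):                     " %6.4f p_hi
display "doubled smaller tail / r(p):   " %6.4f p_dbl "  " %6.4f p_mid
```

In the output posted above, f = 0.0062 is far below 1, so the smaller tail is Pr(F < f), which is why the middle column is labelled 2*Pr(F < f). The reading is the usual two-sided one: with a p-value of 0.0000, equality of the two variances is rejected.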

Last edited by Prathvajeeth Rajmohan; 16 Sep 2017, 16:03.

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

16 Sep 2017, 16:28

You may wish to read this thread: https://www.statalist.org/forums/for...sdtest-results

Best regards,

Marcos

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#7

17 Sep 2017, 00:37

I would go -unequal-, unless I had sound information that the populations from which the two samples were drawn have equal variances, which, as per the -sdtest- outcome, is not your case.
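To see what the choice changes in practice, here is a minimal sketch on auto.dta (not the poster's data) that runs both versions and prints the standard error of the difference, the degrees of freedom and the two-sided p-value from the stored results r(se), r(df_t) and r(p). The scalar names are hypothetical.

```stata
* Sketch only: compare pooled and unequal-variance t tests on auto.dta
sysuse auto, clear

* Equal-variance (pooled) version
quietly ttest price, by(foreign)
scalar se_eq = r(se)
scalar df_eq = r(df_t)
scalar p_eq  = r(p)

* Unequal-variance version with Satterthwaite's degrees of freedom
quietly ttest price, by(foreign) unequal
scalar se_un = r(se)
scalar df_un = r(df_t)
scalar p_un  = r(p)

display "equal:   se = " %9.4f se_eq "  df = " %8.4f df_eq "  p = " %6.4f p_eq
display "unequal: se = " %9.4f se_un "  df = " %8.4f df_un "  p = " %6.4f p_un
```

With -unequal-, the standard error no longer pools the two variances and the degrees of freedom come from Satterthwaite's approximation (46.4471 here, as in the listing earlier in the thread, against 72 for the pooled test).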

Kind regards,
Carlo
(Stata 19.0)

Prathvajeeth Rajmohan

Join Date: Aug 2017

Posts: 70
#8

17 Sep 2017, 06:11

Originally posted by Carlo Lazzaro

I would go -unequal-, unless I had sound information that the populations from which the two samples were drawn have equal variances, which, as per the -sdtest- outcome, is not your case.

Thanks so much. By sound information, do you mean something besides the sdtest, i.e., should I not put too much weight on the sdtest and just go with -ttest- unequal for all?

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#9

17 Sep 2017, 06:14

Prathvajeeth:
that's exactly what I meant.
I consider different variances as the rule (i.e., I go -unequal-), unless otherwise informed about the populations from which the samples were drawn.

Kind regards,
Carlo
(Stata 19.0)

Prathvajeeth Rajmohan

Join Date: Aug 2017

Posts: 70
#10

17 Sep 2017, 13:30

Originally posted by Carlo Lazzaro

Prathvajeeth:
that's exactly what I meant.
I consider different variances as the rule (i.e., I go -unequal-), unless otherwise informed about the populations from which the samples were drawn.

Hi, the two samples were drawn from the same population, i.e., I have one total sample/dataset and I just use Stata to split the different variables (assets, market-to-book, etc.) into two groups: one that uses derivatives and one that doesn't.

Does this change anything, or do I still use the unequal version?

Thanks

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#11

18 Sep 2017, 00:03

Prathvajeeth:
I would still go -unequal-.

Kind regards,
Carlo
(Stata 19.0)

Prathvajeeth Rajmohan

Join Date: Aug 2017

Posts: 70
#12

19 Sep 2017, 18:20

Originally posted by Carlo Lazzaro

Prathvajeeth:
I would still go -unequal-.

Hi Carlo, one more quick question about the means difference test. In my regression model I use total assets as a proxy for firm size. The literature normally uses the natural log of assets for this, and that is what I do when I estimate my models.

When I report the regression variables in the summary statistics, I report the unlogged version of assets.

My question: in the means difference test (comparing all the variables in the summary statistics between derivative users and non-users), do I perform the test on the logged version of the variable, i.e., ln(assets), or on the unlogged version? Thanks.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#13

20 Sep 2017, 00:08

Prathvajeeth:
I would go unlogged.
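For what the two choices actually test, a hedged sketch on auto.dta, with price standing in for total assets and ln_price as a made-up variable name: the test on the raw variable compares arithmetic means, the figures a summary-statistics table normally reports, while the test on the logged variable compares means of logs, that is, geometric means on the original scale.

```stata
* Sketch only: price stands in for total assets; ln_price is hypothetical
sysuse auto, clear
generate double ln_price = ln(price)

* Test on the raw scale: compares arithmetic means
ttest price, by(foreign) unequal

* Test on the log scale: compares means of logs
ttest ln_price, by(foreign) unequal

* Arithmetic mean versus geometric mean of the same variable
quietly summarize price
scalar amean = r(mean)
quietly summarize ln_price
scalar gmean = exp(r(mean))
display "arithmetic mean = " %9.2f amean "   geometric mean = " %9.2f gmean
```

The two tests answer different questions, so it seems sensible to run the test on whichever version appears in the table being reported, which here is the unlogged one.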

Kind regards,
Carlo
(Stata 19.0)