
  • Interpreting the Welch test

    Hi everyone, I need a little help here with the interpretation of the Welch test. I used this command:
    Code:
    ttest healthy, by(treatment) welch
    I got the following results:

    Code:
     ttest healthy, by(treatment) welch
    
    Two-sample t test with unequal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
    ---------+--------------------------------------------------------------------
           0 |      84    .5119048    .0548666    .5028604    .4027774    .6210322
           1 |   2,607    .7092443    .0088956    .4541981    .6918012    .7266875
    ---------+--------------------------------------------------------------------
    Combined |   2,691    .7030844    .0088094     .456984    .6858106    .7203581
    ---------+--------------------------------------------------------------------
        diff |           -.1973396     .055583               -.3078075   -.0868717
    ------------------------------------------------------------------------------
        diff = mean(0) - mean(1)                                      t =  -3.5504
    H0: diff = 0                             Welch's degrees of freedom =  87.5254
    
        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.0003         Pr(|T| > |t|) = 0.0006          Pr(T > t) = 0.9997


    Now, as we are supposed to look at Ha: diff != 0 and its p-value, we can see that it is 0.0006. If we convert it to a percentage, that becomes 6% or 0.06, implying that it is greater than 5% or 0.05. Can we now conclude that there is no significant difference between the means of the control and treatment groups?
    Please help me interpret the results.
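
As a cross-check of the output above, the Welch statistic can be reproduced directly from the reported summary statistics. This is a Python sketch using scipy (not part of the original thread's Stata workflow); the numbers are copied from the Stata table.

```python
# Reproduce the Welch t test from the summary statistics in the
# Stata output (a cross-check; scipy uses the Satterthwaite df,
# which is very close to Stata's Welch df here).
from scipy import stats

res = stats.ttest_ind_from_stats(
    mean1=0.5119048, std1=0.5028604, nobs1=84,    # group 0 (control)
    mean2=0.7092443, std2=0.4541981, nobs2=2607,  # group 1 (treatment)
    equal_var=False,  # unequal variances, matching -ttest ..., welch-
)
print(round(res.statistic, 4))  # ≈ -3.5504, as in the Stata output
print(round(res.pvalue, 4))     # ≈ 0.0006, i.e. 0.06% -- well below 5%
```

Note that 0.0006 is 0.06%, not 6%, which is the point the replies below turn on.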

  • #2
    David Radwin, please help.

    Comment


    • #3
      Moomal:
      I fear you're on the wrong track.
      Your -ttest- results tell you that the difference in -healthy- (I assume that this is a continuous variable; otherwise the use of -ttest- would be hard to justify) between the two groups (which you assume to have equal variance, which is questionable on an a priori basis; see the -unequal- option in -ttest-) does reach statistical significance (0.0006 < 0.05).
      You can also see this from the limits of the 95% CI of the difference of the means, as both carry a minus sign.
      That said, my concern rests on the sample size(s): does it make any sense to compare N = 84 vs N = 2,607 and consider the inferential results informative?
      Kind regards,
      Carlo
      (Stata 19.0)

      Comment


      • #4
        Carlo Lazzaro thank you for the response. "healthy" is a binary variable, where 0 is an unhealthy child with low birth weight and 1 is a healthy child with normal birth weight. Are you telling me that I can't use this test if -healthy- is a binary variable and not a continuous one?

        Comment


        • #5
          Moomal:
          this is exactly what I meant.
          You should consider something like:
          Code:
          logit healthy i.treatment
          or:
          Code:
          . prtesti 84 0.51 2607 .71
          
          Two-sample test of proportions                     x: Number of obs =       84
                                                             y: Number of obs =     2607
          ------------------------------------------------------------------------------
                       |       Mean   Std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                     x |        .51   .0545436                      .4030966    .6169034
                     y |        .71   .0088871                      .6925817    .7274183
          -------------+----------------------------------------------------------------
                  diff |        -.2   .0552628                     -.3083131   -.0916869
                       |  under H0:   .0506153    -3.95   0.000
          ------------------------------------------------------------------------------
                  diff = prop(x) - prop(y)                                  z =  -3.9514
              H0: diff = 0
          
              Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
           Pr(Z < z) = 0.0000         Pr(|Z| > |z|) = 0.0001          Pr(Z > z) = 1.0000
          
          .
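
For readers outside Stata, the test that -prtesti- runs here is the standard large-sample two-proportion z test with a pooled standard error under H0. A Python sketch reproducing it by hand:

```python
# Reproduce -prtesti 84 0.51 2607 .71- by hand: the pooled
# two-sample proportion z test (a cross-check, not the thread's
# original workflow).
import math
from scipy.stats import norm

n1, p1 = 84, 0.51
n2, p2 = 2607, 0.71
p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion under H0
se0 = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se0
p_two_sided = 2 * norm.sf(abs(z))
print(round(z, 4))            # ≈ -3.9514, as in the Stata output
print(round(p_two_sided, 4))  # ≈ 0.0001
```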
          Kind regards,
          Carlo
          (Stata 19.0)

          Comment


          • #6
            Okay, what is this test for and how do we interpret it?
            Basically, what I wanted to do was this: I have 84 observations in my control group and 2,607 observations in my treatment group. My concern is that having so few observations in the control group compared to the treatment group might be problematic. So I did a little research to see how I could justify the small number of observations and which test could help me in this regard. Following this article (https://www.statology.org/welchs-t-test-stata/), I came across the Welch test and applied it.
            Now I am a little confused about what I should do here. Could you please guide me a little more?
            P.S. I am doing propensity score matching, and with this set of observations I am getting significant results.

            Comment


            • #7
              Moomal:
              the link you kindly shared points exactly to a comparison between the means of a continuous variable (-ttest-; the -welch- option, as I did not notice before, wisely assumes that the two populations from which the samples were drawn have different variances).
              In addition, the -welch- option does not address the sizeable difference in sample sizes between your two groups.
              That said, I still don't see how -ttest- can be applied to the difference between two proportions, unless you trust the normal approximation to the binomial distribution that much.
              This trust can well let you down, as in the following toy example, where the upper bound of the 95% CI straddles the upper bound of a probability (that is, 1):
              Code:
              . prtesti 84 0.99 2607 .98
              
              Two-sample test of proportions                     x: Number of obs =       84
                                                                 y: Number of obs =     2607
              ------------------------------------------------------------------------------
                           |       Mean   Std. err.      z    P>|z|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                         x |        .99   .0108562                      .9687222    1.011278
                         y |        .98   .0027419                      .9746259    .9853741
              -------------+----------------------------------------------------------------
                      diff |        .01   .0111971                     -.0119459    .0319459
                           |  under H0:   .0154003     0.65   0.516
              ------------------------------------------------------------------------------
                      diff = prop(x) - prop(y)                                  z =   0.6493
                  H0: diff = 0
              
                  Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
               Pr(Z < z) = 0.7419         Pr(|Z| > |z|) = 0.5161          Pr(Z > z) = 0.2581
              
              .
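
The failure mode in the log above, reproduced outside Stata: the Wald (normal-approximation) 95% CI for a proportion near 1 can exceed the logical bound of 1, while an exact interval cannot. A Python sketch (83/84 ≈ 0.99 stands in for the -prtesti- toy example):

```python
# Show the normal approximation breaking down for a proportion
# near 1, and an exact (Clopper-Pearson) interval that does not.
import math
from scipy.stats import binomtest

n, p = 84, 0.99
se = math.sqrt(p * (1 - p) / n)
upper = p + 1.96 * se
print(round(upper, 4))  # ≈ 1.0113 -- an impossible proportion

# The exact interval stays inside [0, 1]:
ci = binomtest(k=83, n=84).proportion_ci(confidence_level=0.95)
print(ci.high <= 1)     # True
```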
              With a bit of guesswork, I could hypothesize that you are doing PSM to select, among the 2,607 observations, those "similar" (controls?) to the 84 included in the other group.
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Carlo Lazzaro here is the full scenario.
                I am running a quasi-experimental research technique, PSM. My outcome variable is child birth weight and my treatment variable is prenatal care. My covariates are mother's age, mother's education, child birth order, etc. I have categorized the outcome variable into healthy and unhealthy: healthy = 1 and unhealthy = 0. My treatment variable is also binary: it is 0 for mothers who did not receive prenatal care and 1 for those who did. Now my point of concern is that I have about 2,000 observations in the treatment group and only about 80 in the control group. I think the lopsided sample might be an issue. However, I am getting significant ATT, ATE, and ATC estimates for this sample. Still, I want to perform a sensitivity test; I was advised to bootstrap the treatment variable and see if I get similar results. Do you think this is a good idea?

                Comment


                • #9
                  Moomal:
                  do you really mean bootstrapping a two-level categorical predictor, or do you mean a random redistribution of 1/0 within the sample?
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  Comment


                  • #10
                    I am not sure which would be more accurate in my case; what do you suggest?

                    Comment


                    • #11
                      I have another query. Considering the scenario I mentioned above, do I need to resample my treatment group or my control group? I have about 2,000 obs in my treatment group and 84 in my control group.

                      Comment


                      • #12
                        Moomal:
                        1) & 2) I'd use runiform() to shuffle your main predictor and perform a sensitivity analysis to check the robustness of your baseline findings.
                        In the following toy example, -shuffle- replaces -foreign- in the sensitivity analysis:
                        Code:
                        . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
                        (1978 automobile data)
                        
                        . regress price i.foreign
                        
                              Source |       SS           df       MS      Number of obs   =        74
                        -------------+----------------------------------   F(1, 72)        =      0.17
                               Model |  1507382.66         1  1507382.66   Prob > F        =    0.6802
                            Residual |   633558013        72  8799416.85   R-squared       =    0.0024
                        -------------+----------------------------------   Adj R-squared   =   -0.0115
                               Total |   635065396        73  8699525.97   Root MSE        =    2966.4
                        
                        ------------------------------------------------------------------------------
                               price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                             foreign |
                            Foreign  |   312.2587   754.4488     0.41   0.680    -1191.708    1816.225
                               _cons |   6072.423    411.363    14.76   0.000     5252.386     6892.46
                        ------------------------------------------------------------------------------
                        
                        . g shuffle=runiform()
                        
                        . replace shuffle=0 if shuffle<0.5
                        
                        . replace shuffle=1 if shuffle>=0.5
                        
                        . regress price i.shuffle
                        
                              Source |       SS           df       MS      Number of obs   =        74
                        -------------+----------------------------------   F(1, 72)        =      0.12
                               Model |  1056431.73         1  1056431.73   Prob > F        =    0.7301
                            Residual |   634008964        72  8805680.06   R-squared       =    0.0017
                        -------------+----------------------------------   Adj R-squared   =   -0.0122
                               Total |   635065396        73  8699525.97   Root MSE        =    2967.4
                        
                        ------------------------------------------------------------------------------
                               price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                           1.shuffle |  -239.0526   690.1671    -0.35   0.730    -1614.876     1136.77
                               _cons |   6281.553   481.3818    13.05   0.000     5321.936     7241.17
                        ------------------------------------------------------------------------------
                        
                        .
                        The above also holds if you plan to perform a logistic regression.
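
The shuffle check above can be sketched outside Stata as well. This is a Python illustration on simulated toy data (the variable names and the simulated dataset are assumptions for the example, not the thread's data): a real binary predictor shows its effect, while a randomly shuffled one should not.

```python
# Shuffle-based sensitivity check on simulated toy data: relabel
# the binary predictor at random and re-test; the shuffled labels
# should carry no signal.
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)
n = 500
treatment = rng.integers(0, 2, size=n)               # real binary predictor
outcome = 10 + 2 * treatment + rng.normal(0, 3, n)   # true effect = 2

# Real assignment: the group difference is genuine and should show up.
real = stats.ttest_ind(outcome[treatment == 1], outcome[treatment == 0])

# Shuffled assignment, mirroring the runiform() step in the Stata log:
shuffle = (rng.random(n) >= 0.5).astype(int)
fake = stats.ttest_ind(outcome[shuffle == 1], outcome[shuffle == 0])

print(real.pvalue < 0.05)  # True: the real effect is detected
print(fake.pvalue)         # typically large: shuffled labels carry no signal
```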

                        Kind regards,
                        Carlo
                        (Stata 19.0)

                        Comment
