chi2 test

Tom Salvitti

Join Date: Apr 2020

Posts: 132
#1

13 Jan 2022, 00:47

Good morning, everybody.
I have a question about the following table:

Question1   MALE   FEMALE   OTHER   TOT
        1     64      119       7   190
        2     28       92       5   125
        3      9       12       1    22
        4      5       12       1    18
        5      4        5       1    10
        6      8       12       1    21
        7     26       56       1    83

Is the following code correct for detecting whether there is a gender difference?

tab Question1 gender,col chi2

Thanks a million

Tommaso
Maarten Buis

Join Date: Mar 2014

Posts: 3456
#2

13 Jan 2022, 01:41

What exactly is your doubt?
With column percentages (the option col) you can more easily compare genders. You can see that males are more likely to be in category 1, while women are more likely to be in category 2. Otherwise they are all fairly similar. Now it is up to you to decide whether that is a substantively meaningful difference.

Code:

           |              female
       q01 |      male     female      other |     Total
-----------+---------------------------------+----------
         1 |     44.44      38.64      41.18 |     40.51
         2 |     19.44      29.87      29.41 |     26.65
         3 |      6.25       3.90       5.88 |      4.69
         4 |      3.47       3.90       5.88 |      3.84
         5 |      2.78       1.62       5.88 |      2.13
         6 |      5.56       3.90       5.88 |      4.48
         7 |     18.06      18.18       5.88 |     17.70
-----------+---------------------------------+----------
     Total |    100.00     100.00     100.00 |    100.00

The chi-squared test tests the null hypothesis that the pattern you see in the table is due only to the margins. If the null is true, then there is no association between gender and the answer to the question. However, a significant result does not tell you anything about what the difference between the genders is, so the test is of limited value.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Tom Salvitti

Join Date: Apr 2020

Posts: 132
#3

13 Jan 2022, 01:54

Thanks a lot.

My doubt is whether there is a significant difference between the genders as far as question q01 is concerned. Is the command "tab Question1 gender, col chi2" correct?

The output is p = 0.610, so there are no differences. Is that correct?

Nick Cox

Join Date: Mar 2014
Posts: 35698

13 Jan 2022, 02:37

If I understand correctly, your data are equivalent to this data example.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte question1 str6 gender int count
1 "female" 119
1 "male"    64
1 "other"    7
2 "female"  92
2 "male"    28
2 "other"    5
3 "female"  12
3 "male"     9
3 "other"    1
4 "female"  12
4 "male"     5
4 "other"    1
5 "female"   5
5 "male"     4
5 "other"    1
6 "female"  12
6 "male"     8
6 "other"    1
7 "female"  56
7 "male"    26
7 "other"    1
end

Using tabchi from tab_chi on SSC to get a little more detail, I add the Pearson residuals, (observed MINUS expected) / root of expected, so that the chi-square statistic = SUM of (Pearson residual)^2.

I can't see any more structure, as residuals of magnitude 1 are par for the course.
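Written out, the definition above looks as follows. The symbols are notation introduced here for illustration (O for observed, E for expected, n for the margins, N for the grand total), not taken from the tabchi documentation. The worked numbers use the female, answer 1 cell; the row total 308 and grand total 469 come from summing the counts in the data example, and the result agrees with the tabchi output for that cell.

```latex
E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{N}, \qquad
r_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}, \qquad
\chi^2 = \sum_{i,j} r_{ij}^2

% Worked cell: gender = female, question1 = 1
E = \frac{308 \times 190}{469} \approx 124.776, \qquad
r = \frac{119 - 124.776}{\sqrt{124.776}} \approx -0.517
```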

Code:

.  tabchi gender q [fw=count] , pearson

          observed frequency
          expected frequency
          Pearson residual

-------------------------------------------------------------------------
          |                           Question1                          
   gender |       1        2        3        4        5        6        7
----------+--------------------------------------------------------------
   female |     119       92       12       12        5       12       56
          | 124.776   82.090   14.448   11.821    6.567   13.791   54.507
          |  -0.517    1.094   -0.644    0.052   -0.612   -0.482    0.202
          | 
     male |      64       28        9        5        4        8       26
          |  58.337   38.380    6.755    5.527    3.070    6.448   25.484
          |   0.741   -1.675    0.864   -0.224    0.531    0.611    0.102
          | 
    other |       7        5        1        1        1        1        1
          |   6.887    4.531    0.797    0.652    0.362    0.761    3.009
          |   0.043    0.220    0.227    0.430    1.059    0.274   -1.158
-------------------------------------------------------------------------

7 cells with expected frequency < 5
4 cells with expected frequency < 1

          Pearson chi2(12) =  10.1718   Pr = 0.601
 likelihood-ratio chi2(12) =  10.4244   Pr = 0.579

I can't say why you get P = 0.610.
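For comparison, the same Pearson test is available from official Stata's tabulate. This is only a sketch: it assumes the data example above has been read in with its variable names (question1, gender, count), and the second command is an optional extra that was not part of the original question.

```stata
* Assumes the -dataex- example above is in memory:
* one row per cell, with -count- holding the cell frequency.
tabulate question1 gender [fweight=count], column chi2

* Optional: show expected frequencies and add the likelihood-ratio test.
tabulate question1 gender [fweight=count], expected chi2 lrchi2
```

With the counts as listed, the Pearson statistic reported by tabulate should match the tabchi result above.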

What are the questions to which male, female, or other are the answers? Why are the column totals above utterly different?


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17711

13 Jan 2022, 02:54

Tommaso,
if the answers to Question1 are ranked according to some preference/relevance order, then, shamelessly elaborating on Nick's helpful reply, you may want to consider -ologit- as an alternative approach:

Code:

. ologit question1 i.num_gender [fw=count]

Iteration 0:   log likelihood = -710.40469 
Iteration 1:   log likelihood = -710.25221 
Iteration 2:   log likelihood = -710.25219 

Ordered logistic regression                             Number of obs =    469
                                                        LR chi2(2)    =   0.30
                                                        Prob > chi2   = 0.8586
Log likelihood = -710.25219                             Pseudo R2     = 0.0002

------------------------------------------------------------------------------
   question1 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
  num_gender |
       male  |  -.0682336    .185928    -0.37   0.714    -.4326457    .2961785
      other  |  -.1996942    .439131    -0.45   0.649    -1.060375    .6609867
-------------+----------------------------------------------------------------
       /cut1 |  -.4119686    .111094                     -.6297089   -.1942282
       /cut2 |   .6877913   .1145797                      .4632193    .9123634
       /cut3 |   .9092327   .1185879                      .6768046    1.141661
       /cut4 |   1.107749   .1230685                      .8665393    1.348959
       /cut5 |    1.22731   .1262104                      .9799423    1.474678
       /cut6 |    1.50894   .1349567                       1.24443     1.77345
------------------------------------------------------------------------------

.

which confirms that there is no statistically significant gender-related effect.
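The command above uses a numeric variable, num_gender, whose creation is not shown in the thread. One plausible way to build it from Nick's data example is with encode; this is an assumption about how the variable was made, not something the thread confirms.

```stata
* Assumption: num_gender is derived from the string variable gender.
* -encode- assigns codes alphabetically: 1 = female, 2 = male, 3 = other.
encode gender, generate(num_gender)

* Ordered logit with female (the lowest code) as the default base category.
ologit question1 i.num_gender [fweight=count]
```

With this coding, the coefficients for male and other are contrasts against female, which is consistent with the two rows shown in the output above.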

Kind regards,
Carlo
(Stata 19.0)


Tom Salvitti

Join Date: Apr 2020

Posts: 132
#6

13 Jan 2022, 03:20

Thanks to everybody.
Nick Cox, it was my mistake: P = 0.601.