weights application on chi square test

Lina Badawy

Join Date: Dec 2018

Posts: 7
#1

28 Jul 2019, 04:33

Hello,

I have a large regional dataset with a weight variable ready. I am trying to conduct a chi-square test that would be weighted by the weight variable, but I can't seem to get it right.

The command I normally use for chi-square is the following: tab fcg country, exp chi2 cchi2.

When I tried adding [aweight = weight], it did not work. Any suggestions?

Thanks,
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

28 Jul 2019, 09:49

Saying something "did not work" is not helpful. There are many ways in which a command might not work--you need to explain, or better still, show by pasting Stata's output and error messages into your post, exactly what went wrong.

That said, I can tell you that the chi square calculation simply does not allow aweights. What is your weighting variable? What does it mean? If it is a country size variable, then probably you need to use it as an fweight. The chi square calculation does allow fweights, but no other weights.
Lina Badawy

Join Date: Dec 2018

Posts: 7
#3

29 Jul 2019, 00:59

Thank you Clyde. The problem is that the weights have been calculated individually for each country based on different criteria and sampling designs, so I don't want to re-calculate a weight, but rather just apply that weight variable as is.

The output I get is "weights not allowed" or "chi2 not allowed with aweights".

I tried fweight, instead of aweight, but it still gave me an error message "weights not allowed" or "may not use noninteger frequency weights".

So is there another command I can use to run a chi-square test with that weight variable applied as is, without any manipulation by Stata?
Nick Cox

Join Date: Mar 2014

Posts: 35698
#4

29 Jul 2019, 02:19

Let's approach this from the opposite direction. tabulate, chi2 expects as input integer observed frequencies, as otherwise the test concerned, covered in many introductory courses, makes no sense. That is why only frequency weights are supported. It's not a quirk or arbitrary limitation of tabulate; it's standard statistical logic.

If you can calculate or at the very least approximate integer observed frequencies then other commands can be used to get a chi-square test. Here for example is tabchi from package tab_chi (SSC). Here we imagine a known population size and percent breakdown, so that we can approximate integer frequencies. The capture noisily shows the error message but lets the script continue.

Code:

clear
input A B percent
1 1 1.8
1 2 2.2
2 1 47.8
2 2 48.2
end
gen observed = 9876 * percent/100
capture noisily tabchi A B [fw=observed]
gen fudged = round(observed, 1)
tabchi A B [fw=fudged]

          observed frequency
          expected frequency

------------------------------
          |         B
        A |        1         2
----------+-------------------
        1 |      178       217
          |  195.940   199.060
          |
        2 |     4721      4760
          | 4703.060  4777.940
------------------------------

         Pearson chi2(1) =   3.3952   Pr = 0.065
likelihood-ratio chi2(1) =   3.4013   Pr = 0.065

That said, your summary

"the weights have been calculated individually for each country based on different criteria and sampling designs"

does not convey to me that a standard chi-square test makes any sense without a great deal of arm-waving and secondary argument.
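
As a cross-check on the tabchi example above, the same rounded counts can be fed to official Stata commands. This is only a sketch: it re-enters the four rounded frequencies by hand (the variable name fudged is reused from the code above), and tabulate, like tabchi, accepts them only because they are integers.

```stata
* Sketch: the rounded counts from the tabchi example, entered directly
clear
input A B fudged
1 1 178
1 2 217
2 1 4721
2 2 4760
end

* Frequency weights must be integers.
* chi2 = Pearson test, lrchi2 = likelihood-ratio test,
* expected = show expected frequencies alongside observed ones
tabulate A B [fweight = fudged], chi2 lrchi2 expected

* Immediate form: type the cell counts row by row, rows separated by \
tabi 178 217 \ 4721 4760, chi2 lrchi2 expected
```

Both commands work from the same 2 x 2 table of integer counts, so they should agree with the Pearson and likelihood-ratio statistics reported by tabchi.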
Lina Badawy

Join Date: Dec 2018

Posts: 7
#5

29 Jul 2019, 02:43

Thank you Nick. So if I understand you correctly, applying chi-square tests on such a dataset does not make sense because weights have been calculated differently and there is no common sampling design between the different countries, correct?

Does that also apply to ANOVA and regressions?

Thanks,
Nick Cox

Join Date: Mar 2014

Posts: 35698
#6

29 Jul 2019, 03:06

Again, I would turn the question round. Fill in X and Y below:

If weights are calculated differently in different parts of the dataset then it still makes sense to do X because Y.

There may be a defence in which the word approximately figures. I don't think any reader can tell you what the defence is without more information.
titas chowdhury

Join Date: Mar 2022

Posts: 1
#7

29 Mar 2022, 21:37

I want to do a chi-square test on a large sample using sample weights, but Stata does not allow aweights or pweights with it. Is there any alternative?
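
For sampling weights, one route often suggested is Stata's survey machinery rather than tabulate, chi2: declare the weights with svyset and request a design-based test from svy: tabulate. The sketch below is an illustration under stated assumptions, not a recommendation made by the posters above. The data are simulated, the variable names fcg, country, and weight are borrowed from #1, and it assumes weight is a probability weight with no strata or clusters to declare.

```stata
* Simulated data for illustration only (not from any real survey)
clear
set seed 12345
set obs 500
gen country = ceil(3 * runiform())   // hypothetical: 3 countries
gen fcg     = ceil(3 * runiform())   // hypothetical: 3 categories
gen weight  = 0.5 + 2 * runiform()   // hypothetical noninteger sampling weights

* Declare the design: each observation is its own sampling unit (_n),
* weighted by the probability weight. No strata or clusters are assumed.
svyset _n [pweight = weight]

* Weighted two-way table with column proportions; pearson requests the
* Pearson statistic with the Rao-Scott correction, reported as a
* design-based F test
svy: tabulate fcg country, column pearson
```

With real survey data, strata and primary sampling units, if known, would also go into svyset. Whether a pooled test is defensible when each country's weights come from a different design is the question raised in #4 and #6, and this sketch does not settle it.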
