Proportion - individual p-values

Sebastian Rasmussen

Join Date: Dec 2019

Posts: 7
#1

15 Dec 2019, 12:27

Dear all

I have a data set of local municipalities in Denmark and their individual tax rates. What I want to know is the distribution of the decimal digit, that is, 0-9. On the one hand, I would expect a uniform distribution with 10% per digit, because it should be random whether a municipality needs a tax rate of, for example, 22,3 or 22,8. On the other hand, we know from marketing research that the decimals 0, 5 and 9 are overrepresented, and if politicians are strategic, like retail owners, then the decimals 0, 5 and 9 will each account for more than 10% of all decimals.

Therefore I used the proportion command, which gave me 95% confidence intervals but no p-values. Is it possible to get an individual p-value for every decimal, to identify whether its share is significantly different from the expected 10%?

Below you can see my code and output (without standard errors).

Code:

proportion decimal

Code:

decimal   proportion   logit [95% CI]
0         0,091        0,082-0,099
1         0,056        0,049-0,063
2         0,077        0,069-0,085
3         0,084        0,076-0,093
4         0,079        0,071-0,087
5         0,150        0,140-0,161
6         0,081        0,073-0,090
7         0,096        0,088-0,106
8         0,128        0,118-0,139
9         0,154        0,144-0,166

Thanks in advance
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#2

15 Dec 2019, 13:02

Here's one thing you could do:

Code:

tab decimal, gen(dec) // creates dec1, dec2, ..., dec10 (one indicator per decimal 0-9)
proportion dec1-dec10, citype(agresti)

(Given the relatively small values of the proportions, I'd suggest some kind of "fancier" CI type, e.g. exact or agresti.)
Sebastian Rasmussen

Join Date: Dec 2019

Posts: 7
#3

15 Dec 2019, 13:36

Hey Mike

Thanks for your answer. First, thanks for the heads-up about CI types; I will definitely check that out. Unfortunately, I don't think your suggestion solves my issue. I tried the code you wrote and it ran just fine, but I still don't get any p-values. So unless I am missing something, I still don't know the significance level of the individual decimals.
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#4

15 Dec 2019, 21:00

Sorry, I made a sloppy read of your post. You could do this:

Code:

tab decimal, gen(dec) // creates dec1-dec10 for decimals 0-9
forval i = 1/10 {
    bitest dec`i' == 0.10, detail
}
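One caveat worth checking: the generate() option of tabulate numbers its indicator variables from 1, so the variable names may not line up with the digits. A minimal sketch that builds the indicators by hand instead, assuming decimal is a numeric variable holding the values 0-9 (the names is0-is9 are made up for illustration):

```stata
* Assumes decimal is numeric with values 0-9; is0-is9 are illustrative names
forvalues d = 0/9 {
    * 1 if the rate ends in digit d, 0 otherwise, missing if decimal is missing
    generate byte is`d' = (decimal == `d') if !missing(decimal)
    * Exact binomial test of H0: the share of digit d equals 0.10
    bitest is`d' == 0.10
}
```

With this naming, the test reported for is9 really is the test for decimal 9.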
Sebastian Rasmussen

Join Date: Dec 2019

Posts: 7
#5

16 Dec 2019, 12:31

Your code gives p-values - thanks a lot Mike
Sebastian Rasmussen

Join Date: Dec 2019

Posts: 7
#6

16 Dec 2019, 13:24

I am sorry, Mike, but I have one more question for you. When I run proportion, the CI for decimal 9 is 0,144-0,166, but when I run bitest, the p-value for decimal 9 is 0,155. I can't make sense of that, because according to bitest decimal 9 is not significantly different from 0,10, but according to proportion the CI does not include 0,10. Is it just me, or is that contradictory?
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#7

17 Dec 2019, 07:22

Tests and CIs are not quite the same thing. A test works with the sampling distribution presumed to exist if the null hypothesis about the population is true, while a confidence interval uses a sampling distribution based on the observed data. So, if the sample value of a proportion is 0.30 and you test it against a population value of 0.10, the p-value is derived presuming the population proportion truly is 0.10, while the confidence interval uses the value 0.30 to characterize the population. The formula-based standard errors would be sqrt(0.1*0.9/N) in the former case but sqrt(0.3*0.7/N) in the latter case.

A further issue is that -bitest- uses an exact test, based on the binomial distribution, while -proportion- does not use the binomial distribution unless you specify the -citype(exact)- option.
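To put numbers on the two standard errors, here is a small sketch with a made-up N of 100 (the thread does not state N). The two display lines give roughly 0.030 under the null and 0.046 from the observed value. bitesti and prtesti are the immediate forms of the exact and large-sample tests, here fed with 30 successes out of 100 against a hypothesized 0.10.

```stata
* Hypothetical sample size, chosen only for illustration
local N = 100

* Standard error under the null hypothesis (population proportion 0.10)
display sqrt(0.10*0.90/`N')

* Standard error based on the observed proportion (0.30)
display sqrt(0.30*0.70/`N')

* Exact binomial test: N = 100, 30 successes, hypothesized proportion 0.10
bitesti 100 30 0.10

* Large-sample z test: N = 100, observed proportion 0.30, hypothesized 0.10
prtesti 100 0.30 0.10
```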
Sebastian Rasmussen

Join Date: Dec 2019

Posts: 7
#8

18 Dec 2019, 12:10

Thanks a lot again, Mike, for clarifying that. Is there a reason you suggest bitest and not prtest? I think I once heard a general rule stating that bitest is most appropriate when there are fewer than 30 observations and prtest when there are more than 30, but I am not really sure about that.
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#9

18 Dec 2019, 13:59

My personal preference is always to use so-called "exact" procedures, since 21st century algorithms and computing equipment make them easy to do. But no, the over/under 30 is not a good rule for when the exact procedure might not make much difference. The better rules sometimes prescribe something like Pi*N > 15 and (1-Pi)*N > 15, where Pi is the *hypothesized* value for the population proportion.
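A rough sketch of how one might check that rule and run the large-sample counterpart with prtest. The sample size of 5000 is a placeholder, not a figure from this thread, and the code assumes decimal is a numeric variable holding 0-9.

```stata
* Placeholder values: substitute your own sample size
local N  = 5000
local p0 = 0.10

* Both quantities should exceed 15 for the large-sample test to be reasonable
display "Pi*N     = " `p0'*`N'
display "(1-Pi)*N = " (1 - `p0')*`N'

* Large-sample z test for each digit, using a temporary 0/1 indicator
forvalues d = 0/9 {
    tempvar flag
    generate byte `flag' = (decimal == `d') if !missing(decimal)
    display "Decimal `d'"
    prtest `flag' == 0.10
}
```

When both products are large, the prtest and bitest p-values should be close to each other.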
Sebastian Rasmussen

Join Date: Dec 2019

Posts: 7
#10

24 Dec 2019, 07:00

Thanks a lot for the help Mike. You have been very helpful
Sebastian Rasmussen

Join Date: Dec 2019

Posts: 7
#11

28 Dec 2019, 10:38

Hi Mike, my data cover 1988-2002. Do you know if there is a way to use clustered standard errors with a proportion test (bitest/prtest/?)?
