
  • Proportion - individual p-values

    Dear all

    I have a data set of local municipalities in Denmark and their individual tax rates. What I want to know is the distribution of the decimal digit, that is, 0-9. On the one hand, I would expect a uniform distribution with 10% per digit, because it should be random whether a municipality needs a tax rate of, for example, 22,3 or 22,8. On the other hand, we know from marketing research that the decimals 0, 5 and 9 are overrepresented, and if politicians are strategic, like retail owners, then the decimals 0, 5 and 9 will each make up more than 10% of all decimals.

    Therefore I used the proportion command, which gave me 95% confidence intervals but no p-values. Is it possible to get an individual p-value for every decimal, to identify whether its fraction is significantly different from the expected 10%?

    Below you can see my code and output (without standard errors).

    Code:
    proportion decimal
    Code:
             decimal        proportion    logit [95% CI]
                   0             0,091       0,082-0,099
                   1             0,056       0,049-0,063
                   2             0,077       0,069-0,085
                   3             0,084       0,076-0,093
                   4             0,079       0,071-0,087
                   5             0,150       0,140-0,161
                   6             0,081       0,073-0,090
                   7             0,096       0,088-0,106
                   8             0,128       0,118-0,139
                   9             0,154       0,144-0,166
    Thanks in advance

  • #2
    Here's one thing you could do:
    Code:
    tab decimal, gen(dec) // creates indicators dec1, dec2, ..., dec10 (dec1 marks decimal 0)
    proportion dec1-dec10, citype(agresti)
    (Given the relatively small values of the proportions, I'd suggest some kind of "fancier" CI type, e.g. exact or agresti.)



    • #3
      Hey Mike

      Thanks for your answer. First, thanks for the heads-up about CI types; I will definitely check that out. Unfortunately, I don't think your suggestion solves my issue. I tried the code you wrote and it worked just fine, but I still don't get any p-values. So unless I am missing something, I still don't know the significance level of the individual decimals.



      • #4
        Sorry, I made a sloppy read of your post. You could do this:
        Code:
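        tab decimal, gen(dec) // creates indicators dec1, ..., dec10 (dec1 marks decimal 0)
        forval i = 1/10 {
           // exact binomial test of H0: proportion == 0.10 for this indicator
           bitest dec`i' == 0.10, detail
        }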
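
For readers who want to check the arithmetic outside Stata, here is a minimal Python sketch of a per-digit exact binomial test against 0.10. It is an illustration under stated assumptions, not a reproduction of what -bitest- does internally: the two-sided p-value uses one common definition (the total null probability of all outcomes no more likely than the observed count), and the counts are hypothetical, not the data from this thread. The names `binom_logpmf` and `exact_binom_pvalue` are made up for this example.

```python
import math

def binom_logpmf(k, n, p):
    """Log of P(K = k) for K ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def exact_binom_pvalue(k, n, p0):
    """Two-sided exact p-value: sum the null probabilities of every outcome
    that is no more likely under H0 than the observed count k."""
    log_obs = binom_logpmf(k, n, p0)
    tol = 1e-7  # absorbs floating-point noise when two outcomes tie
    total = 0.0
    for j in range(n + 1):
        lp = binom_logpmf(j, n, p0)
        if lp <= log_obs + tol:
            total += math.exp(lp)
    return min(1.0, total)

# Hypothetical counts per decimal digit (assumption, NOT the thread's data).
counts = {0: 90, 1: 60, 2: 80, 3: 85, 4: 80,
          5: 150, 6: 80, 7: 95, 8: 130, 9: 150}
n = sum(counts.values())

for digit, k in counts.items():
    p = exact_binom_pvalue(k, n, 0.10)
    print(f"decimal {digit}: share = {k / n:.3f}, exact p = {p:.4g}")
```

Working in log space avoids the underflow that a direct `p**k` would hit for large N.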



        • #5
          Your code gives p-values - thanks a lot Mike



          • #6
            I am sorry Mike, but I have one more question for you. When I run the proportion command, the CI for decimal 9 is 0,144-0,166, but when I run bitest, the p-value for decimal 9 is 0,155. I can't make sense of that: according to bitest, decimal 9 is not significantly different from 0,10, but according to proportion the CI does not include 0,10. Is it just me, or is that contradictory?



            • #7
              Tests and CIs are not quite the same thing. Tests work with the sampling distribution presumed to exist if the null hypothesis about the population is true, while confidence intervals presume a sampling distribution for a population characterized by the observed data. So, if the sample value of a proportion is 0.30 and you test it against a population value of 0.10, the p-value will be derived presuming the population proportion truly is 0.10, while the confidence interval will use the value of 0.30 to characterize the population. The formula-based standard errors would be sqrt(0.1*0.9/N) in the former case but sqrt(0.3*0.7/N) in the latter case.

              A further issue is that -bitest- uses an exact test, based on the binomial distribution, while -proportion- does not use exact methods unless you request them, e.g. with -citype(exact)-.
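
The two standard errors described above can be put side by side in a short Python sketch. This is only an illustration: the sample size N = 200 is a made-up assumption, the test shown is the normal-approximation z-test (not the exact test that -bitest- uses), the interval is a plain Wald interval (not the logit interval that -proportion- reports by default), and the function names are hypothetical.

```python
import math

def se_under_null(p0, n):
    """Standard error used by the test: assumes the population proportion is p0."""
    return math.sqrt(p0 * (1 - p0) / n)

def se_from_sample(p_hat, n):
    """Standard error used by the CI: plugs in the observed proportion p_hat."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

n = 200        # hypothetical sample size (assumption)
p_hat = 0.30   # observed proportion, as in the example above
p0 = 0.10      # hypothesized proportion

se_test = se_under_null(p0, n)    # sqrt(0.1*0.9/N)
se_ci = se_from_sample(p_hat, n)  # sqrt(0.3*0.7/N)

# z-test of H0: p = p0, with a two-sided p-value from the normal distribution
z = (p_hat - p0) / se_test
p_value = math.erfc(abs(z) / math.sqrt(2))

# Wald 95% confidence interval around the observed proportion
ci = (p_hat - 1.96 * se_ci, p_hat + 1.96 * se_ci)

print(f"SE under H0: {se_test:.4f}   SE from sample: {se_ci:.4f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3g}")
print(f"95% Wald CI: {ci[0]:.3f} to {ci[1]:.3f}")
```

Because the two procedures use different standard errors, a test and a CI can disagree in borderline cases, although with a gap as large as 0.30 versus 0.10 both point the same way.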



              • #8
                Thanks a lot again, Mike, for clarifying that. Is there a reason you suggest bitest and not prtest? I think I once heard a general rule stating that bitest is most appropriate when there are fewer than 30 observations and prtest when there are more than 30, but I am not really sure about that.



                • #9
                  My personal preference is always to use so-called "exact" procedures, since 21st century algorithms and computing equipment make them easy to do. But no, the over/under 30 rule is not a good guide to when the exact procedure might not make much difference. The better rules sometimes prescribe something like Pi*N > 15 and (1-Pi)*N > 15, where Pi is the *hypothesized* value for the population proportion.
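
As a rough illustration of that rule of thumb, the check can be written as a small Python helper. The function name `normal_approx_ok` and the sample sizes tried below are assumptions made for this sketch; the threshold of 15 is the one quoted above, and other sources use other cut-offs.

```python
def normal_approx_ok(p0, n, threshold=15):
    """Rule-of-thumb check: both the expected number of successes and the
    expected number of failures under H0 should exceed the threshold."""
    return p0 * n > threshold and (1 - p0) * n > threshold

p0 = 0.10  # hypothesized proportion for each decimal digit

# Hypothetical sample sizes (assumption), including one just above 30
for n in (31, 100, 200, 1000):
    verdict = "approximation plausible" if normal_approx_ok(p0, n) else "prefer exact test"
    print(f"N = {n}: expected successes = {p0 * n:.1f} -> {verdict}")
```

With a hypothesized proportion of 0.10, a sample of 31 observations clears the "over 30" rule but has only about 3 expected successes, which is why that rule is a poor guide here.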



                  • #10
                    Thanks a lot for the help Mike. You have been very helpful



                    • #11
                      Hi Mike, my data cover 1988-2002. Do you know if there is a way to use clustered standard errors with a proportion test (bitest/prtest/?)?
