
  • Fitting Data into Power Distribution: Estimating alpha and x0

    Hi,

    I have data that I want to fit to a power-law (Pareto) distribution. When I use the "paretofit" command, it gives me an estimate of alpha but not of x0. To estimate x0, I read the paper "Power-Law Distributions in Empirical Data" by Clauset, Shalizi and Newman, who use the Kolmogorov–Smirnov statistic for this. However, I don't know how to combine this statistic with "paretofit". Could you please help me estimate alpha and x0 together?

    Thanks in advance.

    Ulas

  • #2
    It's not a matter of estimating the 2 parameters together in one step. Fit your model using -paretofit- separately for each of multiple thresholds over a plausible range, and calculate the K-S statistic for each fit.

    For example, see page 277 and following of: Jenkins, S.P. ‘Pareto models, top incomes, and recent trends in UK income inequality’, Economica, 84 (334), April 2017, 261–289. (Pre-publication versions available/downloadable if the journal is behind a pay-wall for you.)

    PS: Please give full references to papers. See the Forum FAQ on why this is needed. Your full reference is, I guess: CLAUSET, A., SHALIZI, C. R. and NEWMAN, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51, 661–703.
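
    For reference, the quantities involved in this threshold search can be written out as below. This is only a sketch in the notation of this thread (x0 for the threshold, alpha for the Pareto shape parameter), assuming the standard Pareto Type I distribution function; the symbols n_{x_0} (number of observations at or above x0) and S_{x_0} (empirical distribution function of those observations) are introduced here for illustration.

    ```latex
    % Pareto Type I distribution function above the threshold x_0
    F(x \mid \alpha, x_0) = 1 - \left(\frac{x_0}{x}\right)^{\alpha}, \qquad x \ge x_0

    % Maximum-likelihood estimate of alpha for a given x_0
    \hat{\alpha}(x_0) = \frac{n_{x_0}}{\sum_{i:\, x_i \ge x_0} \ln\left(x_i / x_0\right)}

    % K-S distance between the empirical and fitted distribution functions
    D(x_0) = \max_{x \ge x_0} \left| S_{x_0}(x) - F\left(x \mid \hat{\alpha}(x_0), x_0\right) \right|

    % Threshold chosen as the minimiser of the K-S distance
    \hat{x}_0 = \arg\min_{x_0} D(x_0)
    ```

    Note that Clauset et al. write the density as proportional to x^(-a), so their exponent a corresponds to alpha + 1 in the notation above.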



    • #3
      Thank you very much for your response, Dr. Jenkins. I am trying to test whether the data fit a power-law distribution with alpha∽1. After reading your post, I understood that I should specify multiple x0 and pick the one that gives me alpha∽1 and then test whether this distribution is statistically significantly different from a power-law distribution with alpha∽1 by using K-S statistics. Did I get it correctly? And should I be worried about distributions with alpha=0.99 or alpha=1.01 etc.?

      I will take more care to give full references next time. Thanks for pointing that out.

      Best Regards,
      Ulas



      • #4
        Sometimes one parameter is given directly as the minimum possible value, which doesn't have to be estimated. There isn't enough information -- because you give none at all -- about whether that is so for your data. A simple example is something counted, where a count has to be at least 1 for an item to appear in the data at all, so 1 is the evident minimum.



        • #5
          Ulas Alk in #3 wrote:

          "I understood that I should specify multiple x0 and pick the one that gives me alpha∽1 and then test whether this distribution is statistically significantly different from a power-law distribution with alpha∽1 by using K-S statistics. Did I get it correctly? And should I be worried about distributions with alpha=0.99 or alpha=1.01 etc.?"

          What you write suggests that you haven't fully digested the reading material. You should fit multiple models, choosing a different x0 each time (x0 should cover a relevant range). And then implement the test as my article did -- remembering that I simply did what Clauset et al. did. I don't understand the question about being "worried".



          • #6
            Thanks for your replies. To be clearer, I am sharing my code. I am trying to find an x0 that gives me an alpha between 0.9 and 1.1. Initially, I was trying to estimate both of them together, but I understood that this is not possible since x0 is an input to this model and defaults to the minimum value in the data (please correct me if I am wrong). So I am running the following code for this purpose, and I think it is working now:

            // Data cover 2000 to 2018
            // var1 ranges from 0 to 10,000
            // Column 1 of C: alpha; column 2: the x0 that gives an appropriate alpha
            matrix C = J(19,2,0)

            levelsof year, local(levels)
            foreach i of local levels {
                forvalues z = 0(10)100 {
                    capture paretofit var1 if year == `i', x0(`z')
                    // skip thresholds where the fit fails, so that a stale e(ba) is not stored
                    if _rc continue
                    matrix C[`i'+1-2000,1] = e(ba)
                    matrix C[`i'+1-2000,2] = `z'
                    // stop at the first x0 whose alpha lies in [0.9, 1.1]
                    if (0.9<=e(ba) & e(ba)<=1.1) {
                        continue, break
                    }
                }
            }

            Finally, I will do a K-S test for each year using these values. It is possible that some other x0 and alpha would fit better than the first values that enter this matrix (I exit the loop as soon as I get an alpha in that range). I will try that as a next step if the test results reject that the distributions are the same.
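
            As a rough sketch of that next step, the loop below picks, for a single year, the x0 with the smallest K-S distance between the empirical and fitted distribution functions, instead of stopping at the first alpha in range. It assumes that -paretofit- is installed and that var1 and year are as in the code above. The year 2000, the grid 10(10)100 (starting above zero, since a Pareto threshold must be positive), the added condition var1 >= x0, and the names ecdf, fcdf, d, a, best_ks, best_x0 and best_a are illustrative assumptions, not part of the original code. The distance is evaluated only at the observed data points.

            ```stata
            // Sketch: choose x0 for one year by minimising the K-S distance
            tempvar ecdf fcdf d
            local best_ks = .
            forvalues z = 10(10)100 {
                // fit the Pareto model to observations at or above the candidate threshold
                capture paretofit var1 if year == 2000 & var1 >= `z', x0(`z')
                if _rc continue
                local a = e(ba)
                capture drop `ecdf' `fcdf' `d'
                // empirical CDF of the tail sample (ties share one value)
                cumul var1 if year == 2000 & var1 >= `z', generate(`ecdf') equal
                // fitted Pareto CDF: 1 - (x0/x)^alpha
                generate double `fcdf' = 1 - (`z'/var1)^`a' if !missing(`ecdf')
                generate double `d' = abs(`ecdf' - `fcdf')
                summarize `d', meanonly
                // keep the threshold with the smallest maximum distance so far
                if r(max) < `best_ks' {
                    local best_ks = r(max)
                    local best_x0 = `z'
                    local best_a  = `a'
                }
            }
            display "x0 = `best_x0', alpha = `best_a', K-S distance = `best_ks'"
            ```

            To cover every year, the same block could be wrapped in the foreach loop over `levels'. The minimised distance is not itself a test; Clauset et al. obtain p-values for it by bootstrap.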
            Last edited by ulas alk; 06 Nov 2019, 16:20.

