ML estimation of Zipf-Mandelbrot law

Alexander Koplenig

Join Date: Jul 2014

Posts: 39
#1

ML estimation of Zipf-Mandelbrot law

02 Jul 2014, 04:56

Dear Statalisters,

I want to estimate by maximum likelihood the parameters of the Zipf-Mandelbrot "law" by fitting a list of frequencies to a right truncated zeta distribution.

The Zipf-Mandelbrot law (http://en.wikipedia.org/wiki/Zipf%E2...Mandelbrot_law) has two parameters alpha and beta.

If beta=0 then the Zipf-Mandelbrot law becomes the Zipf law. I already implemented this case and it works perfect and quite fast.
However, when I try to implement the ZM law, my program does not achieve convergence.

The log likelihood is defined as:

L = (-1)*alpha * SUM(f(r) * log(rank+beta) - T*log(SUM(rank+beta)^alpha)

where SUM is the sum starting from rank one to the maximum rank N. f(r) is the frequency of the observation with rank r.
T is defined as the sum of all individual frequencies. alpha & beta are the parameters to estimate.

Here is the code I am using:

********************* START *********************
capture program drop zipf_mandelbrot
program zipf_mandelbrot
version 12.1
args lnf alpha beta
tempname H f_log_rank
quietly {

gen double `f_log_rank'=frequency*log(rank+`beta')
/* calculate the sum */
sum `f_log_rank'
scalar m=r(sum)

/* calculate the sum of all frequencies */
sum frequency
scalar T=r(sum)

generate double `H'=1/((rank+`beta')^`alpha')
/* calculate the log sum */
sum `H'
scalar H=log(r(sum))

/* calculate the contribution to log-likelihood */
replace `lnf'=(-1)*`alpha'*m - T*H
}
end

/* example data set */
sysuse auto, clear
gen frequency=price
keep frequency

/* generate a variable that contains the ranks */
gsort -frequency
gen rank=_n

ml model lf zipf_mandelbrot () ()
ml search
ml maximize, iterate(1000)

********************* END *********************

Various tests with different data sets using both my Stata program and an R package (using mle) show that my program sometimes gets (very close to) the right estimates, but does not converge and is generally very slow. [For the example above, R estimates alpha=0.1206387 & beta=83.2256729]
Since this is my fist time that I am working with Stata's ml command (and with ml estimation in general), I am not sure what is going wrong.
So if anyone can help me, please let me know, I would be very grateful.

Many thanks

Ali

Last edited by Alexander Koplenig; 02 Jul 2014, 05:33.
Tags: None
Alexander Koplenig

Join Date: Jul 2014

Posts: 39
#2

16 Jul 2014, 07:09

Dear Statalisters,

Since I was not able to solve the problem myself, I still would be very grateful for any ideas.
I think there could be a very simple error somewhere in the code presented above, maybe the ml model, search and maximize options were not used correctly?

In an R package (called 'tolerance' by DS Young (http://cran.r-project.org/web/packag.../tolerance.pdf)) the corresponding log likelihood is defined as

ll.zima <- function(alpha, beta) sum(frequency * (alpha * log(rank + beta) +
log(sum(1/(rank + beta)^alpha))))

Has anyone an idea how to fit the same log likelihood with Stata?

Thanks in advance

Ali
Comment
Alexander Koplenig

Join Date: Jul 2014

Posts: 39
#3

22 Jul 2014, 03:40

Dear Statalisters,

I think I finally found a solution that I want to share with anyone interested.

Ali

Here is the code:
********************* START *********************
capture program drop zipf_mandelbrot
program zipf_mandelbrot
version 12.1
args lnf alpha beta
tempname H
quietly {
/* calculate the harmonic number */
egen double `H'=total(1/((rank+`beta')^`alpha'))

/* calculate the contribution to log-likelihood */
replace `lnf'=(-1)*freq*`alpha'*log(rank+`beta')-freq*log(`H')
}
end

/* example data sets */
sysuse auto, clear
gen frequency=mpg
keep frequency

/* generate a variable that contains the ranks */
gsort -frequency
gen rank=_n

/* fit the model */
ml model lf zipf_mandelbrot (alpha: ) (beta: )
ml search
ml maximize, iterate(100) difficult

********************* END *********************
Comment

Announcement

ML estimation of Zipf-Mandelbrot law

Comment

Comment