Yule-Simon distribution

Yule-Simon
Probability mass function: plot of the Yule-Simon PMF on a log-log scale. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
Cumulative distribution function: plot of the Yule-Simon CDF. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
Parameters <math>\rho>0\,</math> shape (real)
Support <math>k \in \{1,2,\dots\}\,</math>
Probability mass function (pmf) <math>\rho\,\mathrm{B}(k, \rho+1)\,</math>
Cumulative distribution function (cdf) <math>1 - k\,\mathrm{B}(k, \rho+1)\,</math>
Mean <math>\frac{\rho}{\rho-1}\,</math> for <math>\rho>1\,</math>
Mode <math>1\,</math>
Variance <math>\frac{\rho^2}{(\rho-1)^2\;(\rho-2)}\,</math> for <math>\rho>2\,</math>
Skewness <math>\frac{(\rho+1)^2\;\sqrt{\rho-2}}{(\rho-3)\;\rho}\,</math> for <math>\rho>3\,</math>
Excess kurtosis <math>\rho+3+\frac{11\rho^3-49\rho-22}{(\rho-4)\;(\rho-3)\;\rho}\,</math> for <math>\rho>4\,</math>
Moment-generating function (mgf) <math>\frac{\rho}{\rho+1}\;{}_2F_1(1,1; \rho+2; e^t)\,e^t \,</math>
Characteristic function <math>\frac{\rho}{\rho+1}\;{}_2F_1(1,1; \rho+2; e^{i\,t})\,e^{i\,t} \,</math>

In probability and statistics, the Yule-Simon distribution is a discrete probability distribution named after Udny Yule and Herbert Simon. Simon originally called it the Yule distribution.

The probability mass function of the Yule-Simon(ρ) distribution is

<math>f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1), \,</math>

for integer <math>k \geq 1</math> and real <math>\rho > 0</math>, where <math>\mathrm{B}</math> is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as

<math>f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}},\,</math>

where <math>\Gamma</math> is the gamma function. Thus, if <math>\rho</math> is an integer,

<math>f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.\,</math>
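
The two forms of the pmf can be compared numerically. The following is a minimal sketch, not a reference implementation: the function names `yule_simon_pmf` and `yule_simon_pmf_integer_rho` are chosen here for illustration, and the beta function is evaluated through `math.lgamma` on the assumption that working in log space is preferable for large <math>k</math>.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Evaluate f(k; rho) = rho * B(k, rho + 1) for integer k >= 1, real rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("requires integer k >= 1 and rho > 0")
    # log B(k, rho + 1) = lgamma(k) + lgamma(rho + 1) - lgamma(k + rho + 1)
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Factorial form rho * rho! * (k - 1)! / (k + rho)!, valid for integer rho."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)


# Compare the two forms for rho = 2 and a few values of k.
for k in (1, 2, 3, 10):
    print(k, yule_simon_pmf(k, 2.0), yule_simon_pmf_integer_rho(k, 2))
```

For integer <math>\rho</math> the two columns should agree up to floating-point rounding.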

The probability mass function f has the property that for sufficiently large k we have

<math>f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.\,</math>

This means that the tail of the Yule-Simon distribution is a realization of Zipf's law: <math>f(k;\rho)</math> can be used to model, for example, the relative frequency of the <math>k</math>th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of <math>k</math>.
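
The quality of the power-law approximation can be checked by computing the ratio of the exact pmf to <math>\rho\,\Gamma(\rho+1)/k^{\rho+1}</math> for increasing <math>k</math>. The sketch below is illustrative only; the helper names `yule_simon_pmf`, `power_law_tail` and `tail_ratio` are assumptions introduced here.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Exact pmf rho * B(k, rho + 1), evaluated in log space."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def power_law_tail(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return rho * math.gamma(rho + 1) / k ** (rho + 1)


def tail_ratio(k: int, rho: float) -> float:
    """Ratio of the exact pmf to its power-law approximation."""
    return yule_simon_pmf(k, rho) / power_law_tail(k, rho)


# The ratio should approach 1 as k grows.
for k in (10, 100, 1000, 10000):
    print(k, tail_ratio(k, 1.5))
```

For <math>\rho = 1</math> the ratio simplifies to <math>k/(k+1)</math>, so the approximation is already within about 10% at <math>k = 10</math>.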

Occurrence

The Yule-Simon distribution arises as a continuous mixture of geometric distributions. Specifically, assume that <math>W</math> follows an exponential distribution with scale <math>1/\rho</math> (equivalently, rate <math>\rho</math>):

<math>W \sim \mathrm{Exponential}(\rho)\,</math>
<math>h(w;\rho) = \rho \, \exp(-\rho\,w)\,</math>

Then, conditional on <math>W</math>, a Yule-Simon distributed variable <math>K</math> has the following geometric distribution:

<math>K \sim \mathrm{Geometric}(\exp(-W))\,</math>
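
This two-stage construction suggests a simple way to draw samples. The sketch below is one possible approach under stated assumptions: the geometric distribution is taken to have support <math>\{1,2,\dots\}</math> with success probability <math>p = \exp(-W)</math>, the geometric draw uses inversion of its survival function, and the function name `sample_yule_simon` is hypothetical.

```python
import math
import random


def sample_yule_simon(rho: float, rng: random.Random) -> int:
    """Draw K by first drawing W ~ Exponential(rate=rho), then K ~ Geometric(exp(-W))."""
    w = rng.expovariate(rho)          # exponential with rate rho (mean 1 / rho)
    p = math.exp(-w)                  # success probability of the geometric stage
    if p >= 1.0:
        return 1                      # W == 0: success on the first trial
    log_q = math.log1p(-p)            # log(1 - p), strictly negative unless p underflows
    if log_q == 0.0:
        # Assumption: treat an underflowing p as an error rather than guess a value.
        raise OverflowError("success probability underflowed to zero")
    u = 1.0 - rng.random()            # uniform on (0, 1]
    # Inversion: P(K > k) = (1 - p)**k, so K = floor(log(u) / log(1 - p)) + 1.
    return int(math.log(u) / log_q) + 1


rng = random.Random(0)
print([sample_yule_simon(3.0, rng) for _ in range(10)])
```

With many draws, the fraction of samples equal to 1 should be close to <math>\rho\,\mathrm{B}(1, \rho+1) = \rho/(\rho+1)</math>.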

The pmf of a geometric distribution is

<math>g(k; p) = p \, (1-p)^{k-1}\,</math>

for <math>k\in\{1,2,\dots\}</math>. The Yule-Simon pmf is then the following exponential-geometric mixture distribution:

<math>f(k;\rho) = \int_0^{\infty} g(k;\exp(-w))\,h(w;\rho)\,dw.\,</math>
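
The integral can be evaluated in closed form, which confirms that the mixture reproduces the pmf <math>\rho\,\mathrm{B}(k, \rho+1)</math>. A short derivation, using the substitution <math>u = \exp(-w)</math> (the symbol <math>u</math> is introduced here for illustration), so that <math>du = -\exp(-w)\,dw</math> and the limits <math>w = 0, \infty</math> map to <math>u = 1, 0</math>:

<math>
\begin{align}
f(k;\rho) &= \int_0^{\infty} e^{-w}\,(1-e^{-w})^{k-1}\,\rho\,e^{-\rho w}\,dw \\
&= \rho \int_0^1 u^{\rho}\,(1-u)^{k-1}\,du \\
&= \rho\,\mathrm{B}(\rho+1, k) = \rho\,\mathrm{B}(k, \rho+1).
\end{align}
</math>

The last step uses the symmetry of the beta function in its two arguments.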

Generalizations

The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule-Simon(ρ, α) distribution is defined as

<math>f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}} \; \mathrm{B}_{1-\alpha}(k, \rho+1),\,</math>

with <math>0 \leq \alpha < 1</math>. For <math>\alpha = 0</math> the ordinary Yule-Simon(ρ) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
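
The generalized pmf can be explored numerically. The sketch below assumes that <math>\mathrm{B}_x(a,b)</math> denotes the unregularized incomplete beta function <math>\int_0^x t^{a-1}(1-t)^{b-1}\,dt</math> and approximates it with a composite Simpson rule using only the standard library; the function names and the step count `n` are illustrative choices, not part of any established API.

```python
import math


def incomplete_beta(x: float, a: float, b: float, n: int = 2000) -> float:
    """Approximate B_x(a, b) = integral_0^x t^(a-1) (1-t)^(b-1) dt (n must be even)."""
    if x == 0.0:
        return 0.0
    h = x / n

    def integrand(t: float) -> float:
        return t ** (a - 1) * (1.0 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        # Composite Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k: int, rho: float, alpha: float) -> float:
    """f(k; rho, alpha) = rho / (1 - alpha**rho) * B_{1-alpha}(k, rho + 1)."""
    if k < 1 or rho <= 0 or not (0.0 <= alpha < 1.0):
        raise ValueError("requires k >= 1, rho > 0 and 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1)


# alpha = 0 should recover the ordinary Yule-Simon pmf rho * B(k, rho + 1).
ordinary = 2.0 * math.exp(math.lgamma(3) + math.lgamma(3.0) - math.lgamma(6.0))
print(generalized_yule_simon_pmf(3, 2.0, 0.0), ordinary)
```

Summing `generalized_yule_simon_pmf(k, rho, alpha)` over <math>k</math> for a fixed <math>\alpha > 0</math> gives a quick check that the factor <math>\rho/(1-\alpha^{\rho})</math> normalizes the distribution.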

Figure: Plot of the Yule-Simon(1) distribution (red) and its asymptotic Zipf law (blue).

References

  • Herbert A. Simon, On a Class of Skew Distribution Functions, Biometrika 42(3/4): 425–440, December 1955.
  • Colin Rose and Murray D. Smith, Mathematical Statistics with Mathematica. New York: Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)
