Poisson distribution

From Wikipedia, the free encyclopedia
Poisson
Probability mass function
Plot of the Poisson PMF
The horizontal axis is the index k. The function is only defined at integer values of k. The connecting lines are only guides for the eye.
Cumulative distribution function
Plot of the Poisson CDF
The horizontal axis is the index k. The CDF is discontinuous at the integers of k and flat everywhere else because a variable that is Poisson distributed only takes on integer values.
notation: \mathrm{Pois}(\lambda)\,
parameters: λ > 0 (real)
support: k ∈ { 0, 1, 2, 3, ... }
pmf: \frac{\lambda^k}{k!}\cdot e^{-\lambda}
cdf: \frac{\Gamma(\lfloor k+1\rfloor, \lambda)}{\lfloor k\rfloor !} for k\ge 0, or equivalently e^{-\lambda} \sum_{i=0}^{\lfloor k\rfloor} \frac{\lambda^i}{i!}

(where \Gamma(x, y) is the upper incomplete gamma function and \lfloor k\rfloor is the floor function)

mean: \lambda\,\!
median: \approx\lfloor\lambda+1/3-0.02/\lambda\rfloor
mode: \lceil\lambda\rceil - 1
variance: \lambda\,\!
skewness: \lambda^{-1/2}\,
ex.kurtosis: \lambda^{-1}\,
entropy: \lambda[1\!-\!\log(\lambda)]\!+\!e^{-\lambda}\sum_{k=0}^\infty \frac{\lambda^k\log(k!)}{k!}

(for large λ) \frac{1}{2}\log(2 \pi e \lambda) - \frac{1}{12 \lambda} - \frac{1}{24 \lambda^2} - \frac{19}{360 \lambda^3} + O\left(\frac{1}{\lambda^4}\right)

mgf: \exp(\lambda (e^{t}-1))\,
cf: \exp(\lambda (e^{it}-1))\,

In probability theory and statistics, the Poisson distribution (pronounced [pwasɔ̃]), also called the Poisson law of small numbers[1], is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time, provided these events occur with a known average rate and independently of the time since the last event. The Poisson distribution can also be used for the number of events in other specified intervals, such as distance, area or volume.

The distribution was first introduced by Siméon Denis Poisson (1781–1840) and published, together with his probability theory, in 1838 in his work Recherches sur la probabilité des jugements en matière criminelle et en matière civile (“Research on the Probability of Judgments in Criminal and Civil Matters”). The work focused on certain random variables N that count, among other things, the number of discrete occurrences (sometimes called “arrivals”) that take place during a time-interval of given length.

If the expected number of occurrences in this interval is λ, then the probability that there are exactly k occurrences (k being a non-negative integer, k = 0, 1, 2, ...) is equal to

f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!},\,\!

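where e is the base of the natural logarithm (e ≈ 2.71828), k! is the factorial of k, and λ is a positive real number equal to the expected number of occurrences in the given interval.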

As a function of k, this is the probability mass function. The Poisson distribution can be derived as a limiting case of the binomial distribution.
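The probability mass function above can be evaluated directly. The following is a minimal sketch, not a reference implementation; the function name `poisson_pmf` is chosen here for illustration, and the computation is done in log space (an implementation choice, assumed here to avoid overflow in λ^k and k! for large k).

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Return f(k; lam) = lam**k * exp(-lam) / k! for lam > 0."""
    if k < 0:
        return 0.0
    # log f(k; lam) = k*log(lam) - lam - log(k!), with log(k!) = lgamma(k + 1)
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

if __name__ == "__main__":
    lam = 5.0  # illustrative value
    for k in range(11):
        print(k, poisson_pmf(k, lam))
```

Summing `poisson_pmf(k, lam)` over a sufficiently long range of k should give a value very close to 1.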

The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. A classic example is the nuclear decay of atoms.

The Poisson distribution is sometimes called a Poissonian, analogous to the term Gaussian for a Gauss or normal distribution.


Poisson noise and characterizing small occurrences

The parameter λ is not only the mean number of occurrences E[k], but also its variance \sigma_k^2=E[k^2]-E[k]^2 (see Table). Thus, the number of observed occurrences fluctuates about its mean λ with a standard deviation \sigma_k =\sqrt{\lambda}. These fluctuations are denoted as Poisson noise or (particularly in electronics) as shot noise.

The correlation of the mean and standard deviation in counting independent discrete occurrences is useful scientifically. By monitoring how the fluctuations vary with the mean signal, one can estimate the contribution of a single occurrence, even if that contribution is too small to be detected directly. For example, the charge e on an electron can be estimated by correlating the magnitude of an electric current with its shot noise. If on average N electrons pass a point in a given time t, the mean current is I = eN / t; since the current fluctuations should be of the order \sigma_I=e\sqrt{N}/t (i.e. e / t times the standard deviation of the Poisson count), the ratio \sigma_I^2/I equals e / t, so the charge can be estimated as e \approx t\sigma_I^2/I. An everyday example is the graininess that appears as photographs are enlarged; the graininess is due to Poisson fluctuations in the number of reduced silver grains, not to the individual grains themselves. By correlating the graininess with the degree of enlargement, one can estimate the contribution of an individual grain (which is otherwise too small to be seen unaided). Many other molecular applications of Poisson noise have been developed, e.g., estimating the number density of receptor molecules in a cell membrane.
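The shot-noise argument can be illustrated with a small simulation. This is only a sketch under stated assumptions: the numerical values of the charge, the interval and the mean count are hypothetical, the counts are drawn with a simple multiply-uniforms sampler, and the estimate uses the relation e = t·Var(I)/Mean(I), which follows from I = eN/t and σ_I = e√N/t.

```python
import math
import random

def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) count by multiplying uniforms (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def estimate_unit_charge(currents, t):
    """Estimate the charge per carrier as t * Var(I) / Mean(I)."""
    n = len(currents)
    mean_i = sum(currents) / n
    var_i = sum((i - mean_i) ** 2 for i in currents) / (n - 1)
    return t * var_i / mean_i

rng = random.Random(0)                             # fixed seed for repeatability
true_e, t, mean_count = 1.6e-19, 1.0e-9, 20.0      # hypothetical values
# Each measurement is the current carried by a Poisson number of charges in time t.
currents = [true_e * sample_poisson(mean_count, rng) / t for _ in range(5000)]
estimate = estimate_unit_charge(currents, t)
print(estimate / true_e)                           # ratio should be close to 1
```

The estimate never inspects an individual carrier; it only uses how the fluctuations scale with the mean current.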


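For a Poisson process with rate λ, the number of events N_t occurring in a time interval of length t has the distribution

    \Pr(N_t=k) = f(k;\lambda t) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}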

Related distributions

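If X_1, ..., X_n are independent Poisson random variables with parameters \lambda_1, ..., \lambda_n, then each X_i, conditioned on the total, is binomially distributed:
X_i \left|\sum_{j=1}^n X_j\right. \sim \mathrm{Binom}\left(\sum_{j=1}^nX_j,\frac{\lambda_i}{\sum_{j=1}^n\lambda_j}\right)
For large λ, the Poisson distribution is approximated by the normal distribution with mean λ and variance λ:
F_\mathrm{Poisson}(x;\lambda) \approx F_\mathrm{normal}(x;\mu=\lambda,\sigma^2=\lambda)\,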
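The normal approximation in the last formula can be checked numerically. The sketch below compares the exact Poisson CDF with the normal CDF for an illustrative λ = 100. The half-unit continuity correction (evaluating the normal CDF at x + 0.5) is an addition not stated in the formula above, and the function names are chosen here for illustration.

```python
import math

def poisson_cdf(x: int, lam: float) -> float:
    """Exact P(X <= x) for X ~ Pois(lam), summing the PMF term by term."""
    if x < 0:
        return 0.0
    term = math.exp(-lam)      # PMF at k = 0
    total = term
    for k in range(1, x + 1):
        term *= lam / k        # f(k) = f(k - 1) * lam / k
        total += term
    return total

def normal_cdf(x: float, mu: float, var: float) -> float:
    """CDF of the normal distribution with mean mu and variance var."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * var)))

lam = 100.0                                # illustrative value
exact = poisson_cdf(100, lam)
plain = normal_cdf(100.0, lam, lam)        # formula as written above
corrected = normal_cdf(100.5, lam, lam)    # with continuity correction (assumption)
print(exact, plain, corrected)
```

For this λ the continuity-corrected value lies noticeably closer to the exact CDF than the uncorrected one.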

Occurrence

The Poisson distribution arises in connection with Poisson processes. It applies to phenomena that are counted in whole numbers (that is, those that may happen 0, 1, 2, 3, ... times during a given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space. Examples of events that may be modelled by a Poisson distribution include:

How does this distribution arise? The law of rare events

Comparison of the Poisson distribution (black dots) and the binomial distribution with n=10 (red line), n=20 (blue line), n=1000 (green line). All distributions have a mean of 5. The horizontal axis shows the number of events k. Notice that as n gets larger, the Poisson distribution becomes an increasingly good approximation for the binomial distribution with the same mean.

In several of the above examples (for example, the number of mutations in a given sequence of DNA) the events being counted are actually the outcomes of discrete trials, and would more precisely be modelled using the binomial distribution, that is

X \sim \textrm{B}(n,p). \,

In such cases n is very large and p is very small (and so the expectation np is of intermediate magnitude). Then the distribution may be approximated by the less cumbersome Poisson distribution

X \sim \textrm{Pois}(np). \,

This is sometimes known as the law of rare events, since each of the n individual Bernoulli events rarely occurs. The name may be misleading because the total count of success events in a Poisson process need not be rare if the parameter np is not small. For example, the number of telephone calls to a busy switchboard in one hour follows a Poisson distribution with the events appearing frequent to the operator, but they are rare from the point of view of the average member of the population who is very unlikely to make a call to that switchboard in that hour.
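The quality of the approximation can be examined numerically. The sketch below, with function names chosen here for illustration, computes the largest pointwise gap between the B(n, λ/n) and Pois(λ) probability mass functions for the same mean of 5 and the same values of n used in the comparison figure.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Pois(lam)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def max_abs_gap(n: int, lam: float, k_max: int = 30) -> float:
    """Largest |B(n, lam/n) pmf - Pois(lam) pmf| over k = 0 .. min(n, k_max)."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(min(n, k_max) + 1)
    )

for n in (10, 20, 1000):
    print(n, max_abs_gap(n, 5.0))
```

The gap shrinks as n grows with np held fixed, which is the behaviour the law of rare events describes.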

Proof

We will prove that, for fixed λ, if

X_n \sim \textrm{B}(n,\lambda /n); \qquad Y\sim\textrm{Pois}(\lambda). \,

then for each fixed k

\lim_{n\to\infty}P(X_n=k) = P(Y=k).

To see the connection with the above discussion, for any binomial random variable with large n and small p set λ = np. Note that the expectation E(X_n) = λ is fixed with respect to n.

First, recall from calculus

\lim_{n\to\infty}\left(1-{\lambda \over n}\right)^n=e^{-\lambda},

then since p = λ / n in this case, we have


\begin{align}

\lim_{n\to\infty} P(X_n=k)&=\lim_{n\to\infty}{n \choose k} p^k (1-p)^{n-k} \\
 &=\lim_{n\to\infty}{n! \over (n-k)!k!} \left({\lambda \over n}\right)^k \left(1-{\lambda\over n}\right)^{n-k}\\
&=\lim_{n\to\infty}
\underbrace{\left[\frac{n!}{n^k\left(n-k\right)!}\right]}_{A_n}
\left(\frac{\lambda^k}{k!}\right)
\underbrace{\left(1-\frac{\lambda}{n}\right)^n}_{\to\exp\left(-\lambda\right)}
\underbrace{\left(1-\frac{\lambda}{n}\right)^{-k}}_{\to 1} \\
&= \left[ \lim_{n\to\infty} A_n \right] \left(\frac{\lambda^k}{k!}\right)\exp\left(-\lambda\right)
\end{align}

Next, note that


\begin{align}
A_n
  &= \frac{n!}{n^k\left(n-k\right)!}\\
  &= \frac{n\cdot (n-1)\cdots \big(n-(k-1)\big)}{n^k}\\
  &= 1\cdot(1-\tfrac{1}{n})\cdots(1-\tfrac{k-1}{n})\\
  &\to 1\cdot 1\cdots 1 = 1,
\end{align}

where we have taken the limit of each of the terms independently, which is permitted since there is a fixed number of terms with respect to n (there are k of them). Consequently, we have shown that

\lim_{n\to\infty}P(X_n=k) = \frac{\lambda^k \exp\left(-\lambda\right)}{k!} = P(Y=k).
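The key step of the proof, A_n → 1 for fixed k, is easy to observe numerically. This is a small illustrative sketch (the function name `a_n` is chosen here), using the product form of A_n from the derivation above.

```python
def a_n(n: int, k: int) -> float:
    """A_n = n! / (n**k * (n - k)!) as the product of (1 - j/n) for j = 0 .. k-1."""
    prod = 1.0
    for j in range(k):
        prod *= 1.0 - j / n
    return prod

k = 3  # fixed number of occurrences (illustrative)
for n in (10, 100, 10_000, 1_000_000):
    print(n, a_n(n, k))
```

Because the product has exactly k factors regardless of n, each factor can be taken to its limit separately, as the proof notes.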

Generalization

We have shown that if

X_n \sim \textrm{B}(n,p_n); \qquad Y\sim\textrm{Pois}(\lambda), \,

where p_n = λ / n, then X_n\to Y in distribution. This holds in the more general situation that p_n is any sequence such that

\lim_{n\rightarrow\infty} np_n = \lambda.

2-dimensional Poisson process

 P(N(D)=k)=\frac{(\lambda|D|)^k e^{-\lambda|D|}}{k!}

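where N(D) is the number of points falling in the region D, |D| is the area of D, and λ is the expected number of points per unit area.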

Properties

If X_i \sim \mathrm{Pois}(\lambda_i)\,, i = 1, ..., N, are independent Poisson random variables with parameters \lambda_i\,, then
Y = \sum_{i=1}^N X_i \sim \mathrm{Pois}\left(\sum_{i=1}^N \lambda_i\right)\,
also follows a Poisson distribution whose parameter is the sum of the component parameters. A converse is Raikov's theorem, which says that if the sum of two independent random variables is Poisson-distributed, then so is each of those two independent random variables.
The moment-generating function of the Poisson distribution with expected value λ is
\mathrm{E}\left(e^{tX}\right)=\sum_{k=0}^\infty e^{tk} f(k;\lambda)=\sum_{k=0}^\infty e^{tk} {\lambda^k e^{-\lambda} \over k!} =e^{\lambda(e^t-1)}.
The directed Kullback–Leibler divergence between Pois(λ) and Pois(λ_0) is
D_{\mathrm{KL}}(\lambda\|\lambda_0) = \lambda_0 - \lambda + \lambda \log \frac{\lambda}{\lambda_0}.
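The additivity property above can be verified numerically for two variables by convolving their probability mass functions. The following is a minimal sketch with illustrative parameter values; the helper names are chosen here and are not part of any library.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def sum_pmf(k: int, lam1: float, lam2: float) -> float:
    """P(X1 + X2 = k) for independent X1 ~ Pois(lam1), X2 ~ Pois(lam2)."""
    # Discrete convolution: sum over all ways to split k between X1 and X2.
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

lam1, lam2 = 1.5, 2.5  # illustrative values
for k in range(6):
    print(k, sum_pmf(k, lam1, lam2), poisson_pmf(k, lam1 + lam2))
```

The two printed columns should agree up to floating-point rounding, consistent with Y following Pois(λ_1 + λ_2).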

Generating Poisson-distributed random variables

A simple algorithm for generating Poisson-distributed random numbers is given by Knuth (see References below).

algorithm poisson random number (Knuth):
    init:
         Let L ← exp(−λ), k ← 0 and p ← 1.
    do:
         k ← k + 1.
         Generate uniform random number u in [0,1] and let p ← p × u.
    while p > L.
    return k − 1.

While simple, the algorithm has an expected running time that is linear in λ, since on average it draws about λ + 1 uniform numbers per sample. Many other algorithms avoid this cost; some are given in Ahrens & Dieter (see References below). For large values of λ there may also be numerical stability issues, because the term exp(−λ) underflows. One solution for large values of λ is rejection sampling; another is to use a Gaussian approximation to the Poisson.
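The pseudocode above translates almost line for line into Python. This is a sketch rather than a production sampler: the function name `poisson_knuth` is chosen here, and it is only suitable for moderate λ, because `math.exp(-lam)` underflows to zero for very large λ.

```python
import math
import random

def poisson_knuth(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate by multiplying uniforms until p <= exp(-lam)."""
    limit = math.exp(-lam)   # L in the pseudocode; underflows for very large lam
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()    # uniform random number in [0, 1)
        if p <= limit:
            return k - 1

rng = random.Random(42)      # fixed seed so the run is repeatable
draws = [poisson_knuth(4.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
print(sample_mean, sample_var)   # both should be near lam = 4
```

Comparing the sample mean and the sample variance is a quick sanity check, since both equal λ for a Poisson distribution.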

Parameter estimation

Maximum likelihood

Given a sample of n measured values k_i we wish to estimate the value of the parameter λ of the Poisson population from which the sample was drawn. To calculate the maximum likelihood value, we form the log-likelihood function


\begin{align}
L(\lambda) & = \ln \prod_{i=1}^n f(k_i \mid \lambda) \\
& = \sum_{i=1}^n \ln\!\left(\frac{e^{-\lambda}\lambda^{k_i}}{k_i!}\right) \\
& = -n\lambda + \left(\sum_{i=1}^n k_i\right) \ln(\lambda) - \sum_{i=1}^n \ln(k_i!). \end{align}

Take the derivative of L with respect to λ and equate it to zero:

\frac{\mathrm{d}}{\mathrm{d}\lambda} L(\lambda) = 0
\iff -n + \left(\sum_{i=1}^n k_i\right) \frac{1}{\lambda} = 0. \!

Solving for λ yields a stationary point, which, if the second derivative is negative, is the maximum-likelihood estimate of λ:

\widehat{\lambda}_\mathrm{MLE}=\frac{1}{n}\sum_{i=1}^n k_i. \!

The second derivative is negative for every λ > 0 provided at least one k_i is greater than zero, so this stationary point is indeed a maximum of the likelihood function:

\frac{\partial^2 L}{\partial \lambda^2} =  \sum_{i=1}^n -\lambda^{-2} k_i

Since each observation has expectation λ, so does the sample mean. Therefore it is an unbiased estimator of λ. It is also an efficient estimator, i.e. its estimation variance achieves the Cramér–Rao lower bound (CRLB). Hence it is the minimum-variance unbiased estimator (MVUE). It can also be proved that the sample mean is a complete and sufficient statistic for λ.
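The derivation can be checked on a small data set. In the sketch below the counts are hypothetical values made up for illustration; the code evaluates the log-likelihood from the derivation above and confirms that it is largest at the sample mean.

```python
import math

def log_likelihood(lam: float, counts) -> float:
    """L(lam) = -n*lam + (sum k_i) * ln(lam) - sum ln(k_i!)."""
    n = len(counts)
    return (-n * lam
            + sum(counts) * math.log(lam)
            - sum(math.lgamma(k + 1) for k in counts))

def poisson_mle(counts) -> float:
    """Maximum-likelihood estimate of lam: the sample mean."""
    return sum(counts) / len(counts)

counts = [2, 3, 0, 4, 1, 3, 2, 5]   # hypothetical observations
lam_hat = poisson_mle(counts)
print(lam_hat)
for lam in (lam_hat - 0.5, lam_hat, lam_hat + 0.5):
    print(lam, log_likelihood(lam, counts))
```

Moving λ away from the sample mean in either direction lowers the log-likelihood, as expected for a concave function with a single stationary point.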

Bayesian inference

In Bayesian inference, the conjugate prior for the rate parameter λ of the Poisson distribution is the Gamma distribution. Let

\lambda \sim \mathrm{Gamma}(\alpha, \beta) \!

denote that λ is distributed according to the Gamma density g parameterized in terms of a shape parameter α and an inverse scale parameter β:

 g(\lambda \mid \alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \; \lambda^{\alpha-1} \; e^{-\beta\,\lambda} \qquad \text{ for } \lambda>0 \,\!.

Then, given the same sample of n measured values k_i as before, and a prior of Gamma(α, β), the posterior distribution is

\lambda \sim \mathrm{Gamma}(\alpha + \sum_{i=1}^n k_i, \beta + n). \!

The posterior mean E[λ] approaches the maximum likelihood estimate \widehat{\lambda}_\mathrm{MLE} in the limit as \alpha\to 0,\ \beta\to 0.

The posterior predictive distribution of additional data is a Gamma-Poisson (i.e. negative binomial) distribution.
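The conjugate update amounts to two additions. The sketch below uses hypothetical counts and a hypothetical Gamma(2, 1) prior; it computes the posterior parameters and shows the posterior mean α/β approaching the sample mean as the prior parameters shrink toward zero.

```python
def gamma_posterior(alpha: float, beta: float, counts):
    """Posterior (shape, rate) after observing Poisson counts with a Gamma(alpha, beta) prior."""
    return alpha + sum(counts), beta + len(counts)

def gamma_mean(alpha: float, beta: float) -> float:
    """Mean of a Gamma distribution with shape alpha and inverse scale (rate) beta."""
    return alpha / beta

counts = [2, 3, 0, 4, 1, 3, 2, 5]          # hypothetical observations, sample mean 2.5
post_alpha, post_beta = gamma_posterior(2.0, 1.0, counts)
print(post_alpha, post_beta, gamma_mean(post_alpha, post_beta))

# A nearly flat prior: the posterior mean is close to the maximum likelihood estimate.
weak_alpha, weak_beta = gamma_posterior(1e-9, 1e-9, counts)
print(gamma_mean(weak_alpha, weak_beta))
```

With the informative prior the posterior mean is pulled from the sample mean toward the prior mean α/β; with the nearly flat prior the pull is negligible.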

The "law of small numbers"

The word law is sometimes used as a synonym of probability distribution, and convergence in law means convergence in distribution. Accordingly, the Poisson distribution is sometimes called the law of small numbers because it is the probability distribution of the number of occurrences of an event that happens rarely but has very many opportunities to happen. The Law of Small Numbers is a book by Ladislaus Bortkiewicz about the Poisson distribution, published in 1898. Some historians of mathematics have argued that the Poisson distribution should have been called the Bortkiewicz distribution.[6]

Notes

  1. Gullberg, Jan (1997). Mathematics from the birth of numbers. New York: W. W. Norton. pp. 963–965. ISBN 039304002X.
  2. NIST/SEMATECH, '6.3.3.1. Counts Control Charts', e-Handbook of Statistical Methods, accessed 25 October 2006.
  3. McCullagh, Peter; Nelder, John (1989). Generalized Linear Models. London: Chapman and Hall. ISBN 0-412-31760-5. Page 196 gives the approximation and the subsequent terms.
  4. Johnson, N.L., Kotz, S., Kemp, A.W. (1993). Univariate Discrete Distributions (2nd edition). Wiley. ISBN 0-471-54897-9, p. 163.
  5. Box, Hunter and Hunter. Statistics for experimenters. Wiley. p. 57.
  6. Good, I. J. (1986). "Some statistical applications of Poisson's work". Statistical Science 1 (2): 157–180. doi:10.1214/ss/1177013690. http://www.jstor.org/stable/2245435.

References
