This brief lesson introduces the concept of discrete distribution (probability mass function) on the example of Poisson distribution. We will also do more likelihood stuff.

1 Poisson distribution

Poisson distribution is a simple model for discrete and positive count data.

\[p\left( x \right) = \frac{{e^{ - \lambda } \lambda ^x }}{{x!}}\]

Where \(\lambda\) is the only parameter, which is also the mean of the distribution.

2 Simulating poisson data

Let’s explore the Poisson model a bit and simulate some data using rpois():

rpois(n=100, lambda=1)
##   [1] 2 1 0 0 2 1 2 2 2 0 2 3 0 2 0 1 2 0 0 2 2 1 1 1 0 1 0 1 0 1 0 1 0 2 0
##  [36] 3 0 0 2 2 0 0 1 2 1 0 1 0 0 0 1 0 1 0 2 1 0 1 2 3 1 0 0 2 0 2 2 2 1 0
##  [71] 1 0 2 2 3 2 0 1 2 1 0 1 2 0 0 1 0 3 0 1 0 3 0 2 0 0 1 4 2 2

Now try different values of \(\lambda\).

We can plot the “empirical frequency histogram” for simulated data with 3 different values of parameter \(\lambda\):

  plot(c(0,35), c(0,1), type="n",
       xlab="x", ylab="Frequency")
  
  hist(rpois(n=100, lambda=1), add=T, freq=FALSE, col="blue")
  hist(rpois(n=100, lambda=5), add=T, freq=FALSE, col="red")
  hist(rpois(n=100, lambda=20), add=T, freq=FALSE, col="green")

plot of chunk unnamed-chunk-2

Now we can plot the respective probability mass functions:

  plot(0:35, dpois(0:35, lambda=1), 
       col="blue", pch=19, 
       ylab="Probability mass",
       xlab="x")
  points(0:35, dpois(0:35, lambda=5), col="red", pch=19)
  points(0:35, dpois(0:35, lambda=20), col="green", pch=19)

plot of chunk unnamed-chunk-3

3 Poisson likelihood

I will generate the artificial data again:

x <- rpois(n=100, lambda=8)
x
##   [1]  8  1  8  6  4  9  6  8  9 10 15  8  8 16  5  8  5  6  8  9  9 11  8
##  [24]  6  8  8  9  9  9  4 15  6 11  7 10  6  5 10 11  5 10  7  8  5  6  4
##  [47] 10  7  6  8  6  7  6  8  7 11  7 10 12 11 12  6  6  8  6  6 10  4  8
##  [70]  7  3  3 13  7  9  7 11 11  7  7  7  6 10  9 13  6  6  7 13  7 12  5
##  [93] 13  4 10  5 10  6  6 11

Poisson distribution is didactically convenient as it only has the one parameter \(\lambda\). We can write the deviance (negative log-likelihood) function for a given dataset \(x\):

  deviance.function <- function(x, lambda)
  {
    LL <- dpois(x, lambda, log=TRUE) # the log likelihood
    deviance <- -sum(LL)
    return(deviance)
  }

We can try different values of \(\lambda\):

  deviance.function(x, lambda=10)
## [1] 264.2

And we can evaluate the deviance.function for values of \(\lambda\) between 0 and 35:

  lambdas <- 0:35
  deviances <- numeric(36) # empty numeric vector
  
  for(i in 1:36)
  {
    deviances[i] <- deviance.function(x, lambdas[i])  
  }

Let’s check what we have:

  deviances
##  [1]    Inf 1199.3  746.9  523.7  394.5  316.6  271.3  248.4  242.0  248.1
## [11]  264.2  288.2  318.9  355.1  396.0  441.0  489.6  541.3  595.7  652.6
## [21]  711.7  772.8  835.8  900.3  966.4 1033.9 1102.6 1172.6 1243.6 1315.6
## [31] 1388.6 1462.4 1537.1 1612.6 1688.8 1765.7

Our maximum likelihood exstimate (MLE) of \(\lambda\) is:

MLE <- lambdas[which.min(deviances)] 
MLE
## [1] 8

And this is the plot of the deviance function, together with the MLE (vertical line):

plot(lambdas, deviances, ylab="Deviance (Negative Log Likelihood)")
abline(v=MLE)

plot of chunk unnamed-chunk-10