## The Poisson Gamma Mixture Pattern

Suppose a random variable $N$ has a frequency distribution that is Poisson with parameter $\lambda$. Suppose the parameter $\lambda$ is also a random variable and it has a gamma distribution with parameters $\alpha$ and $\theta$. Then $N$ is equivalent to a negative binomial with parameters $r = \alpha$ and $\beta = \theta$.

Note that

1. When $\alpha =1$, the gamma distribution is equivalent to an exponential distribution.
2. This also means the negative binomial has parameter $r=1$ which is equivalent to a geometric distribution.
Pop Quiz!
You own a space mining company and have sent several exploration bots to scout possible mineral-rich asteroids.  Each bot discovers pockets of valuable resources on different asteroids at a rate of $\lambda$ per year.  The parameter $\lambda$ varies by bot according to an exponential distribution with parameter $\theta = 3$.
1. What is the expected number of discoveries per year for a bot chosen at random?
2. What is the variance?
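A quick sanity check on the quiz, using the Poisson-gamma pattern above: since $\lambda$ is exponential, $\alpha = 1$, so $N$ is geometric (negative binomial with $r = 1$) and $\beta = \theta = 3$.  The numbers below are a sketch of that calculation, cross-checked against the law of total expectation and variance.

```python
# Quiz check: lambda ~ exponential(theta = 3) is gamma with alpha = 1,
# so N is negative binomial with r = 1, beta = 3 (i.e. geometric).
r, beta = 1, 3

mean_N = r * beta                 # E[N] = r * beta
var_N = r * beta * (1 + beta)     # Var(N) = r * beta * (1 + beta)

# Cross-check via conditioning:
# E[N] = E[lambda] = theta
# Var(N) = E[Var(N|lambda)] + Var(E[N|lambda]) = E[lambda] + Var(lambda) = theta + theta^2
theta = 3
assert mean_N == theta
assert var_N == theta + theta**2

print(mean_N, var_N)  # 3 12
```

So a randomly chosen bot makes 3 discoveries per year on average, with variance 12.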

Filed under Frequency Models

## Frequency Models

Frequency models count the number of times an event occurs.

1. The number of customers to arrive each hour.
2. The number of coins lucky Tom finds on his way home from school.
3. How many scientists a Tyrannosaur eats on a certain day.
4. Etc.
This is in contrast to a severity model which measures the magnitude of an event.
1. How much a customer spends.
2. The value of a coin that lucky Tom finds.
3. The number of calories each scientist provides.
4. Etc.
The following distributions are used to model event frequency.  For notation, $p_n$ means $Pr(N=n)$.

## Poisson:

$\begin{array}{lr}\displaystyle p_n = e^{-\lambda} \frac{\lambda^n}{n!} & \lambda > 0 \end{array}$
Properties:
1. Parameter is $\lambda$.
2. Mean is $\lambda$.
3. Variance is $\lambda$.
4. If $N_1, N_2, ..., N_i$ are Poisson with parameters $\lambda_1, \lambda_2, ..., \lambda_i$, then $N = N_1 + N_2 + ... + N_i$ is Poisson with parameter $\lambda = \lambda_1 + \lambda_2 + ... + \lambda_i$.

## Negative Binomial:

$\begin{array}{lr} \displaystyle p_n = {{n+r-1}\choose{n}}\left(\frac{1}{1+\beta}\right)^r\left(\frac{\beta}{1+\beta}\right)^n & \beta>0, r>0 \end{array}$
Properties:
1. Parameters are $r$ and $\beta$.
2. Mean is $r\beta$.
3. Variance is $r\beta\left(1+\beta\right)$.
4. Variance is always greater than the mean.
5. Is equal to a Geometric distribution when $r=1$.
6. If $N_1, N_2, ..., N_i$ are negative binomial with parameters $\beta_1 = \beta_2 = ... = \beta_i$ and $r_1, r_2, ..., r_i$, then the sum $N = N_1 + N_2 + ... + N_i$ is negative binomial and has parameters $\beta = \beta_1$ and $r = r_1+r_2+...+r_i$.  Note: the $\beta$'s must be the same.

## Geometric:

$\begin{array}{lr} \displaystyle p_n = \frac{\beta^n}{\left(1+\beta\right)^{n+1}} & \beta>0 \end{array}$
Properties:
1. Parameter is $\beta$.
2. Mean is $\beta$.
3. Variance is $\beta\left(1+\beta\right)$.
4. If $N_1, N_2, ..., N_i$ are geometric with parameter $\beta$, then the sum $N = N_1+N_2+...+N_i$ is negative binomial with parameters $\beta$ and $r = i$.

## Binomial:

$\displaystyle p_n = {{m} \choose {n}}q^n\left(1-q\right)^{m-n}$
where $m$ is a positive integer and $0 < q < 1$.
Properties:
1. Parameters are $m$ and $q$.
2. Mean is $mq$.
3. Variance is $mq\left(1-q\right)$.
4. Variance is always less than mean.
5. If $N_1, N_2, ..., N_i$ is binomial with parameters $q$ and $m_1, m_2, ..., m_i$, then the sum $N=N_1+N_2+...+N_i$ is binomial with parameters $q$ and $m = m_1+m_2+...+m_i$.

## The (a,b,0) recursion:

These distributions can be reparameterized into a recursive formula with parameters $a$ and $b$.  When reparameterized, they all have the same recursive format.
$\displaystyle p_k = \left(a+ \frac{b}{k}\right)p_{k-1}$
It is more common to write
$\displaystyle \frac{p_k}{p_{k-1}} = a+\frac{b}{k}$
The parameters $a$ and $b$ are different for each distribution.
1. Poisson:
$a = 0$ and $b =\lambda$.
2. Negative Binomial:
$\displaystyle a = \frac{\beta}{1+\beta}$ and $\displaystyle b = \left(r-1\right)\frac{\beta}{1+\beta}$.
3. Geometric:
$\displaystyle a = \frac{\beta}{1+\beta}$ and $\displaystyle b = 0$.
4. Binomial:
$\displaystyle a = -\frac{q}{1-q}$ and $\displaystyle b = \left(m+1\right)\frac{q}{1-q}$.
Pop Quiz!
1. A frequency distribution has $a = 0.8$ and $b = 1.2$.  What distribution is this?
Answer: Negative binomial, because both parameters are positive.
2. A frequency distribution has mean 1 and variance 0.5.  What distribution is this?
Answer: Binomial because the variance is less than the mean.
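The recursion is easy to verify numerically.  This sketch (the parameter values are arbitrary) builds each $p_n$ from $p_0$ using the $(a,b,0)$ recursion and compares the result to the direct pmf formulas above.

```python
import math

def pmf_recursive(p0, a, b, n):
    """(a,b,0) recursion: p_k = (a + b/k) * p_{k-1}, starting from p0."""
    p = p0
    for k in range(1, n + 1):
        p *= a + b / k
    return p

# Poisson(lambda = 2): a = 0, b = lambda, p0 = e^{-lambda}
lam = 2.0
for n in range(6):
    direct = math.exp(-lam) * lam**n / math.factorial(n)
    assert abs(pmf_recursive(math.exp(-lam), 0.0, lam, n) - direct) < 1e-12

# Negative binomial(r = 2, beta = 1.5):
# a = beta/(1+beta), b = (r-1)*beta/(1+beta), p0 = (1+beta)^{-r}
r, beta = 2, 1.5
a = beta / (1 + beta)
b = (r - 1) * beta / (1 + beta)
p0 = (1 + beta) ** -r
for n in range(6):
    direct = math.comb(n + r - 1, n) * (1 / (1 + beta))**r * (beta / (1 + beta))**n
    assert abs(pmf_recursive(p0, a, b, n) - direct) < 1e-12
```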

Filed under Frequency Models, Probability

## Bonuses, Dividends, and Refunds

If a policy pays a reward to the participants when losses are below a certain level, this is a particular type of problem which Weishaus calls a “bonus” problem.  The bonus, dividend, or refund amount is expressed as a maximum of 0 and the refunded amount.  For example, suppose a 15% refund is paid on the difference between the $100 premium and the loss $L$, and no refund is paid if losses exceed $100.  The refund amount $R$ can be expressed as

$R = 0.15 \max (0, 100-L)$

The key to finding the expected refund is knowing how to manipulate the max function and rewrite it as a min.  We can rewrite as

$\begin{array}{rll} R &=& 0.15 \max (100-100,100-L) \\ &=& 0.15(100-\min (100,L)) \end{array}$

So the expected value is given by

$E[R] = 0.15(100 - E[L \wedge 100])$
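As a sketch of the computation, assume (purely for illustration, not from the example above) that $L$ is exponential with mean 100, so that $E[L \wedge 100] = 100(1-e^{-1})$ has a closed form.  The direct integral of $0.15\max(0, 100-L)$ against the density serves as a cross-check.

```python
import math

# Illustrative assumption: losses L ~ exponential with mean 100.
theta = 100.0
premium, rate = 100.0, 0.15

# Closed form for an exponential: E[L ^ d] = theta * (1 - e^{-d/theta})
limited = theta * (1 - math.exp(-premium / theta))
expected_refund = rate * (premium - limited)   # E[R] = 0.15 * (100 - E[L ^ 100])

# Cross-check: integrate E[0.15 * max(0, 100 - L)] directly (midpoint rule)
f = lambda l: (1 / theta) * math.exp(-l / theta)   # exponential density
steps = 100_000
h = premium / steps
direct = sum(rate * (premium - (i + 0.5) * h) * f((i + 0.5) * h) * h
             for i in range(steps))

assert abs(expected_refund - direct) < 1e-3
print(round(expected_refund, 3))  # ~5.518
```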

Filed under Coverage Modifications

## Other Coverage Modifications

Coinsurance $\alpha$ is the fraction of losses covered by the policy.  For example, $\alpha = 0.8$ means if a loss is incurred, 80% will be paid by the insurance company.  A claims limit $u$ is the maximum amount that will be paid.  The order in which coinsurance, claims limits, and deductibles is applied to a loss is important and will be specified by the problem.  The expected payment per loss when all three are present in a policy is given by

$E\left[Y\right] = \alpha \left[E\left[X\wedge u\right] - E\left[X \wedge d\right]\right]$

where $Y$ is the payment variable and $X$ is the original loss variable.  The second moment is given by

$E\left[Y^2\right] = \alpha^2\left(E\left[(X\wedge u)^2\right] - E\left[(X \wedge d)^2\right]-2d\left(E\left[X \wedge u\right]-E\left[X \wedge d\right]\right)\right)$

The second moment can be used to find the variance of the payment per loss.  If inflation $r$ is present, multiply the first moment by $(1+r)$, the second moment by $(1+r)^2$, and divide $u$ and $d$ by $(1+r)$.   For payment per payment, divide the expected values by $Pr(X>d)$, or $1-F(d)$.
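A numerical sketch of the per-loss formula, assuming an exponential loss distribution and arbitrary values for $\alpha$, $d$, and $u$ (none of these numbers come from the notes).  The payment per loss is $\alpha\left[(X \wedge u) - (X \wedge d)\right]$, which the integral below evaluates directly.

```python
import math

# Assumed for illustration: X ~ exponential with mean 1000
theta, d, u, alpha = 1000.0, 500.0, 5000.0, 0.8

def lim_ev(m):
    """E[X ^ m] for an exponential with mean theta."""
    return theta * (1 - math.exp(-m / theta))

expected_payment = alpha * (lim_ev(u) - lim_ev(d))

# Direct check: E[ alpha * (min(X,u) - min(X,d)) ] by midpoint integration
f = lambda x: (1 / theta) * math.exp(-x / theta)
steps, top = 200_000, 15_000.0
h = top / steps
direct = sum(alpha * (min(x, u) - min(x, d)) * f(x) * h
             for i in range(steps) for x in [(i + 0.5) * h])

assert abs(expected_payment - direct) < 0.5
print(round(expected_payment, 2))  # ~479.8
```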

## The Loss Elimination Ratio

If you impose a deductible $d$ on an insurance policy that you’ve written, what fraction of expected losses do you eliminate from your expected liability?  This is measured by the Loss Elimination Ratio $LER(d)$.

$\displaystyle LER(d) = \frac{E\left[X \wedge d\right]}{E\left[X\right]}$

Definitions:

1. Ordinary deductible $d$— The payment made by the writer of the policy is the loss $X$ minus the deductible $d$.  If the loss is less than $d$, then nothing is paid.
2. Franchise deductible $d_f$—  The payment made by the writer of the policy is the complete amount of the loss $X$ if $X$ is greater than $d_f$.
A common type of question considers what happens to LER if an inflation rate $r$ increases the amount of all losses, but the deductible remains unadjusted.  Let $X$ be the loss variable.  Then $Y=(1+r)X$ is the inflation adjusted loss variable.  If losses $Y$ are subject to deductible $d$, then
$\begin{array}{rll} \displaystyle LER_Y(d) &=& \frac{E\left[(1+r)X\wedge d\right]}{E\left[(1+r)X\right]} \\ \\ \displaystyle &=&\frac{(1+r)E\left[X\wedge \frac{d}{1+r}\right]}{(1+r)E\left[X\right]} \\ \\ &=& \frac{E\left[X \wedge \frac{d}{1+r}\right]}{E\left[X\right]}\end{array}$
Memorize:
$\displaystyle E\left[X \wedge d\right] = \int_0^d{x f(x) dx} + d\left(1-F(d)\right)$
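For an exponential loss with mean $\theta$, the memorized formula gives $E[X \wedge d] = \theta(1-e^{-d/\theta})$, so $LER(d) = 1-e^{-d/\theta}$.  A small sketch (the numbers are assumed, not from the notes) showing that an unadjusted deductible eliminates a smaller share of losses after inflation, exactly as the derivation above predicts:

```python
import math

# Assumed for illustration: exponential losses, mean theta, deductible d, inflation r
theta, d, r = 1000.0, 500.0, 0.10

ler = 1 - math.exp(-d / theta)                        # LER(d) for an exponential
ler_inflated = 1 - math.exp(-(d / (1 + r)) / theta)   # deductible effectively shrinks to d/(1+r)

assert ler_inflated < ler   # inflation erodes the deductible's effect
print(round(ler, 4), round(ler_inflated, 4))  # 0.3935 0.3653
```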

## The Lognormal Distribution

Review: If $X$ is normal with mean $\mu$ and standard deviation $\sigma$, then

$Z = \displaystyle \frac{X-\mu}{\sigma}$

is the Standard Normal Distribution with mean 0 and standard deviation 1.  To find the probability $Pr(X \le x)$, you would convert $X$ to the standard normal distribution and look up the values in the standard normal table.

$\begin{array}{rll} Pr(X \le x) &=& Pr\left(\displaystyle \frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) \\ \\ &=& \displaystyle Pr\left(Z \le \frac{x-\mu}{\sigma}\right) \\ \\ &=& \displaystyle \mathcal{N}\left(\frac{x-\mu}{\sigma}\right) \end{array}$

If $V$ is a weighted sum of $n$ normal random variables $X_i, i = 1, ..., n$, with means $\mu_i$, variances $\sigma^2_i$, and weights $w_i$, then

$\displaystyle E\left[\sum_{i=1}^n w_iX_i\right] = \sum_{i=1}^n w_i\mu_i$

and variance

$\displaystyle Var\left(\sum_{i=1}^n w_iX_i\right) = \sum_{i=1}^n \sum_{j=1}^n w_iw_j\sigma_{ij}$

where $\sigma_{ij}$ is the covariance between $X_i$ and $X_j$.  Note when $i=j$, $\sigma_{ij} = \sigma_i^2 = \sigma_j^2$.

Remember: A sum of random variables is not the same as a mixture distribution!  The expected value is the same, but the variance is not.  A sum of normal random variables is also normal.  So $V$ is normal with the above mean and variance.

Actuary Speak: This is called a stable distribution.  The sum of random variables from the same distribution family produces a random variable that is also from the same distribution family.

The fun stuff:
If $X$ is normal, then $Y = e^X$ is lognormal.  If $X$ has mean $\mu$ and standard deviation $\sigma$, then

$\begin{array}{rll} \displaystyle E\left[Y\right] &=& E\left[e^X\right] \\ \\ \displaystyle &=& e^{\mu + \frac{1}{2}\sigma^2} \\ \\ Var\left(e^X\right) &=& e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)\end{array}$
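These two moment formulas are worth verifying at least once.  The sketch below (with arbitrary $\mu$ and $\sigma$) integrates $e^x$ and $e^{2x}$ against the normal density numerically and compares against the closed forms.

```python
import math

mu, sigma = 0.5, 0.8  # illustrative normal parameters

closed_form_mean = math.exp(mu + 0.5 * sigma**2)
closed_form_var = math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)

# Numerical check: E[e^X] and E[e^{2X}] as integrals against the normal pdf
pdf = lambda x: math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))
steps, lo, hi = 200_000, mu - 12 * sigma, mu + 12 * sigma
h = (hi - lo) / steps
m1 = sum(math.exp(x) * pdf(x) * h for i in range(steps) for x in [lo + (i + 0.5) * h])
m2 = sum(math.exp(2 * x) * pdf(x) * h for i in range(steps) for x in [lo + (i + 0.5) * h])

assert abs(m1 - closed_form_mean) < 1e-6          # E[e^X] = e^{mu + sigma^2/2}
assert abs(m2 - m1**2 - closed_form_var) < 1e-4   # Var(e^X) = E[e^{2X}] - E[e^X]^2
```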

Recall $FV = e^\delta$ where $FV$ is the future value of an investment growing at a continuously compounded rate of $\delta$ for one period.  If the rate of growth is a normal distributed random variable, then the future value is lognormal.  The Black-Scholes model for option prices assumes stocks appreciate at a continuously compounded rate that is normally distributed.

$S_t = S_0e^{R(0,t)}$

where $S_t$ is the stock price at time $t$, $S_0$ is the current price, and $R(0,t)$ is the random variable for the rate of return from time 0 to t.  Now consider the situation where $R(0,t)$ is the sum of iid normal random variables $R(0,h) + R(h,2h) + ... + R((n-1)h,t)$, each having mean $\mu_h$ and variance $\sigma_h^2$, where $t = nh$.  Then

$\begin{array}{rll} E\left[R(0,t)\right] &=& n\mu_h \\ Var\left(R(0,t)\right) &=& n\sigma_h^2 \end{array}$

If $h$ represents 1 year, this says that the expected return in 10 years is 10 times the one year return and the standard deviation is $\sqrt{10}$ times the annual standard deviation.  This allows us to formulate a function for the mean and standard deviation with respect to time.  Suppose we write

$\begin{array}{rll} \displaystyle \mu(t) &=& \left(\alpha - \delta -\frac{1}{2}\sigma^2\right)t \\ \sigma(t) &=& \sigma \sqrt{t} \end{array}$

where $\alpha$ is the growth factor and $\delta$ is the continuous rate of dividend payout.  Since all normal random variables are transformations of the standard normal, we can write $R(0,t) =\mu(t)+Z\sigma(t)$ . The model for the stock price becomes

$\displaystyle S_t = S_0e^{\left(\alpha - \delta - \frac{1}{2}\sigma^2\right)t + Z\sigma\sqrt{t}}$

In this model, the expected value of the stock price at time $t$ is

$E\left[S_t\right] = S_0e^{(\alpha - \delta)t}$

Actuary Speak: The standard deviation $\sigma$ of the return rate is called the volatility of the stock.  This term comes from expressing the rate of return as an Ito process. $\mu(t)$ is called the drift term and $\sigma(t)$ is called the volatility term.

Confidence intervals: To find the range of stock prices that corresponds to a particular confidence interval, we need only look at the confidence interval on the standard normal distribution then translate that interval into stock prices using the equation for $S_t$.

Example: $z=[-1.96, 1.96]$ covers the central 95% of the standard normal distribution $\mathcal{N}(z)$.  Suppose $t = \frac{1}{3}$, $\alpha = 0.15$, $\delta = 0.01$, $\sigma = 0.3$, and $S_0 = 40$.  Then the 95% confidence interval for $S_t$ is

$\left[40e^{(0.15-0.01-\frac{1}{2}0.3^2)\frac{1}{3} + (-1.96)0.3\sqrt{\frac{1}{3}}},40e^{(0.15-0.01-\frac{1}{2}0.3^2)\frac{1}{3} + (1.96)0.3\sqrt{\frac{1}{3}}}\right]$

Which corresponds to the price interval of

$\left[29.40,57.98\right]$
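The arithmetic above can be sketched in a few lines:

```python
import math

# Reproduce the worked example: t = 1/3, alpha = 0.15, delta = 0.01, sigma = 0.3, S0 = 40
t, alpha, delta, sigma, S0 = 1/3, 0.15, 0.01, 0.3, 40.0
z = 1.96  # two-sided 95% critical value

mu_t = (alpha - delta - 0.5 * sigma**2) * t   # drift term mu(t)
sd_t = sigma * math.sqrt(t)                   # volatility term sigma(t)

low = S0 * math.exp(mu_t - z * sd_t)
high = S0 * math.exp(mu_t + z * sd_t)

print(round(low, 2), round(high, 2))  # 29.4 57.98
```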

Probabilities: Probability calculations on stock prices require a bit more mental gymnastics.

$\begin{array}{rll} \displaystyle Pr\left(S_t < K\right) &=& \displaystyle Pr\left(S_0e^{\mu(t) + Z\sigma(t)} < K\right) \\ \\ &=& \displaystyle Pr\left(Z < \frac{\ln{\frac{K}{S_0}} - \mu(t)}{\sigma(t)}\right) \\ \\ &=& \displaystyle \mathcal{N}\left(\frac{\ln{\frac{K}{S_0}} - \left(\alpha - \delta - \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}}\right) \end{array}$

Conditional Expected Value: Define

$\begin{array}{rll} \displaystyle d_1 &=& -\frac{\ln{\frac{K}{S_0}} - \left(\alpha - \delta + \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \\ \\ \displaystyle d_2 &=& -\frac{\ln{\frac{K}{S_0}}- \left(\alpha - \delta - \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \end{array}$

Then

$\begin{array}{rll} \displaystyle E\left[S_t|S_t<K\right] &=& \displaystyle S_0e^{(\alpha - \delta)t}\frac{\mathcal{N}(-d_1)}{\mathcal{N}(-d_2)} \\ \\ \displaystyle E\left[S_t|S_t>K\right] &=& \displaystyle S_0e^{(\alpha - \delta)t}\frac{\mathcal{N}(d_1)}{\mathcal{N}(d_2)} \end{array}$

This gives the expected stock price at time $t$ given that it is less than $K$ or greater than $K$ respectively.

Black-Scholes formula: A call option $C_t$ on stock $S_t$ has value $\max\left(0,S_t - K\right)$ at time $t$.  The option pays out if $S_t > K$.  So the value of this option at time 0 is the probability that it pays out at time $t$, discounted by the risk free interest rate $r$, and multiplied by the expected value of $S_t - K$ given that $S_t > K$.  In other words,

$\begin{array}{rll} \displaystyle C_0 &=& e^{-rt}Pr\left(S_t>K\right)E\left[S_t-K|S_t>K\right] \\ \\ &=& e^{-rt}\mathcal{N}(d_2)\left(E\left[S_t|S_t>K\right] - E\left[K|S_t>K\right]\right) \\ \\ &=& e^{-rt}\mathcal{N}(d_2)\left(S_0e^{(\alpha - \delta)t}\frac{\mathcal{N}(d_1)}{\mathcal{N}(d_2)} - K\right) \end{array}$

Black-Scholes makes the additional assumption that all investors are risk neutral.  This means assets do not pay a risk premium for being more risky.  Long story short, $\alpha - r = 0$ so $\alpha = r$.  So in the Black-Scholes formula:

$\begin{array}{rll} \displaystyle d_1 &=& -\frac{\ln{\frac{K}{S_0}} - \left(r - \delta + \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \\ \\ \displaystyle d_2 &=& -\frac{\ln{\frac{K}{S_0}}- \left(r- \delta - \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \end{array}$

Continuing our derivation of $C_0$ but replacing $\alpha$ with $r$,

$\begin{array}{rll} \displaystyle C_0 &=& e^{-rt}\mathcal{N}(d_2)\left(S_0e^{(r - \delta)t}\frac{\mathcal{N}(d_1)}{\mathcal{N}(d_2)} - K\right) \\ \\ &=& S_0e^{-\delta t}\mathcal{N}(d_1) - Ke^{-rt}\mathcal{N}(d_2)\end{array}$

For a put option $P_0$ with payout $K-S_t$ for $K>S_t$ and 0 otherwise,

$P_0 = Ke^{-rt}\mathcal{N}(-d_2) - S_0e^{-\delta t}\mathcal{N}(-d_1)$
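The two formulas can be sketched directly (the input values below are arbitrary, not from the notes), and put-call parity $C_0 - P_0 = S_0e^{-\delta t} - Ke^{-rt}$ falls out of the derivation as a built-in check:

```python
import math

def N(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def black_scholes(S0, K, r, delta, sigma, t):
    """Returns (call, put) prices using the d1, d2 conventions above."""
    d1 = -(math.log(K / S0) - (r - delta + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = -(math.log(K / S0) - (r - delta - 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    call = S0 * math.exp(-delta * t) * N(d1) - K * math.exp(-r * t) * N(d2)
    put = K * math.exp(-r * t) * N(-d2) - S0 * math.exp(-delta * t) * N(-d1)
    return call, put

# Arbitrary example inputs
S0, K, r, delta, sigma, t = 40.0, 42.0, 0.05, 0.0, 0.3, 0.5
call, put = black_scholes(S0, K, r, delta, sigma, t)

# Put-call parity: C - P = S0 e^{-delta t} - K e^{-r t}
parity = S0 * math.exp(-delta * t) - K * math.exp(-r * t)
assert abs((call - put) - parity) < 1e-10
```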

These are the famous Black-Scholes formulas for option pricing.  When derived on the back of a cocktail napkin, they are indispensable for impressing the ladies at your local bar.  :p

## Parametric Distributions

Parametric distributions are functions in several dimensions.  Various parametric distributions are given in the exam tables.  Each input variable or dimension of the distribution function is called a parameter.  While studying, it is important to keep in mind that parameters are simply abstract devices built into a distribution function which allow us, through their manipulation, to tweak the shape of the distribution.  Ultimately, we are still only interested in things like $Pr(X\le x)$ and the distribution function parameters are used to help describe the distribution of $X$.

Transformations

1. Scaling:  If a random variable $X$ has a scalable parametric distribution with parameters $(a_1, a_2, ..., a_n, \theta)$, then one of these parameters is called a scale parameter and is denoted by $\theta$.  The scalable property implies that $cX$ can be described with the same distribution function as $X$, except that its parameters are $(a_1,a_2,..., a_n,c\theta)$ where $c$ is the scale factor.  In terms of probability, scaling a random variable has the following effect: if $Y = cX$ with $c >0$, then $Pr(Y \le y) = Pr(cX\le y) = Pr(X \le \frac{y}{c})$.
Caveat: The Inverse Gaussian as given in the exam tables has a $\theta$ in its set of parameters; however, this is not a scale distribution.  To scale a Lognormal distribution, adjust the parameters to $(\mu + \ln{c}, \sigma)$ where $c$ is the scale factor and $\mu$ and $\sigma$ are the usual parameters.  All the rest of the distributions given in appendix A are scale distributions.
2. Raising to a power:  A random variable raised to a positive power is called transformed.  If it is raised to the power $-1$, it is called inverse.  If it is raised to any other negative power, it is called inverse transformed.  When raising to a power, the scale parameter must be readjusted to remain a scale parameter in the new distribution.
3. Exponentiating:  An example is the lognormal distribution.  If $X$ is normal, then $Y = e^X$ is lognormal.  In terms of probability, $F_Y(y) = F_X(\ln{y})$.
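A quick sketch of the scaling property using an exponential distribution, whose $\theta$ is a scale parameter (the values of $\theta$ and $c$ are arbitrary): scaling by $c$ gives another exponential with scale $c\theta$, and the probability identity above holds pointwise.

```python
import math

theta, c = 2.0, 3.0  # arbitrary scale parameter and scale factor

F_X = lambda x, th: 1 - math.exp(-x / th)   # exponential CDF with scale th

# If Y = cX, then Y ~ exponential(c*theta) and F_Y(y) = F_X(y/c)
for y in [0.5, 1.0, 4.0, 10.0]:
    assert abs(F_X(y, c * theta) - F_X(y / c, theta)) < 1e-12

# Note the lognormal exception from the caveat: scale by c via (mu + ln c, sigma),
# since e^{X + ln c} = c e^X.
```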
Splicing
You can create a new distribution function by defining different distribution probability densities on different domain intervals.  As long as the piecewise integral of the spliced distribution is 1, it is a valid distribution.  Since total probability has to be exactly 1, scaling is an important tool that allows us to do this.
Tail Weight
Since a density function must integrate to 1, it must tend to 0 at the extremities of its domain.  If density function A tends towards zero at a slower rate than density function B, then density A is said to have a heavier tail than density B.  Some important measures of tail weight:
1. The fewer positive raw or central moments that exist, the heavier the tail.
2. The limit of the ratio of one density or survival function over another may tend to zero or infinity depending on which has the greater tail weight.
3. An increasing hazard rate function implies a lighter tail and vice versa.
4. An increasing mean residual life function means a heavier tail and vice versa.