
## Recursive Discrete Aggregate Loss

You have two six-sided dice.  You roll the first die to determine the number of times you will roll the second die.  The sum of the results of the rolls of the second die is the aggregate loss.  Since the frequency and severity are discrete, for any aggregate loss amount the number of combinations of rolls producing that amount is countable and finite.  For example, an aggregate loss of 3 can be arrived at by rolling a 1 on the first die and then a 3 on the second; or rolling a 2 and then the combinations (1,2), (2,1); or rolling a 3 and then (1,1,1) on the second die.  The probability of experiencing an aggregate loss of 3 is:

$\begin{array}{rll} \Pr(S=3) \displaystyle &=& \frac{1}{6^2} + \frac{2}{6^3} + \frac{1}{6^4} \\ \\ \displaystyle &=& \frac{49}{6^4} \end{array}$
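As a sanity check (my own sketch, not part of the original), the enumeration can be done by brute force in Python, using exact fractions so the result matches $49/6^4$ exactly:

```python
from fractions import Fraction
from itertools import product

# Enumerate every outcome: the first die gives the frequency n,
# then every sequence of n rolls of the second die gives a severity sum.
pmf = {}
for n in range(1, 7):                              # first die: n = 1..6
    p_outcome = Fraction(1, 6) ** (n + 1)          # Pr(N=n) * Pr(one severity sequence)
    for rolls in product(range(1, 7), repeat=n):
        s = sum(rolls)
        pmf[s] = pmf.get(s, Fraction(0)) + p_outcome

print(pmf[3])  # 49/1296, i.e. 49/6^4
```

The largest case (n = 6) has $6^6$ sequences, so the full enumeration is small enough to run instantly.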

This method of calculating the probability is called the convolution method.  Now imagine the frequency and severity distributions are discrete but have infinite support.  Calculating $\Pr(S=10)$ would require summing the probabilities of many possible combinations.  If the frequency distribution is from the (a,b,0) class, there is a recursive formula that calculates this.  It is given by:

$g_k = \displaystyle \frac{1}{1-af_0}\sum_{j=1}^k \left(a+\frac{bj}{k}\right)f_jg_{k-j}$

where $k$ is a positive integer, $g_k = \Pr(S=k)=f_S(k)$, $f_n = \Pr(X=n)$, $p_n = \Pr(N=n)$, and $a$ and $b$ are the parameters of the (a,b,0) frequency distribution.  This is called the recursive method.  To start the recursion, you need to find $g_0 = \Pr(S=0)$.  You can then find any $g_k$.  If a problem asks for $F_S(3)$, this is equal to $g_0+g_1+g_2+g_3$: iterate through the recursion to find each $g_k$, then add them together.
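For concreteness, here is a minimal Python sketch of the recursion for a Poisson frequency (the function name and the sanity check are my own illustration, not from the original).  Poisson is in the (a,b,0) class with $a=0$, $b=\lambda$, and starting value $g_0 = e^{-\lambda(1-f_0)}$:

```python
import math

def panjer_poisson(lam, f, kmax):
    """Recursive (Panjer) method for a compound Poisson aggregate loss.
    f[j] = Pr(X = j) for a severity on {0, 1, 2, ...}; returns [g_0..g_kmax]."""
    a, b = 0.0, lam                              # Poisson: a = 0, b = lambda
    g = [math.exp(-lam * (1.0 - f[0]))]          # g_0 = Pr(S = 0)
    for k in range(1, kmax + 1):
        total = sum((a + b * j / k) * f[j] * g[k - j]
                    for j in range(1, min(k, len(f) - 1) + 1))
        g.append(total / (1.0 - a * f[0]))
    return g

# Sanity check: severity identically 1 makes S ~ Poisson(lam),
# so g_k should equal exp(-lam) * lam^k / k!.
g = panjer_poisson(3.0, [0.0, 1.0], 4)
```

With a degenerate severity the recursion reproduces the Poisson pmf term by term, which is a quick way to check an implementation before using it on a real severity distribution.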

## Approximating Aggregate Losses

An aggregate loss $S$ is the sum of all losses in a certain period of time.  There are an unknown number $N$ of losses that may occur and each loss is an unknown amount $X$.  $N$ is called the frequency random variable and $X$ is called the severity.  This situation can be modeled using a compound distribution of $N$ and $X$.  The model is specified by:

$\displaystyle S = \sum_{n=1}^N X_n$

where $N$ is the random variable for frequency and the $X_n$'s are IID random variables for severity.  This type of structure is called a collective risk model.

An alternative way to model aggregate loss is to model each risk using a different distribution appropriate to that risk.  For example, in a portfolio of risks, one may be modeled using a Pareto distribution and another may be modeled with an exponential distribution.  The expected aggregate loss would be the sum of the individual expected losses.  This is called an individual risk model and is given by:

$\displaystyle S = \sum_{i=1}^n X_i$

where $n$ is the number of individual risks in the portfolio and the $X_i$'s are random variables for the individual losses.  The $X_i$'s are NOT IID, and $n$ is known.

Both of these models are tested in the exam; however, the individual risk model is usually tested in combination with the collective risk model.  An example of a problem structure that combines the two is given below.

Example 1: Your company sells car insurance policies.  The in-force policies are categorized into high-risk and low-risk groups.  In the high-risk group, the number of claims in a year is Poisson with a mean of 30.  The number of claims for the low-risk group is Poisson with a mean of 10.  The amount of each claim is exponentially distributed with mean $\theta = 200$.
Analysis: Being able to see the structure of the problem is an important first step in solving it.  In this situation, you would model the aggregate loss as an individual risk model with two individual risks: the high-risk and low-risk groups.  Within each group, you would model the group's aggregate loss using a collective risk model.  For the high-risk group, the frequency is Poisson with mean 30 and the severity is exponential with mean 200.  For the low-risk group, the frequency is Poisson with mean 10 and the severity is exponential with the same mean.

For these problems, you will need to know how to:

1. Find the expected aggregate loss.
2. Find the variance of aggregate loss.
3. Approximate the probability that the aggregate loss will be above or below a certain amount using a normal distribution.  Example: what is the probability that aggregate losses are below $5,000?
4. Determine how many risks would need to be in a portfolio for the probability of aggregate loss to reach a given level of certainty for a given amount.  Example: how many policies should you underwrite so that the aggregate loss is less than the expected aggregate loss with a 95% degree of certainty?
5. Determine how long your risk exposure should be for the probability of aggregate loss to reach a given level of certainty for a given amount.

Problems that require you to determine probabilities for the aggregate loss will usually state that you should use a normal approximation.  This requires calculating the expected aggregate loss and the variance of the aggregate loss.

MEMORIZE

Expected aggregate loss for a collective risk model is given by:

$E[S] = E[N]E[X]$

For the individual risk model, it is

$\displaystyle E[S] = \sum_{i=1}^n E[X_i]$

Variance under the collective risk model comes from the law of total variance, conditioning on the frequency:

$Var(S) = E[Var(S\mid N)] + Var(E[S\mid N])$

When frequency and severity are independent, the following shortcut is valid and is called the compound variance:

$Var(S) = E[N]Var(X) + Var(N)E[X]^2$

Variance under the individual risk model is additive (assuming the risks are independent):

$\displaystyle Var(S) = \sum_{i=1}^n Var(X_i)$

Example 2: Continuing from Example 1, calculate the mean and variance of the aggregate loss.  Assume frequency and severity are independent.

Answer: This is done by

1. Calculating the expected aggregate loss and variance in the high-risk group.
2. Calculating the expected aggregate loss and variance in the low-risk group.
3. Adding the expected values from both groups to get the total expected aggregate loss.
4. Adding the variances from both groups to get the total variance.

I will use subscripts $H$ and $L$ to denote the high- and low-risk groups respectively.
$E[S_H] = E[N_H]E[X_H] = 30\times 200 = 6,000$

$\begin{array}{rll} Var(S_H) &=& E[N_H]Var(X_H) + Var(N_H)E[X_H]^2 \\ &=& 30 \times 40,000 + 30 \times 200^2 \\ &=& 2,400,000 \end{array}$

$E[S_L] = E[N_L]E[X_L] = 10 \times 200 = 2,000$

$\begin{array}{rll} Var(S_L) &=& 10 \times 40,000 + 10 \times 200^2 \\ &=& 800,000 \end{array}$

Add expected values to get $E[S] = 6,000 + 2,000 = 8,000$

Add variances to get $Var(S) = 2,400,000 + 800,000 = 3,200,000$

Once the mean and variance of the aggregate loss have been calculated, you can use them to approximate probabilities for aggregate losses using a normal distribution.

Example 3: Continuing from Example 2, use a normal approximation for aggregate loss to calculate the probability that losses exceed $12,000.
Answer:  To solve this, you will need to calculate a $z$ value for the normal distribution using the expected value and variance found in Example 2.

$\begin{array}{rll} \Pr(S > 12,000) &=& 1- \Pr(S< 12,000) \\ \\ &=& \displaystyle 1-\Phi\left(\frac{12,000 - 8,000}{\sqrt{3,200,000}}\right) \\ \\ &=& 1 - \Phi(2.24) \\ \\ &=& 0.0125 \end{array}$
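Examples 2 and 3 can be reproduced in a few lines of Python (a sketch of mine, taking the severity mean of 200 and variance of 40,000 used in the solution; `compound_moments` is my own helper name):

```python
from math import sqrt
from statistics import NormalDist

def compound_moments(lam, ex, varx):
    """Compound variance shortcut with Poisson frequency: E[N] = Var(N) = lam,
    so E[S] = lam*E[X] and Var(S) = lam*Var(X) + lam*E[X]^2."""
    return lam * ex, lam * varx + lam * ex ** 2

ex, varx = 200, 40_000                          # severity mean and variance
es_h, vs_h = compound_moments(30, ex, varx)     # high-risk group
es_l, vs_l = compound_moments(10, ex, varx)     # low-risk group
es, vs = es_h + es_l, vs_h + vs_l               # individual risk model: add groups
print(es, vs)                                   # 8000 3200000

# Example 3: normal approximation for Pr(S > 12,000)
p = 1 - NormalDist(mu=es, sigma=sqrt(vs)).cdf(12_000)
print(round(p, 4))                              # 0.0127 (the text's 0.0125 rounds z to 2.24)
```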

CONTINUITY CORRECTION
Suppose in the above examples the severity $X$ is discrete — for example, $X$ is Poisson.  Then the aggregate loss $S$ is also discrete, and we need to add 0.5 to 12,000 in the calculation for $\Pr(S > 12,000)$, so we would instead calculate $\Pr(S > 12,000.5)$.  This is called a continuity correction and applies whenever we approximate a discrete random variable with a continuous one.  If we were interested in $\Pr(S<12,000)$, we would subtract 0.5 instead and calculate $\Pr(S < 11,999.5)$.  The correction has a greater effect when the domain of possible values is smaller.

Another type of problem I’ve encountered in the samples is constructed as follows:

Example 4: You drive a 1992 Honda Prelude Si piece-of-crap-mobile (no, that's my old car and you are driving it because I sold it to you to buy my Mercedes).  The number of failures per year is Poisson with mean 2.  The average cost of repair for each breakdown is $500 with a standard deviation of $1,000.  How many years do you have to continue driving the car so that the probability of the total maintenance cost exceeding 120% of the expected total maintenance cost is less than 10%?  (Assume the car is so crappy that it cannot deteriorate any further, so the failure rates and average repair costs remain constant every year.)

$E[S_1] = 1,000$

$\begin{array}{rll} Var(S_1) &=& 2 \times 1,000^2 + 2 \times 500^2 \\ &=& 2,500,000 \end{array}$

For $n$ years, we have

$E[S] = 1,000n$

$Var(S) = 2,500,000n$

According to the problem, we want the smallest $n$ such that $\Pr(S > 1,200n) \le 0.1$; at the boundary, $\Pr(S > 1,200n) = 0.1$.  Under the normal approximation, this implies

$\begin{array}{rll} \Pr(S>1,200n) &=& 1-\Pr(S<1,200n) \\ \\ &=& \displaystyle 1- \Phi\left(\frac{1,200n - 1,000n}{\sqrt{2,500,000n}}\right) \end{array}$

Which implies

$\displaystyle \Phi\left(\frac{200n}{\sqrt{2,500,000n}}\right) = 0.9$

The probability $0.9$ corresponds to a $z$ value of 1.28.  This implies

$\displaystyle \frac{200n}{\sqrt{2,500,000n}} = 1.28$

Solving for $n$ we have $n = 102.4$, so you would have to drive it for about 103 years.  LOL!
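The final algebra is easy to slip on, so here is a quick numerical check (my own sketch): squaring both sides of $200n/\sqrt{2{,}500{,}000n} = 1.28$ gives $40{,}000\,n^2 = 1.28^2 \times 2{,}500{,}000\,n$.

```python
from math import ceil

# Solve 200n / sqrt(2,500,000 n) = z by squaring both sides:
# n = z^2 * 2,500,000 / 40,000
z = 1.28                              # z-value for the 90th percentile
n = z ** 2 * 2_500_000 / 40_000
print(round(n, 1), ceil(n))           # 102.4 103
```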

## The Loss Elimination Ratio

If you impose a deductible $d$ on an insurance policy that you’ve written, what fraction of expected losses do you eliminate from your expected liability?  This is measured by the Loss Elimination Ratio $LER(d)$.

$\displaystyle LER(d) = \frac{E\left[X \wedge d\right]}{E\left[X\right]}$

Definitions:

1. Ordinary deductible $d$— The payment made by the writer of the policy is the loss $X$ minus the deductible $d$.  If the loss is less than $d$, then nothing is paid.
2. Franchise deductible $d_f$—  The payment made by the writer of the policy is the complete amount of the loss $X$ if $X$ is greater than $d_f$.

A common type of question considers what happens to the LER if an inflation rate $r$ increases the amount of all losses but the deductible remains unadjusted.  Let $X$ be the loss variable.  Then $Y=(1+r)X$ is the inflation-adjusted loss variable.  If losses $Y$ are subject to deductible $d$, then

$\begin{array}{rll} \displaystyle LER_Y(d) &=& \frac{E\left[(1+r)X\wedge d\right]}{E\left[(1+r)X\right]} \\ \\ \displaystyle &=&\frac{(1+r)E\left[X\wedge \frac{d}{1+r}\right]}{(1+r)E\left[X\right]} \\ \\ &=& \frac{E\left[X \wedge \frac{d}{1+r}\right]}{E\left[X\right]}\end{array}$
Memorize:

$\displaystyle E\left[X \wedge d\right] = \int_0^d{x f(x)\,dx} + d\left(1-F(d)\right)$
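To see the inflation effect in numbers, here is a sketch assuming an exponential severity with mean $\theta$ (my choice for illustration): there $E[X \wedge d] = \theta(1-e^{-d/\theta})$ and $E[X] = \theta$, so $LER(d) = 1-e^{-d/\theta}$, and the derivation above says the post-inflation LER is the same function evaluated at $d/(1+r)$.

```python
from math import exp

def ler_exponential(d, theta):
    """LER for an exponential severity: E[X ^ d]/E[X] = 1 - exp(-d/theta)."""
    return 1.0 - exp(-d / theta)

d, theta, r = 500, 1000, 0.10
before = ler_exponential(d, theta)              # LER at the original deductible
after = ler_exponential(d / (1 + r), theta)     # unadjusted deductible after inflation
print(round(before, 4), round(after, 4))        # 0.3935 0.3653
```

As expected, leaving the deductible unadjusted while losses inflate erodes the fraction of losses the deductible eliminates.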

## The Bernoulli Shortcut

If $X$ has a standard Bernoulli distribution, then it takes the value 1 with probability $q$ and the value 0 with probability $1-q$.  Any random variable that can take only two values is a scaled and translated version of the standard Bernoulli distribution.

Expected Value and Variance:

For a standard Bernoulli distribution, $E[X] = q$ and $Var(X) = q(1-q)$.  If $Y$ is a random variable that takes only the values $a$ and $b$ with probabilities $q$ and $(1-q)$ respectively, then

$\begin{array}{rl} Y &= (a-b)X +b \\ E[Y] &= (a-b)E[X] +b \\ Var(Y) &= (a-b)^2Var(X) \\ &= (a-b)^2q(1-q) \end{array}$
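The shortcut is a two-liner in code.  A sketch of mine (the helper name and the example values are assumptions):

```python
def two_point_moments(a, b, q):
    """Bernoulli shortcut: Y takes value a with probability q, b with 1-q.
    E[Y] = (a-b)q + b ; Var(Y) = (a-b)^2 q(1-q)."""
    mean = (a - b) * q + b
    var = (a - b) ** 2 * q * (1 - q)
    return mean, var

m, v = two_point_moments(10, 2, 0.25)
print(m, v)   # 4.0 12.0
```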


## Normal Approximation

If a random variable $Y$ is normal, you can map it to a standard normal distribution $X$ (useful for finding probabilities in the standard normal table) by the following relationship:

$Y = \mu_y + \sigma_yX$

Example 1:  $Y$ is normal.  $E[Y] = 100$ and $Var(Y) = 49$.  Then

$\begin{array}{rl} P(Y \leq 111.515) &= P(X \leq \frac{111.515 - 100}{\sqrt{49}}) \\ &= P(X \leq 1.645) \\ &= 0.95 \end{array}$

Example 2:  $Y$ has the same distribution as example 1.  Then $P(Y \leq y) = 0.9$ implies

$P(X \leq \frac{y - 100}{\sqrt{49}}) = 0.9$

Which implies:

$\frac{y - 100}{\sqrt{49}} = 1.2816$

Hence $y = 108.9712$.
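Both lookups can be checked numerically with `statistics.NormalDist` (Python 3.8+); for reference, the 90th percentile of the standard normal is $z \approx 1.2816$:

```python
from statistics import NormalDist

y = NormalDist(mu=100, sigma=7)
print(round(y.cdf(111.515), 4))    # 0.95   (z = 1.645)
print(round(y.inv_cdf(0.9), 2))    # 108.97 (= 100 + 7 * 1.2816)
```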

With regard to Central Limit Theorem:

By the Central Limit Theorem, the distribution of a sum of iid random variables converges to a normal distribution as the number of iid random variables increases.  This means that if the number of iid random variables is sufficiently large, we can get approximate probabilities by using a normal distribution approximation.

## Conditional Probability and Expectation

Conditional probability:

$\Pr(X\mid Y) = \displaystyle \frac{\Pr(X \cap Y)}{\Pr(Y)}$

Bayes Theorem:

$\Pr(A\mid B) = \displaystyle \frac{\Pr(B \mid A)\Pr(A)}{\Pr(B)}$

for continuous distributions:

$f_X(x\mid y) = \displaystyle \frac{f_Y(y \mid x)f_X(x)}{f_Y(y)}$

Recall for a joint distribution function $f(x,y)$,

$f_X(x) = \displaystyle \int_{-\infty}^\infty {f(x,y)dy}$

Law of Total Probability:  Suppose the events $B_1, \dots, B_n$ partition the sample space, i.e. $\displaystyle \sum_{i=1}^n \Pr(B_i) = 1$ and $\Pr(B_i \cap B_j) = 0$ for $i \ne j$.  Then for any event $A$,

$\begin{array}{rl} \Pr(A) &= \displaystyle \sum_{i=1}^n \Pr(A \cap B_i) \\ &= \displaystyle \sum_{i=1}^n \Pr(B_i)\Pr(A\mid B_i) \end{array}$

In many cases, you will need to use the law of total probability in conjunction with Bayes Theorem to find $P(A)$ or $P(B)$.
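A small made-up discrete example of that combination (the urns and the numbers are mine, purely for illustration):

```python
# Two urns B1, B2 partition the sample space; A is drawing a red ball.
pr_b = [0.6, 0.4]            # Pr(B1), Pr(B2)
pr_a_given_b = [0.2, 0.5]    # Pr(A | B_i)

# Law of total probability: Pr(A) = sum_i Pr(B_i) Pr(A | B_i)
pr_a = sum(pb * pa for pb, pa in zip(pr_b, pr_a_given_b))

# Bayes' theorem: Pr(B1 | A) = Pr(A | B1) Pr(B1) / Pr(A)
pr_b1_given_a = pr_b[0] * pr_a_given_b[0] / pr_a

print(round(pr_a, 2), round(pr_b1_given_a, 3))   # 0.32 0.375
```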

For a continuous distribution:

$\Pr(A) = \displaystyle \int\Pr(A\mid x)f(x)dx$

Conditional Mean:

$E_X[X] = E_Y[E_X[X\mid Y]]$

## Functions and Moments

Some distribution functions:

Survival function

$\displaystyle S(x) = 1-F(x) = \Pr(X>x)$

where $F(x)$ is a cumulative distribution function.

Hazard rate function

$\displaystyle h(x) = \frac{f(x)}{S(x)} = -\frac{d\ln{S(x)}}{dx}$

where $f(x)$ is a probability density function.

Cumulative hazard rate function

$\displaystyle H(x) =\int_{-\infty}^x{h(t)dt} = -\ln{S(x)}$

The following relationship is often useful:

$S(x) = \displaystyle e^{-\int_{-\infty}^x{h(t)dt}}$
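These relationships are easy to verify numerically for a distribution with a known closed form.  A sketch assuming an exponential with mean $\theta$ (my pick), for which $h(x) = 1/\theta$ is constant and $H(x) = x/\theta$:

```python
from math import exp, log

theta, x = 5.0, 2.0
S = exp(-x / theta)              # survival function
f = exp(-x / theta) / theta      # density
h = f / S                        # hazard rate: should be 1/theta = 0.2
H = -log(S)                      # cumulative hazard: should be x/theta = 0.4
print(round(h, 4), round(H, 4))  # 0.2 0.4
```

The identity $S(x) = e^{-H(x)}$ then holds by construction, since $H(x) = -\ln S(x)$.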

Expected Value:

$\displaystyle E[X] = \int_{-\infty}^\infty{xf(x)dx}$

Or more generally,

$\displaystyle E[g(X)] = \int_{-\infty}^\infty{g(x)f(x)dx}$

When $g(X) = X^n$, the expected value of such a function is called the $n$th raw moment and is denoted by $\mu'_n$.  Let $\mu$ be the first raw moment; that is, $\mu = E[X]$.  $E[(X-\mu)^n]$ is called the $n$th central moment and is denoted by $\mu_n$.

Moments are used to generate some statistical measures.

Variance $\sigma^2$

$\displaystyle Var(X) = E[(X-\mu)^2] = E(X^2) - E(X)^2$

The coefficient of variation is $\displaystyle \frac{\sigma}{\mu}$.

Skewness $\gamma_1$

$\displaystyle \gamma_1 = \frac{\mu_3}{\sigma^3}$

Kurtosis $\gamma_2$

$\displaystyle \gamma_2 = \frac{\mu_4}{\sigma^4}$

Covariance of two distribution functions

$\displaystyle Cov(X,Y) = E[(X-\mu_x)(Y-\mu_Y)] = E[XY] - E[X]E[Y]$

*Note: if $X$ and $Y$ are independent, $Cov(X,Y)=0$.  The converse is not true in general.

Correlation coefficient $\rho_{XY}$

$\displaystyle \rho_{XY} = \frac{Cov(X,Y)}{\sigma_X\sigma_Y}$

All of the above definitions should be memorized.  Some things that might be tested in the exam are:

• Given a particular distribution function, what happens to skewness or kurtosis in the limit of a certain parameter?
• What is the expected value, variance, skewness, kurtosis of a given distribution function?
• What is the covariance or correlation coefficient of two distribution functions?

Central moments can be calculated from raw moments.  Know how to calculate raw moments using the statistics function on the calculator; this can be a useful timesaver in the exam.  Using alternating positive and negative binomial coefficients, write an expression for $\mu_n$ with the raw moments and powers of $\mu$ as the two binomial terms:

$\displaystyle \mu_n = \sum_{k=0}^n \binom{n}{k}(-1)^k\,\mu'_{n-k}\,\mu^k$, where $\mu'_0 = 1$.

Example:

$\mu_4 = \mu'_4 - 4\mu'_3\mu + 6\mu'_2\mu^2 - 4\mu'_1\mu^3 + \mu^4$

Since $\mu'_1 = \mu$, the two terms on the end simplify to $-3\mu^4$.  The result is

$\mu_4 = \mu'_4 - 4\mu'_3\mu + 6\mu'_2\mu^2 - 3\mu^4$
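The alternating binomial pattern generalizes to any order.  A generic helper (my own, as a sketch), checked against the variance of a fair die ($\mu = 3.5$, $\mu'_2 = 91/6$, so $\mu_2 = 35/12 \approx 2.9167$):

```python
from math import comb

def central_from_raw(raw):
    """raw[j-1] = j-th raw moment; returns the len(raw)-th central moment via
    mu_n = sum_{k=0}^{n} C(n,k) (-1)^k mu'_{n-k} mu^k, with mu'_0 = 1."""
    n = len(raw)
    mu = raw[0]
    mp = [1.0] + list(raw)   # mp[j] = mu'_j
    return sum(comb(n, k) * (-1) ** k * mp[n - k] * mu ** k for k in range(n + 1))

# Raw moments of a fair six-sided die: E[X] and E[X^2]
raw = [sum(x ** j for x in range(1, 7)) / 6 for j in (1, 2)]
print(round(central_from_raw(raw), 4))   # 2.9167
```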

Moment Generating Function:

If the moment generating function $M(t)$ of a random variable $X$ is known, its $n$th raw moment can be found by taking the $n$th derivative of $M(t)$ and evaluating it at 0.  Moment generating functions take the form:

$M(t) = \displaystyle E[e^{tX}]$

If $Z = X + Y$ with $X$ and $Y$ independent, then $M_Z(t) = M_X(t)\cdot M_Y(t)$.
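The derivative-at-zero rule can be checked numerically with finite differences.  A sketch assuming an exponential with mean $\theta$ (my choice), whose MGF is $M(t) = 1/(1-\theta t)$ for $t < 1/\theta$, with $M'(0) = \theta$ and $M''(0) = 2\theta^2$:

```python
theta = 2.0

def M(t):
    """MGF of an exponential with mean theta."""
    return 1.0 / (1.0 - theta * t)

h = 1e-5
m1 = (M(h) - M(-h)) / (2 * h)            # central difference ~ M'(0) = E[X]
m2 = (M(h) - 2 * M(0) + M(-h)) / h ** 2  # second difference  ~ M''(0) = E[X^2]
print(round(m1, 4), round(m2, 2))        # 2.0 8.0
```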