
The Lognormal Distribution

Review: If X is normal with mean \mu and standard deviation \sigma, then

Z = \displaystyle \frac{X-\mu}{\sigma}

is a standard normal random variable with mean 0 and standard deviation 1.  To find the probability Pr(X \le x), convert X to the standard normal and look up the value in the standard normal table.

\begin{array}{rll} Pr(X \le x) &=& Pr\left(\displaystyle \frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) \\ \\ &=& \displaystyle Pr\left(Z \le \frac{x-\mu}{\sigma}\right) \\ \\ &=& \displaystyle \mathcal{N}\left(\frac{x-\mu}{\sigma}\right) \end{array}
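The table lookup can be checked numerically.  A minimal sketch, using Python's `math.erf` to build the standard normal CDF; the values mu = 5, sigma = 2, x = 7 are illustrative, not from the text.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF N(z) expressed via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Pr(X <= 7) for X ~ Normal(mu=5, sigma=2): standardize first.
mu, sigma, x = 5.0, 2.0, 7.0
z = (x - mu) / sigma          # z = 1.0
p = norm_cdf(z)
print(round(p, 4))            # table value for N(1.00) is 0.8413
```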

If V is a weighted sum of n normal random variables X_i, i = 1, ..., n, with means \mu_i, variances \sigma^2_i, and weights w_i, then the mean is

\displaystyle E\left[\sum_{i=1}^n w_iX_i\right] = \sum_{i=1}^n w_i\mu_i

and the variance is

\displaystyle Var\left(\sum_{i=1}^n w_iX_i\right) = \sum_{i=1}^n \sum_{j=1}^n w_iw_j\sigma_{ij}

where \sigma_{ij} is the covariance between X_i and X_j.  Note when i=j, \sigma_{ij} = \sigma_i^2 = \sigma_j^2.

Remember: A sum of random variables is not the same as a mixture distribution!  The expected value is the same, but the variance is not.  A sum of normal random variables is also normal, so V is normal with the mean and variance above.
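As a sketch of the double-sum variance formula, here is a two-asset example; the weights, means, and covariance matrix are made-up numbers for illustration.

```python
# Mean and variance of V = sum_i w_i X_i for jointly normal X_i.
# Two-asset illustration; cov[i][i] is sigma_i^2, cov[i][j] is sigma_ij.
w   = [0.6, 0.4]
mu  = [0.10, 0.05]
cov = [[0.04,  0.006],
       [0.006, 0.01]]

n = len(w)
mean_V = sum(wi * mi for wi, mi in zip(w, mu))
var_V  = sum(w[i] * w[j] * cov[i][j]
             for i in range(n) for j in range(n))
print(mean_V, var_V)   # 0.08 and 0.01888
```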

Actuary Speak: The normal is called a stable distribution.  A sum of independent random variables from a stable distribution family produces a random variable that is also from that family.

The fun stuff:
If X is normal, then Y = e^X is lognormal.  If X has mean \mu and standard deviation \sigma, then

\begin{array}{rll} \displaystyle E\left[Y\right] &=& E\left[e^X\right] \\ \\ \displaystyle &=& e^{\mu + \frac{1}{2}\sigma^2} \\ \\ Var\left(e^X\right) &=& e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)\end{array}
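A quick sanity check of the moment formula: compute E[e^X] from the closed form and compare it against brute-force numerical integration of e^x times the normal density.  The parameter values are illustrative.

```python
from math import exp, pi, sqrt

mu, sigma = 0.05, 0.2

# Closed-form lognormal moments.
mean_Y = exp(mu + 0.5 * sigma**2)
var_Y  = exp(2*mu + sigma**2) * (exp(sigma**2) - 1.0)

# Sanity check: E[e^X] by a Riemann sum of e^x * phi(x) over mu +/- 6 sigma.
phi = lambda x: exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2 * pi))
dx = 1e-4
numeric = sum(exp(x) * phi(x) * dx
              for x in (mu - 6*sigma + i*dx for i in range(int(12*sigma/dx))))
print(mean_Y, numeric)   # the two agree to several decimals
```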

Recall FV = e^\delta, where FV is the future value of an investment growing at a continuously compounded rate \delta for one period.  If the rate of growth is a normally distributed random variable, then the future value is lognormal.  The Black-Scholes model for option prices assumes stocks appreciate at a continuously compounded rate that is normally distributed.

S_t = S_0e^{R(0,t)}

where S_t is the stock price at time t, S_0 is the current price, and R(0,t) is the random variable for the rate of return from time 0 to t.  Now consider the situation where R(0,t) is the sum of n iid normal random variables R(0,h) + R(h,2h) + ... + R((n-1)h,nh), with t = nh, each having mean \mu_h and variance \sigma_h^2.  Then

\begin{array}{rll} E\left[R(0,t)\right] &=& n\mu_h \\ Var\left(R(0,t)\right) &=& n\sigma_h^2 \end{array}

If h represents 1 year, this says that the expected return over 10 years is 10 times the expected one-year return and the standard deviation is \sqrt{10} times the annual standard deviation.  This allows us to express the mean and standard deviation as functions of time.  Suppose we write

\begin{array}{rll} \displaystyle \mu(t) &=& \left(\alpha - \delta -\frac{1}{2}\sigma^2\right)t \\ \sigma(t) &=& \sigma \sqrt{t} \end{array}

where \alpha is the growth factor and \delta is the continuous rate of dividend payout.  Since all normal random variables are transformations of the standard normal, we can write R(0,t) = \mu(t) + Z\sigma(t).  The model for the stock price becomes

\displaystyle S_t = S_0e^{\left(\alpha - \delta - \frac{1}{2}\sigma^2\right)t + Z\sigma\sqrt{t}}

In this model, the expected value of the stock price at time t is

E\left[S_t\right] = S_0e^{(\alpha - \delta)t}
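A minimal sketch of the model, with illustrative parameter values: the median price S(0) (the Z = 0 path) sits below the expected value because the lognormal is right-skewed.

```python
from math import exp, sqrt

# S_t = S_0 exp((alpha - delta - sigma^2/2) t + Z sigma sqrt(t)).
# Parameter values are illustrative.
S0, alpha, delta, sigma, t = 40.0, 0.15, 0.01, 0.3, 1.0

def S(z):
    """Stock price at time t for a given draw z of the standard normal."""
    return S0 * exp((alpha - delta - 0.5 * sigma**2) * t + z * sigma * sqrt(t))

expected = S0 * exp((alpha - delta) * t)      # E[S_t] = S_0 e^{(alpha - delta) t}
print(round(S(0.0), 2), round(expected, 2))   # median 43.99 < mean 46.01
```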

Actuary Speak: The standard deviation \sigma of the return rate is called the volatility of the stock.  The term comes from expressing the rate of return as an Ito process, in which \mu(t) is called the drift term and \sigma(t) the volatility term.

Confidence intervals: To find the range of stock prices that corresponds to a particular confidence interval, we need only look at the confidence interval on the standard normal distribution then translate that interval into stock prices using the equation for S_t.

Example: z = [-1.96, 1.96] is the 95% confidence interval for the standard normal distribution \mathcal{N}(z).  Suppose t = \frac{1}{3}, \alpha = 0.15, \delta = 0.01, \sigma = 0.3, and S_0 = 40.  Then the 95% confidence interval for S_t is

\left[40e^{(0.15-0.01-\frac{1}{2}0.3^2)\frac{1}{3} + (-1.96)0.3\sqrt{\frac{1}{3}}},40e^{(0.15-0.01-\frac{1}{2}0.3^2)\frac{1}{3} + (1.96)0.3\sqrt{\frac{1}{3}}}\right]

which corresponds to the price interval of approximately \left[29.40, 57.98\right].

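A sketch of the interval computation, using the parameters from the example above:

```python
from math import exp, sqrt

# 95% confidence interval for S_t under the lognormal stock model.
S0, alpha, delta, sigma, t = 40.0, 0.15, 0.01, 0.3, 1.0/3.0
z = 1.96                                  # 95% standard normal quantile

m = (alpha - delta - 0.5 * sigma**2) * t  # mu(t)
s = sigma * sqrt(t)                       # sigma(t)
lo, hi = S0 * exp(m - z*s), S0 * exp(m + z*s)
print(round(lo, 2), round(hi, 2))         # about 29.40 and 57.98
```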
Probabilities: Probability calculations on stock prices require a bit more mental gymnastics.

\begin{array}{rll} \displaystyle Pr\left(S_t<K\right) &=& Pr\left(\frac{S_t}{S_0} < \frac{K}{S_0}\right) \\ \\ \displaystyle &=& Pr\left(\ln{\frac{S_t}{S_0}} < \ln{\frac{K}{S_0}}\right) \\ \\ \displaystyle &=& Pr\left(Z< \frac{\ln{\frac{K}{S_0}} - \mu(t)}{\sigma(t)}\right) \\ \\ \displaystyle &=& Pr\left(Z<\frac{\ln{\frac{K}{S_0}} - \left(\alpha - \delta - \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}}\right) \end{array}
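The gymnastics reduce to a few lines of code.  A sketch reusing the example's parameters; K = 45 is an illustrative threshold, not from the text.

```python
from math import erf, exp, log, sqrt

def norm_cdf(z):
    """Standard normal CDF N(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Pr(S_t < K) under the lognormal stock model; parameters as in the example.
S0, alpha, delta, sigma, t, K = 40.0, 0.15, 0.01, 0.3, 1.0/3.0, 45.0

z_star = (log(K/S0) - (alpha - delta - 0.5*sigma**2)*t) / (sigma * sqrt(t))
prob = norm_cdf(z_star)
print(round(prob, 4))   # about 0.6905
```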

Conditional Expected Value: Define

\begin{array}{rll} \displaystyle d_1 &=& -\frac{\ln{\frac{K}{S_0}} - \left(\alpha - \delta + \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \\ \\ \displaystyle d_2 &=& -\frac{\ln{\frac{K}{S_0}}- \left(\alpha - \delta - \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \end{array}


\begin{array}{rll} \displaystyle E\left[S_t|S_t<K\right] &=& S_0e^{(\alpha - \delta)t}\frac{\mathcal{N}(-d_1)}{\mathcal{N}(-d_2)} \\ \\ \displaystyle E\left[S_t|S_t>K\right] &=& S_0e^{(\alpha - \delta)t}\frac{\mathcal{N}(d_1)}{\mathcal{N}(d_2)} \end{array}

This gives the expected stock price at time t given that it is less than K or greater than K respectively.
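A sketch of the conditional expectations, reusing the example's parameters with an illustrative K = 45.  The final assertion checks that the two conditional pieces, weighted by their probabilities, reassemble the unconditional mean.

```python
from math import erf, exp, log, sqrt

def N(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

S0, alpha, delta, sigma, t, K = 40.0, 0.15, 0.01, 0.3, 1.0/3.0, 45.0

d1 = -(log(K/S0) - (alpha - delta + 0.5*sigma**2)*t) / (sigma * sqrt(t))
d2 = -(log(K/S0) - (alpha - delta - 0.5*sigma**2)*t) / (sigma * sqrt(t))

forward = S0 * exp((alpha - delta) * t)     # unconditional E[S_t]
below = forward * N(-d1) / N(-d2)           # E[S_t | S_t < K]
above = forward * N(d1) / N(d2)             # E[S_t | S_t > K]
print(round(below, 2), round(above, 2))

# The pieces must average back to E[S_t]: Pr(S_t < K) = N(-d2), Pr(S_t > K) = N(d2).
assert abs(below * N(-d2) + above * N(d2) - forward) < 1e-9
```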

Black-Scholes formula: A call option C_t on stock S_t has value \max\left(0,S_t - K\right) at time t.  The option pays out if S_t > K.  So the value of this option at time 0 is the probability that it pays out at time t, multiplied by the expected value of S_t - K given that S_t > K, discounted at the risk-free interest rate r.  In other words,

\begin{array}{rll} \displaystyle C_0 &=& e^{-rt}Pr\left(S_t>K\right)E\left[S_t-K|S_t>K\right] \\ \\ &=& e^{-rt}\mathcal{N}(d_2)\left(E\left[S_t|S_t>K\right] - E\left[K|S_t>K\right]\right) \\ \\ &=& e^{-rt}\mathcal{N}(d_2)\left(S_0e^{(\alpha - \delta)t}\frac{\mathcal{N}(d_1)}{\mathcal{N}(d_2)} - K\right) \end{array}

Black-Scholes makes the additional assumption that investors are risk neutral.  This means assets do not earn a risk premium for bearing risk.  Long story short, the risk premium \alpha - r = 0, so \alpha = r.  So in the Black-Scholes formula:

\begin{array}{rll} \displaystyle d_1 &=& -\frac{\ln{\frac{K}{S_0}} - \left(r - \delta + \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \\ \\ \displaystyle d_2 &=& -\frac{\ln{\frac{K}{S_0}}- \left(r- \delta - \frac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \end{array}

Continuing our derivation of C_0 but replacing \alpha with r,

\begin{array}{rll} \displaystyle C_0 &=& e^{-rt}\mathcal{N}(d_2)\left(S_0e^{(r - \delta)t}\frac{\mathcal{N}(d_1)}{\mathcal{N}(d_2)} - K\right) \\ \\ &=& S_0e^{-\delta t}\mathcal{N}(d_1) - Ke^{-rt}\mathcal{N}(d_2)\end{array}

For a put option P_0 with payout K-S_t for K>S_t and 0 otherwise,

P_0 = Ke^{-rt}\mathcal{N}(-d_2) - S_0e^{-\delta t}\mathcal{N}(-d_1)

These are the famous Black-Scholes formulas for option pricing.  When derived on the back of a cocktail napkin, they are indispensable for impressing the ladies at your local bar.  :p
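Napkin optional: the formulas fit in a short function.  A sketch with illustrative parameters (r = 0.08 is an assumed risk-free rate); the assertion verifies put-call parity, which the closed forms satisfy exactly.

```python
from math import erf, exp, log, sqrt

def N(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def black_scholes(S0, K, r, delta, sigma, t):
    """European call and put prices under Black-Scholes."""
    d1 = (log(S0/K) + (r - delta + 0.5*sigma**2)*t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    call = S0 * exp(-delta*t) * N(d1) - K * exp(-r*t) * N(d2)
    put  = K * exp(-r*t) * N(-d2) - S0 * exp(-delta*t) * N(-d1)
    return call, put

c, p = black_scholes(S0=40.0, K=45.0, r=0.08, delta=0.01, sigma=0.3, t=1/3)

# Put-call parity: C - P = S0 e^{-delta t} - K e^{-r t}
assert abs((c - p) - (40.0*exp(-0.01/3) - 45.0*exp(-0.08/3))) < 1e-9
print(round(c, 4), round(p, 4))
```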


Filed under Parametric Models, Probability

Parametric Distributions

Parametric distributions are functions of several variables.  Various parametric distributions are given in the exam tables.  Each input variable, or dimension, of the distribution function is called a parameter.  While studying, it is important to keep in mind that parameters are simply abstract devices built into a distribution function which allow us, through their manipulation, to tweak the shape of the distribution.  Ultimately, we are still only interested in things like Pr(X\le x), and the parameters help describe the distribution of X.


  1. Scaling:  If a random variable X has a scalable parametric distribution with parameters (a_1, a_2, ..., a_n, \theta), then one of these parameters is called a scale parameter and is denoted by \theta.  The scalable property implies that cX can be described with the same distribution function as X, except that the parameters of its distribution are (a_1, a_2, ..., a_n, c\theta), where c is the scale factor.  In terms of probability, scaling a random variable has the following effect: if Y = cX with c > 0, then Pr(Y \le y) = Pr(cX \le y) = Pr(X \le \frac{y}{c}).
    Caveat: The Inverse Gaussian as given in the exam tables has a \theta in its set of parameters; however, it is not a scale distribution.  To scale a lognormal distribution by a factor c, adjust the parameters to (\mu + \ln{c}, \sigma), where \mu and \sigma are the usual parameters.  All the rest of the distributions given in Appendix A are scale distributions.
  2. Raising to a power:  A random variable raised to a positive power is called transformed.  If it is raised to the power -1, it is called inverse.  If it is raised to any other negative power, it is called inverse transformed.  When raising to a power, the scale parameter needs to be readjusted to remain a scale parameter in the new distribution.
  3. Exponentiating:  An example is the lognormal distribution.  If X is normal, then Y = e^X is lognormal.  In terms of probability, F_Y(y) = F_X(\ln{y}).
You can create a new distribution function by defining different probability densities on different intervals of the domain.  As long as the integral of the spliced density over all the pieces is 1, it is a valid distribution.  Since total probability has to be exactly 1, the component densities must be weighted (scaled) so that the pieces integrate to 1.
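The scaling properties in items 1 and 3 can be verified numerically.  A sketch checking Pr(cX \le y) = Pr(X \le y/c) for the exponential, and the \mu + \ln{c} shift for the lognormal; all parameter values are illustrative.

```python
from math import erf, exp, log, sqrt

N = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal CDF

# Exponential is a scale distribution: cX ~ Exponential(c * theta).
theta, c, y = 2.0, 3.0, 5.0
F_exp = lambda x, th: 1.0 - exp(-x / th)
assert abs(F_exp(y, c*theta) - F_exp(y/c, theta)) < 1e-12

# Lognormal scaling: with F_Y(y) = N((ln y - mu) / sigma), scaling by c
# shifts mu to mu + ln(c) while sigma is unchanged.
mu, sigma = 0.5, 0.8
F_ln = lambda x, m, s: N((log(x) - m) / s)
assert abs(F_ln(y/c, mu, sigma) - F_ln(y, mu + log(c), sigma)) < 1e-12
print("scaling checks pass")
```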
Tail Weight
Since a density function must integrate to 1, it must tend to 0 at the extremities of its domain.  If density function A tends towards zero at a slower rate than density function B, then density A is said to have a heavier tail than density B.  Some important measures of tail weight:
  1. Tail weight decreases as the number of positive raw or central moments that exist increases: the fewer moments exist, the heavier the tail.
  2. The limit of the ratio of one density or survival function over another tends to infinity if the numerator has the heavier tail, and to zero if it has the lighter tail.
  3. An increasing hazard rate function implies a lighter tail and vice versa.
  4. An increasing mean residual life function means a heavier tail and vice versa.
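Measure 2 can be illustrated by comparing a Pareto survival function against an exponential one; the parameter choices (Pareto with \alpha = 2, \theta = 1 against a unit exponential) are illustrative.  The ratio grows without bound, so the Pareto has the heavier tail.

```python
from math import exp

# Survival functions: Pareto(alpha=2, theta=1) vs Exponential(theta=1).
S_pareto = lambda x: (1.0 / (1.0 + x))**2
S_exp    = lambda x: exp(-x)

# The ratio S_pareto / S_exp blows up as x grows: heavier Pareto tail.
ratios = [S_pareto(x) / S_exp(x) for x in (5.0, 20.0, 50.0)]
assert ratios[0] < ratios[1] < ratios[2]
print(ratios)
```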


Filed under Parametric Models, Probability