# Conditional Variance

If $X$ is a random variable that depends on another random variable $I$, then

$Var(X) = E_I[Var_X(X|I)] + Var_I(E_X[X|I])$

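This is called the double expectation formula for variance; it is also known as the law of total variance.  It is important to keep track of which random variable in a problem is $X$ and which one is $I$.  Wieshaus calls $I$ the indicator variable.  In the above equation, $Var_X(X|I)$ and $E_X[X|I]$ are functions of $I$, so each of them is itself a random variable with its own distribution.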
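
A short sketch of why the formula holds, assuming only the identity $Var(A) = E[A^2] - E[A]^2$ and the fact that $E[A] = E_I[E[A|I]]$:

$E[X^2] = E_I[E_X[X^2|I]] = E_I[Var_X(X|I) + E_X[X|I]^2]$

$E[X]^2 = (E_I[E_X[X|I]])^2$

Subtracting the second line from the first gives

$Var(X) = E_I[Var_X(X|I)] + E_I[E_X[X|I]^2] - (E_I[E_X[X|I]])^2 = E_I[Var_X(X|I)] + Var_I(E_X[X|I])$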

Example 1:  Noemi and Harry work at Starbucks.  Noemi’s tip jar contains 30% dollars, 30% quarters, 20% dimes, 10% nickels and 10% pennies.  Harry’s tip jar contains 5% dollars, 10% quarters, 10% dimes, 35% nickels and 40% pennies.  A customer steals a coin from Harry’s jar with 99% probability and from Noemi’s jar with 1% probability.  What is the variance of the stolen amount?

1. Identify the random variables.
• The stolen amount is what we’re interested in, so this is $X$.
• The distribution of $X$ depends on which jar the coin came from, so the choice of jar is the indicator variable $I$.
2. Find the distribution of $E_X[X|I]$.
• $E_X[X|I=H] = 0.1065$ with 99% probability.
• $E_X[X|I=N] = 0.4010$ with 1% probability.
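3. $Var_I(E_X[X|I]) = 0.99(0.1065 - 0.109445)^2 + 0.01(0.4010 - 0.109445)^2 = 0.000858629$, where $0.109445 = 0.99(0.1065) + 0.01(0.4010)$ is the overall mean $E[X]$.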
4. Find the distribution of $Var_X(X|I)$
• $Var_X(X|I=H) = 0.04682275$ with 99% probability.
• $Var_X(X|I=N) = 0.16020900$ with 1% probability.
5. $E_I[Var_X(X|I)] = 0.99(0.04682275) + 0.01(0.16020900) = 0.04795661$
6. $Var(X) = 0.000858629 + 0.04795661 = 0.0488152$
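
The six steps above can be cross-checked numerically. What follows is a minimal Python sketch rather than the method used to produce the figures in this post. The names `values`, `jars`, and `mean_and_variance` are made up for illustration, and the sketch assumes the "dollars" in the jars are coins worth 1.00 each.

```python
# Coin values in dollars: dollar, quarter, dime, nickel, penny.
values = [1.00, 0.25, 0.10, 0.05, 0.01]

# For each jar: probability the coin is taken from it, and its coin mix
# in the same order as `values` (numbers from Example 1).
jars = {
    "Harry": (0.99, [0.05, 0.10, 0.10, 0.35, 0.40]),
    "Noemi": (0.01, [0.30, 0.30, 0.20, 0.10, 0.10]),
}


def mean_and_variance(probs, outcomes):
    """Mean and variance of a discrete distribution, via E[A^2] - E[A]^2."""
    mean = sum(p * x for p, x in zip(probs, outcomes))
    second_moment = sum(p * x * x for p, x in zip(probs, outcomes))
    return mean, second_moment - mean ** 2


jar_probs = [pick for pick, _ in jars.values()]

# Steps 2 and 4: conditional mean and variance of X for each jar.
cond_stats = [mean_and_variance(mix, values) for _, mix in jars.values()]
cond_means = [m for m, _ in cond_stats]
cond_vars = [v for _, v in cond_stats]

# Step 3: variance of the conditional mean, treated as a random variable in I.
_, var_of_means = mean_and_variance(jar_probs, cond_means)

# Step 5: expected value of the conditional variance.
mean_of_vars = sum(p * v for p, v in zip(jar_probs, cond_vars))

# Step 6: total variance.
total_var = mean_of_vars + var_of_means

# Cross-check: variance computed directly from the unconditional coin mix.
marginal = [
    sum(pick * mix[k] for pick, mix in jars.values())
    for k in range(len(values))
]
_, direct_var = mean_and_variance(marginal, values)

print(f"Var_I(E[X|I]) = {var_of_means:.9f}")
print(f"E_I[Var(X|I)] = {mean_of_vars:.8f}")
print(f"Var(X)        = {total_var:.7f}")
print(f"direct Var(X) = {direct_var:.7f}")
```

The first three printed values should agree with steps 3, 5, and 6. The last line computes the variance without conditioning at all, so it checks the decomposition independently.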


### 8 responses to “Conditional Variance”

1. Ivan

May I ask you a question?

If I know the conditional variances

Var(Y|X=x1),Var(Y|X=x2),Var(Y|X=x3) ,
where X=x1+x2+x3
How can I find the total variance Var(Y)? Using the conditional variances, should I multiply each one of them by something and add them up?
I applied the law of total probability but it did not work.

Thank You

2. uclatommy

Ivan,
You need to use the double expectation formula for variance. To get $E[Var(Y|X)]$, you calculate $Var(Y|X=x_1)Pr(X=x_1) + Var(Y|X=x_2)Pr(X=x_2) + Var(Y|X=x_3)Pr(X=x_3)$. Then you need to find $Var(E[Y|X])$. As you may know, $Var(A) = E[A^2] - E[A]^2$. So to calculate $Var(E[Y|X])$, you treat $E[Y|X]$ like any other random variable. This is sometimes hard to grasp but once you get used to it, it’s easy. $E[Y|X]$ is a variable that has the following possible values: $E[Y|X=x_1]$, $E[Y|X=x_2]$, $E[Y|X=x_3]$.
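
As a rough sketch of that recipe in Python (the function name `total_variance` and the numbers in the usage lines are hypothetical, chosen only for illustration):

```python
def total_variance(probs, cond_means, cond_vars):
    """Var(Y) = E[Var(Y|X)] + Var(E[Y|X]) for a discrete X.

    probs[i]      = Pr(X = x_i)
    cond_means[i] = E[Y | X = x_i]
    cond_vars[i]  = Var(Y | X = x_i)
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")

    # E[Var(Y|X)]: probability-weighted average of the conditional variances.
    expected_var = sum(p * v for p, v in zip(probs, cond_vars))

    # Var(E[Y|X]): treat E[Y|X] as a random variable A and use
    # Var(A) = E[A^2] - E[A]^2.
    mean = sum(p * m for p, m in zip(probs, cond_means))
    mean_sq = sum(p * m * m for p, m in zip(probs, cond_means))
    var_of_mean = mean_sq - mean ** 2

    return expected_var + var_of_mean


# Hypothetical inputs for three values x_1, x_2, x_3.
result = total_variance(
    probs=[0.2, 0.5, 0.3],
    cond_means=[1.0, 2.0, 4.0],
    cond_vars=[0.5, 1.0, 2.0],
)
print(result)
```

With these made-up inputs, $E[Var(Y|X)] = 1.2$ and $Var(E[Y|X]) = 7.0 - 2.4^2 = 1.24$, so the result is approximately 2.44.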

3. Ivan

Hi Thomas,

Thank you for your answer. I only get the first step that you described. The problem I want to solve is pretty similar to your example. I calculated the information that you have in points 2 and 4 of your example. After your explanation I also know how to compute point 5, namely E[Var(X|I)]. But using your example I cannot derive point 3, Var[E(X|I)]=0.00858629.
I tried it like this:
1. E(X)=0.1065×0.99+0.4010×0.01=0.109445
2. Var[E(X|I)]=0.99×(0.1065-0.109445)^2+0.01×(0.4010-0.109445)^2=0.000858629 (so I have one more zero after the decimal point than you?)

However, once I have the result for point 3 and all the answers from 1 to 5 (inclusive), how can I derive the final answer in point 6, Var(X)? What is the last step?

Thank You Again!

4. uclatommy

Ivan,
You are right. I am missing a zero in the answer to step 3. I’ve updated the post. Thanks for the correction.

5. Ivan

Can you help me solve this problem if you have time?
X is normally distributed with E(X)=10 and Var(X)=9.
$P(|X-1| \ge 2) = ?$
Is it $P(|X-1| \ge 2) = P(X-1 \le 2) - P(X-1 \le -2) = P(X \le 3) - P(X \le -1)$? From here on it is clear, but is it OK up to here? And what about:
$P(|X-1| \le 2) = ?$

6. ivan

I also cannot solve this problem. Sorry for bothering you, but I have a retake in Statistics II and I am hopeless 😦
“A zoological garden owns 12 animals of an almost extinct species. In a research centre, a new dangerous disease had been discovered. Now the director of the zoo wants to find out how many of her animals suffer from that disease. Unfortunately, the catching and examining of the animals is quite complicated and expensive. But 4 of the animals were examined, and exactly 1 of them was actually diseased.
State appropriate model assumptions, calculate maximum-likelihood estimates of
a) the total number of diseased animals in that population,
b) the probability that a (randomly chosen) animal suffers from that disease.”
=> It is a hypergeometric distribution, but how do I find the unknown parameter?

7. Sabrina

Hi Thomas,

I realize this is an older post, but it has really helped me, so thank you for posting this explanation. I was wondering how you computed that Variance in step 3?

• uclatommy

$Pr\left(E_X = 0.1065\right) = 0.99$ and $Pr\left(E_X = 0.4010\right) = 0.01$
What is the variance of $E_X$?