Conditional Variance

If X is a random variable that depends on another random variable I, then

Var(X) = E_I[Var_X(X|I)] + Var_I(E_X[X|I])

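This is sometimes called the double expectation formula for variance; it is more commonly known as the law of total variance.  It is important to keep track of which random variable in a problem is X and which one is I.  Weishaus calls I the indicator variable.  In the above equation, Var_X(X|I) and E_X[X|I] are functions of I, so each of them is itself a random variable whose distribution comes from I.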
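
When I takes only finitely many values, the two terms can be written out as weighted sums. The block below is a sketch of that special case, not a general proof; the shorthand p_i = Pr(I=i), \mu_i = E_X[X|I=i] and \sigma_i^2 = Var_X(X|I=i) is notation introduced here for illustration and does not appear elsewhere in this post.

```latex
\begin{align*}
\mu = E[X] &= \sum_i p_i \mu_i, \\
E_I[\mathrm{Var}_X(X \mid I)] &= \sum_i p_i \sigma_i^2, \\
\mathrm{Var}_I(E_X[X \mid I]) &= \sum_i p_i (\mu_i - \mu)^2, \\
\mathrm{Var}(X) &= \sum_i p_i \sigma_i^2 + \sum_i p_i (\mu_i - \mu)^2 .
\end{align*}
```

The first sum is the average spread within each case of I, and the second is the spread of the case means around the overall mean.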

Example 1:  Noemi and Harry work at Starbucks.  Noemi’s tip jar contains 30% dollars, 30% quarters, 20% dimes, 10% nickels and 10% pennies.  Harry’s tip jar contains 5% dollars, 10% quarters, 10% dimes, 35% nickels and 40% pennies.  A customer steals a coin from Harry’s jar with 99% probability and from Noemi’s jar with 1% probability.  What is the variance of the stolen amount?

  1. Identify the random variables.
    • The stolen amount is what we’re interested in so this is X.
    • The distribution of X depends on which jar the coin came from so the choice of jar is the indicator variable I.
  2. Find the distribution of E_X[X|I].
    • E_X[X|I=H] = 0.1065 with 99% probability.
    • E_X[X|I=N] = 0.4010 with 1% probability.
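  3. E[X] = 0.99(0.1065) + 0.01(0.4010) = 0.109445, so Var_I(E_X[X|I]) = 0.99(0.1065 - 0.109445)^2 + 0.01(0.4010 - 0.109445)^2 = 0.000858629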
  4. Find the distribution of Var_X(X|I)
    • Var_X(X|I=H) = 0.04682275 with 99% probability.
    • Var_X(X|I=N) = 0.16020900 with 1% probability.
  5. E_I[Var_X(X|I)] = 0.99(0.04682275) + 0.01(0.16020900) = 0.04795661
  6. Var(X) = 0.000858629 + 0.04795661 = 0.0488152
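
The six steps above can be checked numerically. The following is a minimal Python sketch, assuming coin values in dollars and the jar probabilities from Example 1; the names VALUES, JARS, P_JAR, mean_and_variance and total_variance are made up for this illustration.

```python
# Coin values in dollars: dollar, quarter, dime, nickel, penny.
VALUES = [1.00, 0.25, 0.10, 0.05, 0.01]

# Conditional distribution of the coin value given the jar (from Example 1).
JARS = {
    "H": [0.05, 0.10, 0.10, 0.35, 0.40],  # Harry's jar
    "N": [0.30, 0.30, 0.20, 0.10, 0.10],  # Noemi's jar
}

# Distribution of the indicator variable I (which jar the coin is stolen from).
P_JAR = {"H": 0.99, "N": 0.01}


def mean_and_variance(values, probs):
    """Return (mean, variance) of a discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    second_moment = sum(v * v * p for v, p in zip(values, probs))
    return mean, second_moment - mean ** 2


def total_variance(values, jars, p_jar):
    """Return (E_I[Var_X(X|I)], Var_I(E_X[X|I]), Var(X))."""
    # Steps 2 and 4: conditional mean and variance for each jar.
    cond = {jar: mean_and_variance(values, probs) for jar, probs in jars.items()}
    # Overall mean E[X], needed for the variance of the conditional means.
    overall_mean = sum(p_jar[j] * cond[j][0] for j in jars)
    # Step 5: average of the conditional variances.
    expected_var = sum(p_jar[j] * cond[j][1] for j in jars)
    # Step 3: variance of the conditional means.
    var_of_mean = sum(p_jar[j] * (cond[j][0] - overall_mean) ** 2 for j in jars)
    # Step 6: add the two pieces.
    return expected_var, var_of_mean, expected_var + var_of_mean


expected_var, var_of_mean, var_x = total_variance(VALUES, JARS, P_JAR)
print(f"E_I[Var_X(X|I)] = {expected_var:.8f}")
print(f"Var_I(E_X[X|I]) = {var_of_mean:.9f}")
print(f"Var(X)          = {var_x:.7f}")
```

As a cross-check, computing the variance directly from the mixed distribution (each coin probability weighted 99% Harry, 1% Noemi) should give the same Var(X).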



8 responses to “Conditional Variance”

  1. Ivan

    May I ask you a question?

    If I know the conditional variances

    Var(Y|X=x1), Var(Y|X=x2), Var(Y|X=x3),
    where X=x1+x2+x3,
    how can I find the total variance Var(Y)? Using the conditional variances, what should I multiply each of them by before adding them up?
    I applied the law of total probability, but it did not work.

    Thank You

  2. uclatommy

    You need to use the double expectation formula for variance. To get E[Var(Y|X)], you calculate Var(Y|X=x_1)Pr(X=x_1) + Var(Y|X=x_2)Pr(X=x_2) + Var(Y|X=x_3)Pr(X=x_3). Then you need to find Var(E[Y|X]). As you may know, Var(A) = E[A^2] - E[A]^2. So to calculate Var(E[Y|X]), you treat E[Y|X] like any other random variable. This is sometimes hard to grasp but once you get used to it, it’s easy. E[Y|X] is a variable that has the following possible values: E[Y|X=x_1], E[Y|X=x_2], E[Y|X=x_3].

  3. Ivan

    Hi Thomas,

    Thank you for your answer. I only understand the first step that you described. The problem I want to solve is pretty similar to your example. I calculated the information that you have in points 2 and 4 of your example. After your explanation I also know how to compute point 5, namely E[Var(X|I)]. But using your example I cannot derive point 3, Var[E(X|I)]=0.00858629.
    I tried it like this:
    1. E(X)=0.1065×0.99+0.4010×0.01=0.109445
    2. Var[E(X|I)]=0.99×(0.1065-0.109445)^2+0.01×(0.4010-0.109445)^2=0.000858629 (so I have one more zero after the decimal point than you)?

    However, once I have the result in point 3 and all the answers from 1 to 5 (inclusive), how can I derive the final answer in point 6, Var(X)? What is the last step?

    Thank You Again!

  4. uclatommy

    You are right. I am missing a zero in the answer to step 3. I’ve updated the post. Thanks for the correction.

  5. Ivan

    Can you help me solve this problem if you have time?
    X is normally distributed with E(X)=10 and Var(X)=9.
    P(|X-1| ≥ 2)=?
    Is it P(|X-1| ≥ 2) = P(X-1 ≤ 2) - P(X-1 ≤ -2) = P(X ≤ 3) - P(X ≤ -1)? From here on it is clear, but is it OK up to here? And what about:
    P(|X-1| ≤ 2)=?

  6. ivan

    I cannot solve this problem either. Sorry for bothering you, but I have a retake in Statistics II and I am hopeless 😦
    “A zoological garden owns 12 animals of an almost extinct species. In a research centre, a new dangerous disease has been discovered. Now the director of the zoo wants to find out how many of her animals suffer from that disease. Unfortunately, catching and examining the animals is quite complicated and expensive. But 4 of the animals were examined, and exactly 1 of them was actually diseased.
    State appropriate model assumptions, calculate maximum-likelihood estimates of
    a) the total number of diseased animals in that population,
    b) the probability that a (randomly chosen) animal suffers from that disease.”
    => It is a hypergeometric distribution, but how do I find the unknown parameter?

  7. Sabrina

    Hi Thomas,

    I realize this is an older post, but it has really helped me, so thank you for posting this explanation. I was wondering how you computed that Variance in step 3?
