
Probability of the Union of Three or More Sets


When two events are mutually exclusive, the probability of their union can be calculated with the addition rule. For a single roll of a die, rolling a number greater than four and rolling a number less than three are mutually exclusive events, with no outcomes in common. So to find the probability that one or the other occurs, we simply add the probability that we roll a number greater than four to the probability that we roll a number less than three. In symbols we have the following, where the capital P denotes “probability of”:

P(greater than four or less than three) = P(greater than four) + P(less than three) = 2/6 + 2/6 = 4/6.
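The calculation above can be checked by listing the outcomes of a single die. This is a minimal sketch that assumes a fair six-sided die; the helper name `prob` is ours and is used only for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (an assumption of this sketch).
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

greater_than_four = {n for n in outcomes if n > 4}  # {5, 6}
less_than_three = {n for n in outcomes if n < 3}    # {1, 2}

# The two events share no outcomes, so they are mutually exclusive.
assert greater_than_four & less_than_three == set()

p_union = prob(greater_than_four | less_than_three)
p_sum = prob(greater_than_four) + prob(less_than_three)
print(p_union, p_sum)  # 2/3 2/3, which is 4/6 in lowest terms
```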

If the events are not mutually exclusive, then we cannot simply add the probabilities of the events together. We must also subtract the probability of the intersection of the events. Given the events A and B:

P(A U B) = P(A) + P(B) - P(AB).

Here AB denotes the intersection of A and B. Adding P(A) and P(B) counts the elements that are in both A and B twice, and that is why we subtract the probability of the intersection.
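To see the subtraction at work, consider two events for a single die that do overlap. The events chosen here (an even number, and a number greater than three) are our own illustration and assume a fair die.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (an assumption of this sketch).
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

A = {n for n in outcomes if n % 2 == 0}  # even: {2, 4, 6}
B = {n for n in outcomes if n > 3}       # greater than three: {4, 5, 6}

# Adding P(A) and P(B) alone counts the shared outcomes 4 and 6 twice.
naive_sum = prob(A) + prob(B)

# Subtracting the intersection corrects the double count.
by_formula = prob(A) + prob(B) - prob(A & B)
direct = prob(A | B)

print(naive_sum, by_formula, direct)  # 1 2/3 2/3
```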

The question that arises from this is “Why stop with two sets? What is the probability of the union of more than two sets?”

Formula for Union of Three Sets

We will extend the above ideas to the situation where we have three sets, which we will denote A, B and C. We will not assume anything more than this, so there is the possibility that the sets have nonempty intersections. The goal will be to calculate the probability of the union of these three sets, or P(A U B U C).

The above discussion for two sets still holds. We can add together the probabilities of the individual sets A, B and C, but in doing this we have double counted some elements.

The elements in the intersection of A and B have been double counted as before, but now there are other elements that have potentially been counted twice. The elements in the intersection of A and C and in the intersection of B and C have now also been counted twice. So the probabilities of these intersections must also be subtracted.

But have we subtracted too much? There is something new to consider that we did not have to be concerned about when there were only two sets. Just as any two sets can have an intersection, all three sets can also have an intersection. An element that is in all three sets was counted three times when we added the probabilities of A, B and C, and then removed three times when we subtracted the three pairwise intersections. So in trying to make sure that we did not double count anything, we have not counted at all those elements that show up in all three sets. The probability of the intersection of all three sets must therefore be added back in.

Here is the formula that is derived from the above discussion:

P(A U B U C) = P(A) + P(B) + P(C) - P(AB) - P(AC) - P(BC) + P(ABC)
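The bookkeeping behind this formula can be traced by tallying how many times each element has been counted after each step. The sets below are hypothetical and were chosen so that every region of the Venn diagram contains exactly one element.

```python
from collections import Counter

# Hypothetical sets: 1, 3, 7 lie in one set only; 2, 4, 6 lie in exactly
# two sets; 5 lies in all three.
A = {1, 2, 4, 5}
B = {2, 3, 5, 6}
C = {4, 5, 6, 7}

count = Counter()

# Step 1: add the individual sets.
for s in (A, B, C):
    count.update(s)
after_singles = dict(count)  # element 5 has been counted 3 times

# Step 2: subtract the pairwise intersections.
for s in (A & B, A & C, B & C):
    count.subtract(s)
after_pairs = dict(count)    # element 5 is now counted 0 times

# Step 3: add the triple intersection back in.
count.update(A & B & C)

# Every element of the union is now counted exactly once.
print(all(count[x] == 1 for x in A | B | C))  # True
```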

Example

To see the formula for the probability of the union of three sets in action, suppose we are playing a board game that involves rolling two dice. Due to the rules of the game, we need at least one of the dice to show a two, three or four in order to win. What is the probability of this? We note that we are trying to calculate the probability of the union of three events: rolling at least one two, rolling at least one three, and rolling at least one four. So we can use the above formula with the following probabilities:

  • The probability of rolling at least one two is 11/36. The numerator here comes from the fact that there are six outcomes in which the first die is a two and six in which the second die is a two, but the one outcome where both dice are twos has been counted twice. This gives us 6 + 6 - 1 = 11.
  • The probability of rolling at least one three is 11/36, for the same reason as above.
  • The probability of rolling at least one four is 11/36, for the same reason as above.
  • The probability of rolling a two and a three is 2/36. Here we can simply list the possibilities: the two could come first or it could come second.
  • The probability of rolling a two and a four is 2/36, for the same reason that the probability of a two and a three is 2/36.
  • The probability of rolling a three and a four is 2/36, again for the same reason.
  • The probability of rolling a two, a three and a four is 0, because we are only rolling two dice and there is no way to get three different numbers with two dice.

We now use the formula and see that the probability of getting at least one two, three or four is

11/36 + 11/36 + 11/36 - 2/36 - 2/36 - 2/36 + 0 = 27/36 = 3/4.
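This answer can be confirmed by listing all 36 outcomes for two dice and comparing the formula with a direct count. The sketch below assumes two fair dice; the helper names `prob` and `shows` are ours.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair dice (an assumption of this sketch).
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a true/false test on a roll."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def shows(*numbers):
    """Event: every listed number appears on at least one of the dice."""
    return lambda r: all(n in r for n in numbers)

p2, p3, p4 = prob(shows(2)), prob(shows(3)), prob(shows(4))      # 11/36 each
p23, p24, p34 = prob(shows(2, 3)), prob(shows(2, 4)), prob(shows(3, 4))  # 2/36 each
p234 = prob(shows(2, 3, 4))                                       # 0

by_formula = p2 + p3 + p4 - p23 - p24 - p34 + p234

# Direct count: at least one die shows a two, three or four.
direct = prob(lambda r: any(n in r for n in (2, 3, 4)))

print(by_formula, direct)  # 3/4 3/4, which is 27/36 in lowest terms
```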

Formula for Probability of Union of Four Sets

The reasoning behind the formula for the probability of the union of four sets is similar to the reasoning for three sets. As the number of sets increases, the number of pairs, triples and so on increases as well. With four sets there are six pairwise intersections that must be subtracted, four triple intersections to add back in, and now a quadruple intersection that needs to be subtracted. Given four sets A, B, C and D, the formula for the union of these sets is as follows:

P(A U B U C U D) = P(A) + P(B) + P(C) + P(D) - P(AB) - P(AC) - P(AD) - P(BC) - P(BD) - P(CD) + P(ABC) + P(ABD) + P(ACD) + P(BCD) - P(ABCD).
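As a check on the four-set formula, we can write out all fifteen terms for a concrete case. The example here is our own: a whole number is drawn at random from 1 to 210, and the events are that it is divisible by 2, 3, 5 or 7 respectively. The helper `P` takes one or more events and returns the probability of their intersection.

```python
from fractions import Fraction

# Hypothetical sample space: a whole number drawn uniformly from 1 to 210.
space = range(1, 211)

def P(*events):
    """Probability of the intersection of the given events."""
    common = set(space).intersection(*events)
    return Fraction(len(common), len(space))

A = {n for n in space if n % 2 == 0}
B = {n for n in space if n % 3 == 0}
C = {n for n in space if n % 5 == 0}
D = {n for n in space if n % 7 == 0}

by_formula = (
    P(A) + P(B) + P(C) + P(D)
    - P(A, B) - P(A, C) - P(A, D) - P(B, C) - P(B, D) - P(C, D)
    + P(A, B, C) + P(A, B, D) + P(A, C, D) + P(B, C, D)
    - P(A, B, C, D)
)

# Direct count of the union for comparison.
direct = Fraction(len(A | B | C | D), len(space))

print(by_formula, direct)  # 27/35 27/35
```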

Overall Pattern

We could write formulas (that would look even scarier than the one above) for the probability of the union of more than four sets, but from studying the above formulas we should notice some patterns. These patterns continue to hold for unions of more than four sets: the terms alternate in sign as the number of sets being intersected grows. The probability of the union of any number of sets can be found as follows:

  1. Add the probabilities of the individual events.
  2. Subtract the probabilities of the intersections of every pair of events.
  3. Add the probabilities of the intersections of every set of three events.
  4. Subtract the probabilities of the intersections of every set of four events.
  5. Continue this process until the last probability is the probability of the intersection of the total number of sets that we started with.
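The steps above can be sketched as a short function that works for any number of events. This is one possible illustration and not a definitive implementation: the function name `union_probability` is ours, and it assumes a finite sample space with equally likely outcomes and events given as sets of outcomes.

```python
from fractions import Fraction
from itertools import combinations, product

def union_probability(events, sample_space):
    """Probability of the union of events, following the add/subtract pattern.

    Assumes every outcome in sample_space is equally likely and each event
    is a set of outcomes.
    """
    n = len(sample_space)
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        # Add for an odd number of events intersected, subtract for an even number.
        sign = 1 if k % 2 == 1 else -1
        for group in combinations(events, k):
            common = set.intersection(*group)
            total += sign * Fraction(len(common), n)
    return total

# Usage: two fair dice, with five events "at least one die shows k" for k = 1..5.
rolls = list(product(range(1, 7), repeat=2))
events = [{r for r in rolls if k in r} for k in range(1, 6)]

result = union_probability(events, rolls)
print(result)  # 35/36: the only losing roll is a pair of sixes
```

Note that the number of terms grows quickly, since every nonempty collection of events contributes one term. For a large number of events it is often easier to compute the probability of the complement instead.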


©2014 About.com. All rights reserved.