One strategy in mathematics is to start with a few statements, then build up more mathematics from these statements. The beginning statements are known as axioms. An axiom is typically something that is mathematically self-evident. From a relatively short list of axioms, deductive logic is used to prove other statements, called theorems or propositions. The area of mathematics known as probability is no different. Underlying probability is a handful of axioms from which we can derive all sorts of results. But what are these probability axioms?
Probability can be reduced to three axioms. This was first done by the mathematician Andrei Kolmogorov. It presupposes that we have a set of outcomes called the sample space S, subsets of S called events E1, E2, . . ., En, and a way of assigning a probability to any event E. The probability of the event E is denoted by P(E).
The first axiom of probability is that the probability of any event is a nonnegative real number. This means that the smallest that a probability can ever be is zero, and that it cannot be infinite. The numbers that we may use are the real numbers. This refers to both rational numbers, also known as fractions, and irrational numbers that cannot be written as fractions.
One thing to note is that this axiom says nothing about how large the probability of an event can be. The axiom does eliminate the possibility of negative probabilities. It reflects the notion that the smallest probability, reserved for impossible events, is zero.
The second axiom of probability is that the probability of the entire sample space is one. Symbolically we write P(S) = 1. Implicit in this axiom is the notion that the sample space is everything possible for our probability experiment and that there are no events outside of the sample space.
By itself, this axiom does not set an upper limit on the probabilities of events that are not the entire sample space. It does reflect that an event that is absolutely certain has a probability of 100%.
The third axiom of probability deals with mutually exclusive events. If E1 and E2 are mutually exclusive, meaning that they have an empty intersection, then P(E1 U E2) = P(E1) + P(E2), where U denotes the union.
The axiom actually covers the situation with several (even countably infinitely many) events, every pair of which is mutually exclusive. As long as this holds, the probability of the union of the events is the same as the sum of the probabilities:
P(E1 U E2 U . . . U En) = P(E1) + P(E2) + . . . + P(En)
Although this third axiom might not appear that useful, we will see that combined with the other two axioms it is quite powerful indeed.
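The three axioms above can be checked mechanically on a small example. The following is a minimal sketch, assuming a fair six-sided die as the sample space and equally likely outcomes; the die, the names SAMPLE_SPACE, prob, all_events and check_axioms are illustrative assumptions, not part of the axioms themselves.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: a fair six-sided die. The sample space S is {1, ..., 6}.
SAMPLE_SPACE = frozenset(range(1, 7))


def prob(event):
    """P(E) for an event E (a subset of S), assuming equally likely outcomes."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(SAMPLE_SPACE))


def all_events():
    """Yield every subset of the sample space (64 events for a die)."""
    outcomes = sorted(SAMPLE_SPACE)
    for size in range(len(outcomes) + 1):
        for combo in combinations(outcomes, size):
            yield frozenset(combo)


def check_axioms():
    """Return one boolean per axiom, checked over every event."""
    events = list(all_events())
    # Axiom 1: every probability is a nonnegative number.
    axiom1 = all(prob(e) >= 0 for e in events)
    # Axiom 2: the whole sample space has probability one.
    axiom2 = prob(SAMPLE_SPACE) == 1
    # Axiom 3: for mutually exclusive events, P(E1 U E2) = P(E1) + P(E2).
    axiom3 = all(
        prob(a | b) == prob(a) + prob(b)
        for a in events
        for b in events
        if not (a & b)  # empty intersection means mutually exclusive
    )
    return axiom1, axiom2, axiom3


print(check_axioms())
```

Exact fractions are used instead of floating-point numbers so that the equality in the third axiom is tested without rounding error.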
The three axioms set an upper bound for the probability of any event. We denote the complement of the event E by EC. From set theory, E and EC have an empty intersection, so they are mutually exclusive. Furthermore E U EC = S, the entire sample space.
These facts, combined with the axioms, give us:
1 = P(S) = P(E U EC) = P(E) + P(EC).
We rearrange the above equation and see that P(E) = 1 - P(EC). Since the first axiom tells us that P(EC) must be nonnegative, P(E) can be no larger than 1. We now have that an upper bound for the probability of any event is 1.
By rearranging the formula again we have P(EC) = 1 - P(E). In words, the probability of an event not occurring is one minus the probability that it does occur.
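The complement rule can be illustrated with a short computation. This is a sketch under assumptions: a fair six-sided die with equally likely outcomes, and the event "the roll is at most two"; the names SAMPLE_SPACE, prob, complement and at_most_two are hypothetical.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die.
SAMPLE_SPACE = frozenset(range(1, 7))


def prob(event):
    """P(E) under equally likely outcomes; outcomes outside S are ignored."""
    return Fraction(len(frozenset(event) & SAMPLE_SPACE), len(SAMPLE_SPACE))


def complement(event):
    """EC: every outcome in the sample space that is not in E."""
    return SAMPLE_SPACE - frozenset(event)


at_most_two = {1, 2}                       # the event E
p_e = prob(at_most_two)                    # P(E)
p_not_e = prob(complement(at_most_two))    # P(EC)

print(p_e, p_not_e, p_e + p_not_e)
```

Here P(E) is 1/3 and P(EC) is 2/3, so the two probabilities sum to 1, as the formula requires.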
The above equation also gives us a way to calculate the probability of the impossible event, denoted by the empty set. To see this, recall that the empty set is the complement of the universal set, in this case SC. Since 1 = P(S) + P(SC) = 1 + P(SC), by algebra we have P(SC) = 0.
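Neither the bound of 1 nor the zero probability of the empty set depends on outcomes being equally likely. The sketch below assumes a hypothetical loaded die whose weights (WEIGHTS) are made up for illustration; it computes the probability of an event by adding the weights of its outcomes, then checks both properties over every event.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical loaded die: the weights are illustrative and sum to 1.
WEIGHTS = {
    1: Fraction(1, 2),
    2: Fraction(1, 10),
    3: Fraction(1, 10),
    4: Fraction(1, 10),
    5: Fraction(1, 10),
    6: Fraction(1, 10),
}


def prob(event):
    """P(E) as the sum of the weights of the outcomes in E."""
    return sum((WEIGHTS[outcome] for outcome in event), Fraction(0))


def all_events():
    """Yield every subset of the sample space."""
    outcomes = sorted(WEIGHTS)
    for size in range(len(outcomes) + 1):
        for combo in combinations(outcomes, size):
            yield frozenset(combo)


# The impossible event is the empty set, the complement of the sample space.
p_empty = prob(frozenset())

# Every event has a probability between 0 and 1 inclusive.
all_bounded = all(0 <= prob(e) <= 1 for e in all_events())

print(p_empty, all_bounded)
```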
The above are just a couple of examples of properties that can be proved directly from the axioms. There are many more results in probability. But all of these theorems are logical extensions from the three axioms of probability.