Markov’s inequality is a useful result in probability that gives information about a probability distribution. What makes it remarkable is that it holds for any distribution with nonnegative values, no matter what other features the distribution has. Markov’s inequality gives an upper bound on the proportion of the distribution that lies at or above a particular value.

### Statement of Markov’s Inequality

Markov’s inequality says that for a nonnegative random variable *X* and any positive real number *a*, the probability that *X* is greater than or equal to *a* is less than or equal to the expected value of *X* divided by *a*.

The above description can be stated more succinctly using mathematical notation. In symbols we write Markov’s inequality as:

*P*(*X* ≥ *a*) ≤ *E*(*X*) / *a*
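The bound is simple enough to compute directly. The following is a minimal sketch; the function name `markov_bound` and its input checks are illustrative choices, not part of the inequality itself. Capping the result at 1 is an added convenience, since a probability can never exceed 1.

```python
def markov_bound(mean: float, a: float) -> float:
    """Upper bound on P(X >= a) for a nonnegative X with E(X) = mean."""
    if mean < 0:
        raise ValueError("a nonnegative random variable cannot have a negative mean")
    if a <= 0:
        raise ValueError("a must be a positive real number")
    # Markov's inequality gives E(X) / a; cap at 1 because no
    # probability can be larger than 1.
    return min(1.0, mean / a)


# Hypothetical inputs: a variable with mean 2, threshold 8.
print(markov_bound(2.0, 8.0))
```

When *a* is no larger than the mean, the function returns 1.0, which reflects that the inequality carries no information in that range.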

### Illustration of the Inequality

To illustrate the inequality, suppose we have a distribution with nonnegative values (such as a chi-square distribution). If the random variable *X* has an expected value of 3, we can look at the bound for a few values of *a*.

- For *a* = 10, Markov’s inequality says that *P*(*X* ≥ 10) ≤ 3/10 = 30%. So the probability that *X* is at least 10 is at most 30%.
- For *a* = 30, Markov’s inequality says that *P*(*X* ≥ 30) ≤ 3/30 = 10%. So the probability that *X* is at least 30 is at most 10%.
- For *a* = 3, Markov’s inequality says that *P*(*X* ≥ 3) ≤ 3/3 = 1. Every probability is at most 1 = 100%, so this bound tells us nothing new; the inequality is informative only when *a* is larger than the expected value. We do know, however, that *X* cannot always be less than 3. Were all the values of *X* less than 3, then the expected value would also be less than 3.
- As the value of *a* increases, the quotient *E*(*X*) / *a* becomes smaller and smaller. This means that the probability that *X* is very large is very small. Again, with an expected value of 3, we would not expect much of the distribution to lie at very large values.
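The bounds above can be checked by simulation. The sketch below assumes a chi-square distribution with 3 degrees of freedom, which has expected value 3, and samples it as a sum of three squared standard normal draws. The sample size, seed, and helper names are arbitrary choices made for illustration.

```python
import random


def sample_chi_square(df: int, rng: random.Random) -> float:
    """Draw one chi-square value as a sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


def empirical_tail(samples, a):
    """Fraction of the samples that are greater than or equal to a."""
    return sum(1 for x in samples if x >= a) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_chi_square(3, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(f"sample mean = {sample_mean:.3f}")

for a in (3, 10, 30):
    bound = min(1.0, 3 / a)  # Markov bound with E(X) = 3
    tail = empirical_tail(samples, a)
    print(f"a = {a:2d}: empirical P(X >= a) = {tail:.4f}, Markov bound = {bound:.2f}")
```

The empirical tail fractions should fall well below the Markov bounds, which shows how loose the inequality can be for a specific distribution.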

### Use of the Inequality

If we know more about the distribution that we’re working with, then we can usually improve on Markov’s inequality. The value in using it is that it holds for any distribution with nonnegative values.
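As a sketch of how extra knowledge improves on the bound, assume that *X* is known to follow an exponential distribution with mean 3. Its exact tail probability is then exp(−*a*/3), which can be compared with the Markov bound. The choice of the exponential distribution and the function names here are assumptions made only for this comparison.

```python
import math


def markov_upper(mean: float, a: float) -> float:
    """Markov bound on P(X >= a), capped at 1."""
    return min(1.0, mean / a)


def exact_exponential_tail(mean: float, a: float) -> float:
    """Exact P(X >= a) when X is exponential with the given mean."""
    return math.exp(-a / mean)


mean = 3.0  # same expected value as in the illustration
for a in (3, 10, 30):
    exact = exact_exponential_tail(mean, a)
    bound = markov_upper(mean, a)
    print(f"a = {a:2d}: exact tail = {exact:.6f}, Markov bound = {bound:.2f}")
```

The exact tail shrinks exponentially in *a*, while the Markov bound shrinks only like 1/*a*, so the gap between them widens as *a* grows.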

For example, suppose we know the mean height of students at an elementary school. Markov’s inequality tells us that no more than one sixth of the students can have a height greater than six times the mean height.

The other major use of Markov’s inequality is to prove Chebyshev’s inequality. Partly for this reason, the name “Chebyshev’s inequality” is sometimes applied to Markov’s inequality as well. The confusion over the naming of the inequalities is also due to historical circumstances. Andrey Markov was a student of Pafnuty Chebyshev, and Chebyshev’s work contains the inequality that is attributed to Markov.