In many statistical problems we are required to determine the number of degrees of freedom. This is usually a positive whole number that indicates how few restrictions there are on our calculations. The number of degrees of freedom is the number of values in a calculation that we are free to vary.
A Few Examples
Suppose for a moment that we know the mean of a data set is 25 and that its values are 20, 10, 50, and one unknown value. To find the mean of a list of data, we add all of the values and divide by the number of values. This gives us the equation (20 + 10 + 50 + x)/4 = 25, where x denotes the unknown. Although we call this value unknown, a little algebra shows that x = 20, so no value is left free to vary.
Let's alter this scenario slightly. Suppose instead that we know the mean of a data set is 25, with values 20, 10, and two unknown values. These unknowns could be different, so we use two different variables, x and y, to denote them. The resulting equation is (20 + 10 + x + y)/4 = 25. With some algebra we obtain y = 70 - x. The equation is written in this form to show that once we choose a value for x, the value of y is determined. We have one free choice to make, so there is one degree of freedom.
Now we'll look at a sample of size one hundred. If we know that the mean of this sample is 20, but do not know any of the data values, then there are 99 degrees of freedom. All of the values must add up to a total of 20 × 100 = 2000. Once we have the values of 99 elements in the data set, the last one is determined.
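The three scenarios above can be checked with a few lines of Python. This is only a minimal sketch: the helper name forced_value and the randomly chosen values are illustrative assumptions, not part of the examples themselves.

```python
import random

def forced_value(mean, n, known_values):
    """Return the one value that a fixed mean forces once n - 1 values are known."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 values must be known")
    # The n values must sum to mean * n, so the last one is whatever is left over.
    return mean * n - sum(known_values)

# Mean 25 with values 20, 10, 50: the unknown is forced, nothing is free.
print(forced_value(25, 4, [20, 10, 50]))  # 20

# Mean 25 with values 20, 10 and two unknowns: x is free, y = 70 - x follows.
for x in (0, 30, 70):
    print(x, forced_value(25, 4, [20, 10, x]))

# Sample of 100 with mean 20: pick any 99 values, the 100th is then fixed.
rng = random.Random(0)  # seeded only so that the run is repeatable
free_values = [rng.randint(0, 40) for _ in range(99)]
last = forced_value(20, 100, free_values)
print(sum(free_values) + last)  # 2000
```

Whatever 99 values are chosen, the total always comes back to 2000, which is the sense in which only 99 of the values are free.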
Student t Distribution
Degrees of freedom play an important role when using the Student t-score table. There is not one t distribution but a whole family of them, and we tell them apart by their degrees of freedom. The distribution that we use depends on the size of our sample: if the sample size is n, then the number of degrees of freedom is n - 1. For instance, a sample of size 22 would require us to use the row of the t-score table with 21 degrees of freedom.
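To see that different degrees of freedom really do give different distributions, the sketch below evaluates the standard Student t density for a few values. The function names t_pdf and normal_pdf are illustrative, and the standard normal density is included only as a point of comparison; it is not part of the discussion above.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom at t."""
    coefficient = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coefficient * (1 + t * t / df) ** (-(df + 1) / 2)

def normal_pdf(z):
    """Density of the standard normal distribution at z, for comparison."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

sample_size = 22
df = sample_size - 1  # the row of the t-score table to use

# Each df gives a different curve: compare the height at 0 and in the tail at 2.
for d in (1, 5, df, 100):
    print(d, round(t_pdf(0.0, d), 4), round(t_pdf(2.0, d), 4))
print("normal", round(normal_pdf(0.0), 4), round(normal_pdf(2.0), 4))
```

As the degrees of freedom grow, the peak rises and the tails thin, with the curve approaching the normal one. This is why the table needs a separate row for each number of degrees of freedom.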
The use of a chi-square distribution also requires degrees of freedom. In some procedures, such as inference about a population variance, the sample size determines which distribution to use, just as with the t distribution. If the sample size is n, then there are n - 1 degrees of freedom.
Another place where degrees of freedom show up is in the formula for the sample standard deviation. This occurrence is not as overt, but we can see it if we know where to look. To find a standard deviation we are looking for the "average" deviation from the mean. However, after subtracting the mean from each data value, squaring the differences, and adding them up, we end up dividing by n - 1 rather than by n as we might expect.
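The n - 1 comes from the number of degrees of freedom. The formula uses the n data values together with the sample mean, and the deviations from the sample mean always add up to zero. Once n - 1 of the deviations are known, the last one is determined, so there are n - 1 degrees of freedom.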
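The calculation of the sample standard deviation can be sketched as follows. The data set is the one from the first example with its unknown filled in, and the comparison against Python's statistics module is added here only as a sanity check.

```python
import math
import statistics

data = [20, 10, 50, 20]
n = len(data)
mean = sum(data) / n

# Deviations from the sample mean always sum to zero,
# so only n - 1 of them are free to vary.
deviations = [value - mean for value in data]
print(sum(deviations))  # 0.0

sum_of_squares = sum(d * d for d in deviations)
sample_sd = math.sqrt(sum_of_squares / (n - 1))   # divide by n - 1
divide_by_n_sd = math.sqrt(sum_of_squares / n)    # divide by n, for contrast

print(sample_sd, statistics.stdev(data))        # the two values should agree
print(divide_by_n_sd, statistics.pstdev(data))  # the two values should agree
```

Dividing by n - 1 gives a slightly larger result than dividing by n, and it matches what statistics.stdev reports for a sample.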
More advanced statistical techniques count the degrees of freedom in more complicated ways. When calculating the test statistic for two means with independent samples of n1 and n2 elements, the number of degrees of freedom has quite a complicated formula. It can be estimated conservatively by using the smaller of n1 - 1 and n2 - 1.
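The sketch below compares the simple estimate with one commonly used version of the complicated formula, the Welch-Satterthwaite approximation. That this is the formula meant above is an assumption, and the sample sizes and variances are hypothetical numbers chosen only for illustration.

```python
def conservative_df(n1, n2):
    """Simple estimate: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation from sample variances and sizes."""
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical summary statistics for two independent samples.
n1, var1 = 10, 4.0
n2, var2 = 15, 9.0

print(conservative_df(n1, n2))       # 9
print(welch_df(var1, n1, var2, n2))  # generally not a whole number
```

The approximation always falls between the smaller of n1 - 1 and n2 - 1 and the total n1 + n2 - 2, so the simple estimate never overstates the degrees of freedom.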
Another example of a different way to count the degrees of freedom comes with an F test. In conducting an F test, such as the one used in a one-factor analysis of variance, we have k samples, each of size n. The numerator has k - 1 degrees of freedom and the denominator has k(n - 1).
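The two counts for the F test are simple enough to write down directly. In this sketch the function name f_test_df and the choice of k = 3 samples of size n = 10 are illustrative assumptions.

```python
def f_test_df(k, n):
    """Degrees of freedom for an F test with k samples, each of size n."""
    if k < 2 or n < 2:
        raise ValueError("need at least two samples with at least two values each")
    numerator_df = k - 1          # between the k samples
    denominator_df = k * (n - 1)  # n - 1 within each of the k samples
    return numerator_df, denominator_df

k, n = 3, 10
numerator_df, denominator_df = f_test_df(k, n)
print(numerator_df, denominator_df)  # 2 27

# Together the two counts account for all k * n - 1 degrees of freedom.
print(numerator_df + denominator_df == k * n - 1)  # True
```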