The correlation coefficient, denoted by *r*, tells us how closely data in a scatterplot fall along a straight line. The closer the absolute value of *r* is to one, the better the data are described by a linear equation. Data with values of *r* close to zero show little to no straight-line relationship. Because the calculations are lengthy, it is best to calculate *r* with a calculator or statistical software. However, it is always worthwhile to know what your calculator is doing when it is calculating. What follows is a process for calculating the correlation coefficient mainly by hand, with a calculator used for the routine arithmetic steps.

### Steps for Calculating *r*

We will begin by listing the steps for calculating the correlation coefficient. The data we are working with are paired data, each pair of which will be denoted by $(x_i, y_i)$.

- We begin with a few preliminary calculations. The quantities from these calculations will be used in subsequent steps of our calculation of *r*:
  - Calculate $\bar{x}$, the mean of all of the first coordinates of the data $x_i$.
  - Calculate $\bar{y}$, the mean of all of the second coordinates of the data $y_i$.
  - Calculate $s_x$, the sample standard deviation of all of the first coordinates of the data $x_i$.
  - Calculate $s_y$, the sample standard deviation of all of the second coordinates of the data $y_i$.
- Use the formula $(z_x)_i = (x_i - \bar{x}) / s_x$ and calculate a standardized value for each $x_i$.
- Use the formula $(z_y)_i = (y_i - \bar{y}) / s_y$ and calculate a standardized value for each $y_i$.
- Multiply corresponding standardized values: $(z_x)_i (z_y)_i$.
- Add the products from the last step together.
- Divide the sum from the previous step by *n* – 1, where *n* is the total number of points in our set of paired data. The result of all of this is the correlation coefficient *r*.
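The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `correlation_by_steps` and its variable names are chosen here for readability, and the input is assumed to be a list of numeric (x, y) pairs with at least two points and some variation in both coordinates.

```python
import math


def correlation_by_steps(pairs):
    """Compute r from (x, y) pairs by following the listed steps."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two data points")
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]

    # Preliminary calculations: the two means.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sample standard deviations (divide by n - 1, not n).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when all x or all y values are equal")

    # Standardized values for each coordinate.
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]

    # Multiply corresponding standardized values, add, divide by n - 1.
    products = [zx * zy for zx, zy in zip(z_x, z_y)]
    return sum(products) / (n - 1)
```

For points that lie exactly on an increasing line, such as `correlation_by_steps([(1, 2), (2, 4), (3, 6)])`, the result should be 1 up to floating-point rounding.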

### An Example

To see exactly how the value of *r* is obtained, we look at an example. Again, for practical applications we would want to use a calculator or statistical software to calculate *r* for us.

We begin with a listing of paired data: (1, 1), (2, 3), (4, 5), (5, 7). The mean of the *x* values 1, 2, 4, and 5 is $\bar{x} = 3$. Likewise, the mean of the *y* values is $\bar{y} = 4$. The standard deviation of the *x* values is $s_x \approx 1.83$ and the standard deviation of the *y* values is $s_y \approx 2.58$. The table below summarizes the other calculations needed for *r*. The sum of the products in the rightmost column is 2.969848. Since there are a total of four points and 4 – 1 = 3, we divide the sum of the products by 3. This gives us a correlation coefficient of

*r* = 2.969848/3 = 0.989949.

## Table for Example of Calculation of Correlation Coefficient

| $x$ | $y$ | $z_x$ | $z_y$ | $z_x z_y$ |
|---|---|---|---|---|
| 1 | 1 | -1.09544503 | -1.161894958 | 1.272792057 |
| 2 | 3 | -0.547722515 | -0.387298319 | 0.212132009 |
| 4 | 5 | 0.547722515 | 0.387298319 | 0.212132009 |
| 5 | 7 | 1.09544503 | 1.161894958 | 1.272792057 |
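
As a check on the hand calculation, the table can be reproduced with a short Python script. This is only a sketch: it relies on `mean` and `stdev` from Python's standard `statistics` module (`stdev` uses the *n* – 1 denominator), and the names `pairs`, `rows`, and `r` are chosen here for illustration.

```python
from statistics import mean, stdev

# The paired data from the example.
pairs = [(1, 1), (2, 3), (4, 5), (5, 7)]
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]

x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# One row per data point: x, y, z_x, z_y, and the product z_x * z_y.
rows = []
for x, y in pairs:
    z_x = (x - x_bar) / s_x
    z_y = (y - y_bar) / s_y
    rows.append((x, y, z_x, z_y, z_x * z_y))

# Sum the products in the last column and divide by n - 1.
r = sum(row[4] for row in rows) / (len(pairs) - 1)

for row in rows:
    print("{} | {} | {:.6f} | {:.6f} | {:.6f}".format(*row))
print("r = {:.6f}".format(r))
```

The printed values should agree with the table to about six decimal places, with the final line showing *r* as 0.989949. Digits beyond that can differ slightly depending on how much the intermediate standard deviations were rounded.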