Data Science For Dummies - Lillian Pierson - Page 69

Ranking variable-pairs using Spearman’s rank correlation


Spearman’s rank correlation is a popular test for determining correlation between ordinal variables. When you apply it, you convert the values of each numeric variable into ranks, measure the strength of the relationship between the ranked variables, and can then rank variable-pairs by their correlation.
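As a rough illustration of the rank-conversion step, the sketch below turns a list of numeric values into ranks. The function name `average_ranks`, the choice to give tied values the average of their positions, and the sample values are all assumptions made for this example; they do not come from the book.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Average of the 1-based positions pos+1 .. end+1.
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


# Hypothetical watch-time values, used only to show the conversion.
watch_time = [12, 45, 30, 45, 5]
print(average_ranks(watch_time))  # [2.0, 4.5, 3.0, 4.5, 1.0]
```

The smallest value receives rank 1, and the two tied values of 45 share the rank 4.5.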

Spearman’s rank correlation assumes that

 Your variables are ordinal.

 Your variables are related nonlinearly. You can identify nonlinearity between variables by looking at a graph. If the graph of two variables produces a curve (for example, like the one shown in Figure 4-2), then the variables have a nonlinear relationship. This curvature occurs because, when variables are related nonlinearly, a given change in the value of x does not always correspond to the same change in the value of y.

 Your data is nonnormally distributed.

To use Spearman’s rank correlation to test for correlation between ordinal variables, you simply plug the values for your variables into the following formula and calculate the result:

$$ \rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$

 ρ = Spearman's rank correlation coefficient

 d = difference between the two ranks of each data point

 n = total number of data points in the data set
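To make the calculation concrete, here is a minimal sketch that uses the symbols ρ, d, and n defined above. It assumes the standard textbook form of the calculation, which holds only when no values are tied, so the function rejects tied input. The function name `spearman_rho` and the sample watch-time and viewership values are hypothetical and chosen only for illustration.

```python
def spearman_rho(x, y):
    """Spearman's rho as 1 - 6 * sum(d^2) / (n * (n^2 - 1)); assumes no ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("this form of the formula assumes no tied values")

    def ranks(values):
        # Rank 1 goes to the smallest value, rank n to the largest.
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    n = len(x)  # total number of data points
    # d is the difference between the two ranks of each data point.
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: viewership tends to fall as watch time rises.
watch_time = [1, 2, 3, 4, 5]
viewership = [95, 70, 40, 42, 10]
print(round(spearman_rho(watch_time, viewership), 3))  # -0.9
```

A result near -1 or 1 indicates that the ranks move together almost perfectly (in opposite or the same direction), while a result near 0 indicates little rank agreement. For data with tied values, a tie-aware routine is the safer choice.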


FIGURE 4-2: An example of a non-linear relationship between watch time and % viewership.
