End-to-end Data Analytics for Product Development - Chris Jones

Stat Tool 1.6 Measures of Central Tendency: Mean and Median

When quantitative data distributions tend to concentrate around certain values, we can try to locate these values by calculating the so‐called measures of central tendency: the mean and the median. These measures describe the area of the distribution where most values occur.

The mean is the sum of all data values divided by the number of values. It represents the “balance point” of a set of values.
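
As a minimal sketch of this definition, the Python snippet below computes a mean by hand and checks the “balance point” property: the deviations from the mean sum to zero. The function name `mean` and the sample values in `weights` are hypothetical and chosen only for illustration.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical measurements, used only to illustrate the calculation.
weights = [2.0, 3.0, 4.0, 5.0, 6.0]
m = mean(weights)  # 20.0 / 5 = 4.0

# Balance point: values below the mean offset values above it,
# so the deviations from the mean add up to zero.
deviations_sum = sum(x - m for x in weights)

print(m, deviations_sum)
```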


The median is the middle value in a sorted list of data. It divides the data in half: about 50% of the values are greater than the median and about 50% are less.
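
The definition above can be sketched in Python as follows. The text does not say what to do when the number of values is even; averaging the two middle values is a common convention and is assumed here. The function name `median` and the sample lists are hypothetical.

```python
def median(values):
    """Middle value of the sorted data.

    Assumption: for an even number of values, the two middle values
    are averaged (a common convention, not stated in the text).
    """
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data, used only to illustrate both cases.
odd_median = median([7, 1, 5, 3, 9])   # sorted: 1, 3, 5, 7, 9 -> 5
even_median = median([7, 1, 5, 3])     # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0

print(odd_median, even_median)
```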


For symmetric data, mean and median tend to be close in value (Figure 1.5):


Figure 1.5 Mean and median in symmetric distributions.

In skewed data or data with extreme values, the mean and the median can be quite different. For such data, the median is usually a better indicator of central tendency than the mean: the mean is pulled in the direction of the skew, whereas the median stays closer to the majority of the observations (Figure 1.6).


Figure 1.6 Mean and median in skewed distributions.
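
To make the contrast between symmetric and skewed data concrete, here is a small sketch using Python's standard `statistics` module. The two data sets are hypothetical: the second is the first with its largest value replaced by an extreme one.

```python
from statistics import mean, median

# Hypothetical data sets, used only for illustration.
symmetric = [1, 2, 3, 4, 5, 6, 7]
skewed = [1, 2, 3, 4, 5, 6, 70]  # same data, largest value replaced by an extreme one

sym_mean, sym_median = mean(symmetric), median(symmetric)
skew_mean, skew_median = mean(skewed), median(skewed)

# Symmetric data: mean and median coincide (both 4).
print("symmetric:", sym_mean, sym_median)

# Skewed data: the mean is pulled toward the extreme value (13),
# while the median stays with the bulk of the observations (4).
print("skewed:   ", skew_mean, skew_median)
```

In this sketch a single extreme value more than triples the mean while leaving the median unchanged.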
