Medical Statistics - David Machin

Standard Deviation and Variance


A third measure of the amount of spread or variability in a data set is the standard deviation. It is based on the idea of averaging the distance each value is away from the sample mean, $\bar{x}$. For an individual with an observed value $x_i$ the distance from the mean is $x_i - \bar{x}$. With $n$ such observations we have a set of $n$ such differences, one for each individual. The sum of these differences, $\sum_{i=1}^{n}(x_i - \bar{x})$, is always zero. However, if we square the distances before we sum them we get a positive quantity. This sum is then divided by $(n-1)$ and thus gives an average measure for the deviation from the mean. This quantity is called the variance and is defined as:

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$$

Table 2.5 Calculating the median, quartiles, and interquartile range for the corn size data.



The variance is expressed in square units and so is not a suitable measure for describing variability because it is not in the same units as the raw data. The solution is to take the square root of the variance to return to the original units. This gives us the standard deviation (usually abbreviated to SD or $s$) defined as:

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}}$$


Examining this expression, it can be seen that if all the $x$'s were the same, then they would all equal $\bar{x}$ and so $s$ would be zero. If the $x$'s were widely scattered about $\bar{x}$, then $s$ would be large. In this way $s$ reflects the variability in the data.
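The calculation described in this section can be sketched in a few lines of Python. This is a minimal illustration only: the data values are made up for the example and are not taken from the book, and the function names `sample_variance` and `sample_sd` are hypothetical helpers rather than anything the text defines.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    return sum(d * d for d in deviations) / (n - 1)


def sample_sd(values):
    """Square root of the variance, back in the units of the raw data."""
    return math.sqrt(sample_variance(values))


# Made-up observations, chosen so that the mean is a whole number (5).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
# The raw deviations cancel out, which is why they are squared first.
deviation_sum = sum(x - mean for x in data)

variance = sample_variance(data)
sd = sample_sd(data)

# Identical observations all equal the mean, so the SD is zero.
constant_sd = sample_sd([5, 5, 5, 5])

print("sum of deviations:", deviation_sum)
print("variance:", variance)
print("standard deviation:", sd)
print("library check:", statistics.stdev(data))
print("SD of identical values:", constant_sd)
```

For these eight values the squared deviations sum to 32, so the variance is 32/7 (about 4.57) and the standard deviation is about 2.14. The standard library's `statistics.stdev` uses the same (n − 1) divisor and can serve as a cross-check.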

