Student Study Guide to Accompany Statistics Alive! - Wendy J. Steinberg
Module Summary
An entire data set can be described in a single number with a measure of central tendency. Central tendency provides a single value that best describes, or is most representative of, the entire set of scores. It enables you to quickly determine the center, which is usually the location of the majority of scores, of a large group of data.
The mode is the most commonly occurring score in the data set. Although the mode is referred to as a measure of central tendency, it does not necessarily occur at the center of the data set (there may not be equal numbers of scores above and below the mode). The mode is the least stable measure of central tendency, meaning that it may change drastically from sample to sample of a population. This instability limits how often the mode is used.
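To make the point about the mode's location concrete, here is a minimal sketch in Python. The scores are hypothetical, chosen only so that the most frequent score sits at the low end of the data rather than at its center.

```python
from collections import Counter
from statistics import median

# Hypothetical scores: the most frequent value sits at the low end.
scores = [1, 1, 1, 2, 5, 7, 9, 10, 12]

# Counter tallies how often each score occurs; most_common(1) returns
# the single (score, frequency) pair with the highest frequency.
mode_value, mode_count = Counter(scores).most_common(1)[0]

# Count how many scores fall strictly below and strictly above the mode.
below = sum(1 for x in scores if x < mode_value)
above = sum(1 for x in scores if x > mode_value)

print("mode:", mode_value, "occurring", mode_count, "times")
print("scores below the mode:", below)
print("scores above the mode:", above)
print("median for comparison:", median(scores))
```

In this made-up set no scores fall below the mode and six fall above it, so the mode describes the most typical score but not the middle of the data.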
The median is the center score; half of the scores in the distribution are above the median, and half are below it. In other words, the median is the score that occurs at the 50th percentile. The median does not have to be an actual score. For example, the median of the distribution 3, 4, 6, 7 is 5, the midpoint between the two middle scores, 4 and 6. If you do not know all of the specific scores in a data set, you can use the following formula for a precise measure of the median:
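The formula itself does not appear in this excerpt. As an illustration only, the sketch below uses the common linear-interpolation formula for grouped frequency data, median = LL + ((N / 2 − cf) / f) × w, which may differ from the guide's own notation. The function name, its parameters, and the frequency table are all assumptions made for this example.

```python
from statistics import median

# With all raw scores known, the median of an even-sized set is the
# midpoint of the two middle scores.
raw_median = median([3, 4, 6, 7])
print(raw_median)


def grouped_median(lower_limit, width, n_total, cum_freq_below, freq_in_interval):
    """Interpolate the median within the interval holding the 50th percentile.

    lower_limit      -- real lower limit (LL) of the median interval
    width            -- interval width (w)
    n_total          -- total number of scores (N)
    cum_freq_below   -- cumulative frequency below the median interval (cf)
    freq_in_interval -- frequency within the median interval (f)
    """
    return lower_limit + ((n_total / 2 - cum_freq_below) / freq_in_interval) * width


# Hypothetical frequency table with N = 20 scores:
#   9.5-19.5: f = 4 (cumulative 4)
#  19.5-29.5: f = 8 (cumulative 12)  <- contains the 10th score
#  29.5-39.5: f = 8 (cumulative 20)
estimate = grouped_median(lower_limit=19.5, width=10, n_total=20,
                          cum_freq_below=4, freq_in_interval=8)
print(estimate)
```

The interpolation assumes the scores are spread evenly within the median interval, so the result is an estimate whenever the raw scores are unavailable.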
The mean is the average score for a data set and is symbolized as M for samples and as μ for populations. The mean is the most commonly used and most stable measure of central tendency. The formula for a mean is as follows: M = ΣX / N, where ΣX is the sum of all the scores and N is the number of scores.
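As a quick check of the arithmetic, the sketch below computes a mean as the sum of the scores divided by their count. The helper name `sample_mean` and the data are hypothetical.

```python
def sample_mean(scores):
    """Return the sum of the scores divided by the number of scores."""
    if not scores:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(scores) / len(scores)


# Hypothetical sample: the sum is 20 and there are 4 scores.
sample = [3, 4, 6, 7]
m = sample_mean(sample)
print(m)
```

The same calculation applies to a population; only the symbol changes from M to μ.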
There are three important aspects to the mean. First, the numerical weight of the scores above the mean is equal to the numerical weight of the scores below the mean. That is, if you were to find the distance of each score from the mean (i.e., X − M), the sum of the distances for the scores below the mean would equal, in absolute value, the sum of the distances for the scores above the mean, so all of the deviations together sum to zero. Second, the mean includes all values of the data in its calculation, so each score in the distribution matters. Finally, the mean is a sensitive measure of central tendency, in that a change in any score in the data set will change the mean.
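The balance property and the sensitivity of the mean can both be checked numerically. The following is a minimal sketch with hypothetical scores; the variable names are illustrative only.

```python
# Hypothetical sample whose mean is 35 / 5 = 7.
scores = [2, 4, 6, 8, 15]
m = sum(scores) / len(scores)

# Deviation of each score from the mean (X - M).
deviations = [x - m for x in scores]

# Total distance of the scores below the mean and above the mean.
distance_below = sum(-d for d in deviations if d < 0)
distance_above = sum(d for d in deviations if d > 0)

print("mean:", m)
print("distance below the mean:", distance_below)
print("distance above the mean:", distance_above)
print("sum of all deviations:", sum(deviations))

# Sensitivity: changing a single score (15 becomes 20) shifts the mean.
changed = [2, 4, 6, 8, 20]
changed_mean = sum(changed) / len(changed)
print("mean after changing one score:", changed_mean)
```

For these scores the distances on each side of the mean both total 9, and replacing one score moves the mean from 7 to 8.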
An extreme score in a data set, one that is drastically different from the others, is called an outlier. Outliers can influence which measure of central tendency is most appropriate to use. Because the mean is so sensitive to score values, the median may be a more appropriate measure if there are many outliers in a data set.
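A small numerical sketch shows why the median can be the safer choice. The scores and the outlier value below are hypothetical.

```python
from statistics import mean, median

# Hypothetical scores clustered in the 70s, then the same set with
# one extreme score added.
scores = [70, 72, 75, 78, 80]
with_outlier = scores + [300]

mean_before, median_before = mean(scores), median(scores)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print("without outlier: mean =", mean_before, "median =", median_before)
print("with outlier:    mean =", mean_after, "median =", median_after)
```

Here a single extreme score pulls the mean from 75 to 112.5, above every score but the outlier itself, while the median moves only from 75 to 76.5.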
The skew of a distribution will affect the location of the measures of central tendency. In a symmetrical distribution, the mean, median, and mode are all equal. In a skewed distribution, however, the values of the measures of central tendency are as follows: positively skewed distributions: mode < median < mean; negatively skewed distributions: mode > median > mean.
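The orderings above can be seen in small examples. The three score sets in this sketch are hypothetical, constructed to be symmetric, positively skewed, and negatively skewed, each with a single mode; the helper `summarize` is an illustrative name.

```python
from statistics import mean, median, mode

# Hypothetical score sets chosen to illustrate each shape.
symmetric = [1, 2, 3, 3, 3, 4, 5]
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]          # tail toward high scores
negative_skew = [1, 6, 10, 11, 12, 12, 13, 13, 13, 14]   # tail toward low scores


def summarize(scores):
    """Return (mode, median, mean) for scores that have a single mode."""
    return mode(scores), median(scores), mean(scores)


for label, scores in [("symmetric", symmetric),
                      ("positive skew", positive_skew),
                      ("negative skew", negative_skew)]:
    mo, mdn, m = summarize(scores)
    print(f"{label}: mode={mo}, median={mdn}, mean={m}")
```

In each skewed set the mean is pulled furthest toward the tail, the mode stays at the peak, and the median falls between them.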
It is not appropriate to report any single measure of central tendency when you have multimodal data. If reporting the mode, report all of the modes. A graph is the best method for displaying such a distribution.
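As a rough illustration, the sketch below finds every mode in a hypothetical bimodal data set and prints a simple text histogram. It assumes Python 3.8 or later, where `statistics.multimode` is available.

```python
from collections import Counter
from statistics import mean, median, multimode

# Hypothetical bimodal scores: clusters around 2 and around 8.
scores = [2, 2, 2, 3, 4, 5, 8, 8, 8, 9]

# multimode returns every score tied for the highest frequency.
modes = multimode(scores)
print("modes:", modes)
print("mean:", mean(scores), "median:", median(scores))

# A text histogram makes the two peaks visible at a glance.
counts = Counter(scores)
for value in range(min(scores), max(scores) + 1):
    print(f"{value:>2} | {'*' * counts[value]}")
```

For these scores the mean (5.1) and the median (4.5) both land in the sparse region between the two clusters, which is why a single value misrepresents the data and a graph communicates it better.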