Read the book Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP - Bhisham C. Gupta, Irwin Guttman - Page 81

Empirical Rule

We now illustrate how the standard deviation of a data set helps us measure the variability of the data. If the data have a distribution that is approximately bell‐shaped, the following rule, known as the empirical rule, can be used to compute the percentage of data that will fall within k standard deviations from the mean (k = 1, 2, 3). For the case where the data set is the set of population values, the empirical rule may be stated as follows:

1 About 68% of the data will fall within one standard deviation of the mean, that is, between μ − 1σ and μ + 1σ.

2 About 95% of the data will fall within two standard deviations of the mean, that is, between μ − 2σ and μ + 2σ.

3 About 99.7% of the data will fall within three standard deviations of the mean, that is, between μ − 3σ and μ + 3σ.

Figure 2.5.3 illustrates these features of the empirical rule.


Figure 2.5.3 Application of the empirical rule.

For the case where μ and σ are unknown, the empirical rule is of the same form, but μ is replaced by the sample mean X̄ and σ is replaced by the sample standard deviation S.
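The three percentages in the empirical rule can be checked against the normal distribution, the standard model of a bell‐shaped distribution. The sketch below is illustrative only: it assumes exact normality (the text only requires the data to be approximately bell‐shaped), and the function names `empirical_rule_intervals` and `normal_coverage` are hypothetical helpers, not part of the book.

```python
from statistics import NormalDist

# Rounded percentages quoted by the empirical rule for k = 1, 2, 3.
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_rule_intervals(mean, sd):
    """Return {k: (lower, upper, approx_percent)} for k = 1, 2, 3."""
    return {
        k: (mean - k * sd, mean + k * sd, pct)
        for k, pct in EMPIRICAL_PERCENT.items()
    }


def normal_coverage(k):
    """Probability that a normal variable lies within k sd of its mean."""
    z = NormalDist()  # standard normal: mean 0, sd 1
    return z.cdf(k) - z.cdf(-k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        # Compare the rule's rounded figure with the exact normal coverage.
        print(k, EMPIRICAL_PERCENT[k], round(100 * normal_coverage(k), 2))
```

Under the normality assumption the exact coverages are about 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.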

Example 2.5.11 (Soft drinks) A soft‐drink filling machine is used to fill 16‐oz soft‐drink bottles. The amount of beverage varies slightly from bottle to bottle, and it is assumed that the actual amount of beverage in a bottle follows a bell‐shaped distribution with a mean of 15.8 oz and a standard deviation of 0.15 oz. Use the empirical rule to find what percentage of bottles contain between 15.5 and 16.1 oz of beverage.

Solution: From the information provided in this problem, we have μ = 15.8 oz and σ = 0.15 oz. We are interested in the percentage of bottles that contain between 15.5 and 16.1 oz of beverage. We can see that μ ± 2σ = 15.8 ± 2(0.15) = (15.5, 16.1). Comparing Figure 2.5.4 with Figure 2.5.3, it follows that approximately 95% of the bottles contain between 15.5 and 16.1 oz of the beverage, since 15.5 and 16.1 are two standard deviations away from the mean.


Figure 2.5.4 Distribution of amounts of soft drink contained in bottles.

Example 2.5.12 (Applying the empirical rule) At the end of each fiscal year, a manufacturer writes off or adjusts its financial records to show the number of units of bad production occurring over all lots of production during the year. Suppose that the dollar values associated with the various units of bad production form a bell‐shaped distribution with mean μ = $35,700 and standard deviation σ = $2500. Find the percentage of units of bad production that have a dollar value between $28,200 and $43,200.

Solution: From the information provided, we have μ = $35,700 and σ = $2500. Since the limits $28,200 and $43,200 are three standard deviations away from the mean (μ ± 3σ = 35,700 ± 7500), applying the empirical rule shows that approximately 99.7% of the units of bad production have a dollar value between $28,200 and $43,200.


Figure 2.5.5 Dollar value of units of bad production.
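Both empirical‐rule examples above reduce to the same check: confirm that the two limits lie the same whole number k of standard deviations from the mean, then read off the matching percentage. A minimal Python sketch follows. The function name `empirical_percent` is hypothetical, and the mean of $35,700 used for Example 2.5.12 is inferred here as the midpoint of the stated limits $28,200 and $43,200, which places each limit 3 × $2500 = $7500 from it.

```python
import math

# Rounded percentages quoted by the empirical rule for k = 1, 2, 3.
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_percent(mean, sd, lower, upper):
    """Return (k, percent) when both limits are k sd from the mean, k in 1..3."""
    k_low = (mean - lower) / sd
    k_high = (upper - mean) / sd
    # The rule as stated only covers intervals symmetric about the mean.
    if not math.isclose(k_low, k_high, rel_tol=1e-9):
        raise ValueError("interval is not symmetric about the mean")
    k = round(k_high)
    # Tolerate floating-point noise such as 2.000000000000005.
    if k not in EMPIRICAL_PERCENT or not math.isclose(k_high, k, rel_tol=1e-9):
        raise ValueError("limits are not 1, 2, or 3 standard deviations away")
    return k, EMPIRICAL_PERCENT[k]


if __name__ == "__main__":
    print(empirical_percent(15.8, 0.15, 15.5, 16.1))      # Example 2.5.11
    print(empirical_percent(35700, 2500, 28200, 43200))   # Example 2.5.12
```

The tolerance‐based comparison matters for the soft‐drink case, where (16.1 − 15.8)/0.15 is not exactly 2 in binary floating point.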

If the population data have a distribution that is not bell‐shaped, then we use another result, called Chebyshev's inequality, which states:

Chebyshev's inequality: For any k > 1, at least (1 − 1/k²)100% of the data values fall within k standard deviations of the mean.

Figure 2.5.6a,b illustrates the basic concept of Chebyshev's inequality. Chebyshev's inequality is further discussed in Chapter 5.

The shaded area in Figure 2.5.6a contains at least (1 − 1/k²)100% = (1 − 1/2²)100% = 75% of the data values. The shaded area in Figure 2.5.6b contains at least (1 − 1/k²)100% = (1 − 1/3²)100% ≈ 89% of the data values. Note that Chebyshev's inequality is also valid for sample data.
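Because Chebyshev's inequality guarantees at least 1 − 1/k² of the values within k standard deviations for any k > 1, it can be checked on data that are clearly not bell‐shaped. The sketch below does this for a small right‐skewed data set; the data values and the function names `chebyshev_bound` and `fraction_within` are invented for illustration and do not come from the book.

```python
from statistics import fmean, pstdev


def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2


def fraction_within(data, k):
    """Observed fraction of values strictly within k sd of the mean."""
    mu = fmean(data)
    sigma = pstdev(data)  # treats the data as the whole population
    inside = sum(1 for x in data if abs(x - mu) < k * sigma)
    return inside / len(data)


# Hypothetical right-skewed data: one large value stretches the upper tail.
skewed = [1, 1, 1, 2, 2, 3, 4, 6, 10, 30]

if __name__ == "__main__":
    for k in (2, 3):
        print(k, chebyshev_bound(k), fraction_within(skewed, k))
```

For this data set the observed fractions exceed the guaranteed minimums, as they must; the bound is deliberately conservative because it makes no assumption about the shape of the distribution.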

Example 2.5.13 (Using Chebyshev's inequality) Sodium is an important component of the metabolic panel. The average sodium level for 1000 American male adults who were tested for low sodium was found to be 132 mEq/L with a standard deviation of 3 mEq/L. Using Chebyshev's inequality, determine at least how many of the adults tested have a sodium level between 124.5 and 139.5 mEq/L.


Figure 2.5.6 Shaded area lies within the intervals: (a) [μ − 2σ, μ + 2σ] and (b) [μ − 3σ, μ + 3σ].

Solution: From the given information, we have that the mean and the standard deviation of sodium level for these adults are

X̄ = 132 mEq/L and S = 3 mEq/L

To find how many of the 1000 adults have their sodium level between 124.5 and 139.5 mEq/L, we need to determine the value of k. Since each of these values is 7.5 points away from the mean, then using Chebyshev's inequality, the value of k is such that kS = 7.5, so that

k = 7.5/3 = 2.5

Hence, the number of adults in the sample who have their sodium level between 124.5 and 139.5 mEq/L is at least

1000(1 − 1/(2.5)²) = 1000(1 − 0.16) = 840

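The arithmetic in this solution can be reproduced exactly with rational numbers, which avoids floating‐point rounding in 1 − 1/(2.5)². This is only a sketch of the calculation described above; the variable names are illustrative.

```python
import math
from fractions import Fraction

# Values given in Example 2.5.13.
n = 1000
mean = Fraction(132)
sd = Fraction(3)
lower, upper = Fraction("124.5"), Fraction("139.5")

# Each limit must be the same distance (k standard deviations) from the mean.
k = (upper - mean) / sd
if k != (mean - lower) / sd:
    raise ValueError("interval is not symmetric about the mean")

bound = 1 - 1 / k**2             # Chebyshev lower bound on the proportion
min_count = math.ceil(n * bound)  # "at least" this many of the n adults

print(k, bound, min_count)  # 5/2 21/25 840
```

So k = 5/2, the guaranteed proportion is 21/25 = 0.84, and at least 840 of the 1000 adults fall in the interval.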
Numerical measures can easily be determined by using any one of the statistical packages discussed in this book. We illustrate the use of MINITAB and R with the following example. The use of JMP is discussed in Section 2.11, which is available on the book website: www.wiley.com/college/gupta/statistics2e.

Example 2.5.14 (Using MINITAB and R) Calculate numerical measures for the following sample data:

6, 8, 12, 9, 14, 18, 17, 23, 21, 23
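The book works this example in MINITAB and R; those steps are not reproduced in this excerpt. As a cross‐check only, the same basic measures can be sketched with Python's standard `statistics` module. The choice of Python and the particular set of measures shown here are assumptions made for illustration, not the book's procedure.

```python
import statistics as st

# Sample data from Example 2.5.14.
data = [6, 8, 12, 9, 14, 18, 17, 23, 21, 23]

summary = {
    "n": len(data),
    "mean": st.mean(data),
    "median": st.median(data),
    "mode": st.mode(data),
    "minimum": min(data),
    "maximum": max(data),
    "range": max(data) - min(data),
    "sample variance": st.variance(data),  # divisor n - 1
    "sample std dev": st.stdev(data),      # square root of the above
}

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value:.4g}")
```

For these data the mean is 15.1, the median is 15.5, the mode is 23, and the sample standard deviation is about 6.26.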
