Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP - Bhisham C. Gupta, Irwin Guttman - Page 57

Definition 2.4.1

A histogram is a graphical tool consisting of bars placed side by side on a set of intervals (classes, bins, or cells) of equal width. The bars represent the frequency or relative frequency of classes. The height of each bar is proportional to the frequency or relative frequency of the corresponding class.

To construct a histogram, we take the following steps:

Step 1. Prepare a frequency distribution table for the given data.

Step 2. Use the frequency distribution table prepared in Step 1 to construct the histogram. From here, the steps involved in constructing a histogram are exactly the same as those for constructing a bar chart, except that in a histogram, there is no gap between the intervals marked on the horizontal axis (the x-axis).

A histogram is called a frequency histogram or a relative frequency histogram depending on whether the scale on the vertical axis (the y-axis) represents the frequencies or the relative frequencies. In both types of histograms, the widths of the rectangles are equal to the class width. The two types of histograms are in fact identical except that the scales used on the y-axes are different. This point becomes quite clear in the following example:
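The two construction steps can be sketched in a few lines of Python. This is only an illustration, not the book's procedure: the function name `frequency_table`, its parameters, and the small data set are all hypothetical. Classes are taken as half-open intervals, with the last class also including its upper limit, so that adjacent classes leave no gap.

```python
def frequency_table(data, start, width, k):
    """Count observations in k equal-width classes beginning at `start`.

    Classes are half-open, [lower, upper), except the last one, which also
    includes its upper limit, so adjacent classes leave no gap.
    """
    counts = [0] * k
    for x in data:
        if not start <= x <= start + k * width:
            raise ValueError(f"{x} does not belong to any class")
        # An observation equal to the final upper limit goes in the last class
        i = min(int((x - start) // width), k - 1)
        counts[i] += 1
    return counts


# Hypothetical data, for illustration only
sample = [2, 3, 5, 5, 6, 8, 9, 11, 12, 12]
freq = frequency_table(sample, start=0, width=4, k=3)
rel_freq = [f / len(sample) for f in freq]

print(freq)      # [2, 3, 5]
print(rel_freq)  # [0.2, 0.3, 0.5]
```

The two lists differ only by the constant factor 1/n, which is why the frequency and relative frequency histograms have the same shape and differ only in the scale of the vertical axis.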

Example 2.4.5 (Survival times) The following data give the survival times (in hours) of 50 parts involved in a field test under extraneous operating conditions.

60 100 130 100 115 30 145 75 80 89 57 64 92 87 110 180
195 175 179 159 155 146 157 167 174 87 67 73 109 123 135 129 141
154 166 179 37 89 39 49 190

Construct a frequency distribution table for these data. Then, construct frequency and relative frequency histograms for these data.

Solution:

Step 1. Find the range of the data: $R = 195 - 30 = 165$. Then, determine the number of classes (see, for example, Sturges' formula $m = 1 + 3.3 \log n$ in (2.3.2)): $m = 1 + 3.3 \log 50 \approx 6.6 \approx 7$. Last, compute the class width: $\text{class width} = R/m = 165/7 \approx 23.57 \approx 24$. As we noted earlier, the class width is always rounded up to a convenient number that is easy to work with. If the number calculated using (2.3.4) is rounded down, then some of the observations will be left out, as they will not belong to any class. Consequently, the total frequency will be less than the total count of the data. The frequency distribution table for the data in this example is shown in Table 2.4.3.

Step 2. Having completed the frequency distribution table, construct the histograms. To construct the frequency histogram, first mark the classes on the x-axis and the frequencies on the y-axis. Remember that when marking the classes and identifying the bins on the x-axis, there must be no gap between them. Then, on each class marked on the x-axis, place a rectangle, where the height of each rectangle is proportional to the frequency of the corresponding class. The frequency histogram for the data with the frequency distribution given in Table 2.4.3 is shown in Figure 2.4.5. To construct the relative frequency histogram, the scale is changed on the y-axis (see Figure 2.4.5) so that instead of plotting the frequencies, we plot relative frequencies. The resulting graph for this example, shown in Figure 2.4.6, is called the relative frequency histogram for the data with relative frequency distribution given in Table 2.4.3.
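The arithmetic of Step 1 can be checked with a short Python sketch. It assumes that Sturges' formula is used in the common form 1 + 3.3 log10(n), and that 30 and 195, the smallest and largest values in the data listing, are the extremes of the full data set. All variable names are illustrative.

```python
import math

n = 50                   # number of observations stated in the example
x_min, x_max = 30, 195   # assumed extremes, taken from the data listing

data_range = x_max - x_min                # 165
m = round(1 + 3.3 * math.log10(n))        # about 6.6, rounded to 7 classes
exact_width = data_range / m              # about 23.57
width_up = math.ceil(exact_width)         # 24
width_down = math.floor(exact_width)      # 23

# Upper limit of the last class when the first class starts at x_min
upper_if_rounded_up = x_min + m * width_up      # 198, covers 195
upper_if_rounded_down = x_min + m * width_down  # 191, misses 195

print(m, width_up, upper_if_rounded_up, upper_if_rounded_down)
```

With the width rounded down to 23, seven classes starting at 30 end at 191, so the observation 195 would belong to no class. This is the situation the text warns about when it says the total frequency would fall short of the total count.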

Table 2.4.3 Frequency distribution table for the survival time of parts.

Class        Tally          Frequency or count   Relative frequency   Cumulative frequency
[30–54)      /////          5                    5/50                 5
[54–78)      ///// /////    10                   10/50                15
[78–102)     ///// ////     9                    9/50                 24
[102–126)    ///// //       7                    7/50                 31
[126–150)    ///// /        6                    6/50                 37
[150–174)    ///// /        6                    6/50                 43
[174–198]    ///// //       7                    7/50                 50
Total                       50                   1
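The relative and cumulative frequency columns follow directly from the frequency column, as the following Python sketch shows. Only the seven frequencies are taken from Table 2.4.3. The variable names are illustrative, and exact fractions are used so that the relative frequencies sum to exactly 1.

```python
from fractions import Fraction
from itertools import accumulate

freq = [5, 10, 9, 7, 6, 6, 7]   # frequency column of Table 2.4.3
n = sum(freq)

rel_freq = [Fraction(f, n) for f in freq]   # 5/50, 10/50, ... in lowest terms
cum_freq = list(accumulate(freq))           # running totals of the frequencies

print(n)               # 50
print(cum_freq)        # [5, 15, 24, 31, 37, 43, 50]
print(sum(rel_freq))   # 1
```

The last cumulative frequency equals the total count, which is a quick check that no observation was left out of the table.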

Figure 2.4.5 Frequency histogram for survival time of parts under extraneous operating conditions.


Figure 2.4.6 Relative frequency histogram for survival time of parts under extraneous operating conditions.

Another graph that becomes the basis of probability distributions, which we will study in later chapters, is called the frequency polygon or relative frequency polygon depending on which histogram is used to construct this graph. To construct the frequency or relative frequency polygon, first mark the midpoints on the top ends of the rectangles of the corresponding histogram and then simply join these midpoints. Note that classes with zero frequencies at the lower as well as at the upper end of the histogram are included so that we can connect the polygon with the x-axis. The lines obtained by joining the midpoints are called the frequency or relative frequency polygons, as the case may be. The frequency polygon for the data in Example 2.4.5 is shown in Figure 2.4.7. As the frequency and the relative frequency histograms are identical in shape, the frequency and relative frequency polygons are also identical, except for the labeling of the y-axis.
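The vertices of a frequency polygon can be computed as in the sketch below. The helper `polygon_points` is hypothetical, and the class limits (first class starting at 30, class width 24) are an assumption of this sketch. Only the frequencies come from Table 2.4.3.

```python
def polygon_points(edges, freq):
    """Return (midpoint, frequency) vertices of a frequency polygon.

    One empty class is added on each side so the polygon starts and ends
    on the horizontal axis. Assumes equal-width classes.
    """
    width = edges[1] - edges[0]
    mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    xs = [mids[0] - width] + mids + [mids[-1] + width]
    ys = [0] + list(freq) + [0]
    return list(zip(xs, ys))


# Assumed class limits: first class starts at 30, class width 24
edges = [30 + 24 * i for i in range(8)]
freq = [5, 10, 9, 7, 6, 6, 7]   # frequency column of Table 2.4.3
points = polygon_points(edges, freq)

print(points[:3])   # [(18.0, 0), (42.0, 5), (66.0, 10)]
```

Dividing each frequency by the total count gives the vertices of the relative frequency polygon. The midpoints stay the same, so only the vertical scale changes.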

Quite often a data set consists of a large number of observations that result in a large number of classes of very small widths. In such cases, frequency polygons or relative frequency polygons become smooth curves. Figure 2.4.8 shows one such smooth curve. Such smooth curves, usually called frequency distribution curves, represent the probability distributions of continuous random variables that we study in Chapter 5. Thus, the histograms eventually become the basis for information about the probability distributions from which the sample was obtained.


Figure 2.4.7 Frequency polygon for survival time of parts under extraneous operating conditions.


Figure 2.4.8 Typical frequency distribution curve.


Figure 2.4.9 Three typical types of frequency distribution curves.

The shape of the frequency distribution curve of a data set depends on the shape of its histogram and choice of class or bin size. The shape of a frequency distribution curve can in fact be of any type, but in general, we encounter the three typical types of frequency distribution curves shown in Figure 2.4.9.

We now turn to outlining the various steps needed when using MINITAB and R.
