Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP - Bhisham C. Gupta, Irwin Guttman

Solution: MINITAB


1 Enter the data in column C1 of the Worksheet Window and name it Categories.

2 From the Menu bar, select Stat > Tables > Tally Individual Variables.

3 In this dialog box, enter C1 in the box under Variables.

4 Check all the boxes under Display and click OK.

5 The frequency distribution table as shown below appears in the Session window.

This frequency distribution table may also be obtained by using R as follows:

USING R

R has a built-in ‘table()’ function that returns the basic frequency distribution of categorical data. To get the cumulative frequencies, we can apply the built-in ‘cumsum()’ function to the tabulated frequencies. Then, using the ‘cbind()’ function, we combine the frequencies, cumulative frequencies, and cumulative percentages to build the final distribution table; the categories appear as its row names. In addition, we can use the ‘colnames()’ function to name the columns of the final table as needed. The task can be completed by running the following R code in the R Console window.

#Assign given data to the variable data
data = c(4,3,5,3,4,1,2,3,4,3,1,5,3,4,2,1,1,4,5,3,2,5,2,5,2,1,2,3,3,2,
         1,5,3,2,1,1,2,1,2,4,5,3,5,1,3,1,2,1,4,1,4,5,4,1,1,2,4,1,4,1,2,4,3,4,1,
         4,1,4,1,2,1,5,3,1,5,2,1,2,3,1,2,2,1,1,2,1,5,3,2,5,5,2,5,3,5,2,3,2,3,5,
         2,3,5,5,2,3,2,5,1,4)

#To get frequencies
data.freq = table(data)

#To combine necessary columns
freq.dist = cbind(data.freq, cumsum(data.freq), 100*cumsum(data.freq)/sum(data.freq))

#To name the table columns
colnames(freq.dist) = c('Frequency','Cum.Frequency','Cum Percentage')
freq.dist

#R output

  Frequency Cum.Frequency Cum Percentage
1     28.00         28.00          25.45
2     26.00         54.00          49.09
3     20.00         74.00          67.27
4     16.00         90.00          81.82
5     20.00        110.00         100.00

Note that a quantitative data set sometimes consists of only a few distinct observations that occur repeatedly. Data of this kind are usually summarized in the same manner as categorical data, with the distinct observations serving as the categories. We illustrate this scenario with the following example.

Example 2.3.3 (Hospital data) The following data show the number of coronary artery bypass graft surgeries performed at a hospital in a 24‐hour period for each of the last 50 days. Bypass surgeries are usually performed when a patient has multiple blockages or when the left main coronary artery is blocked. Construct a frequency distribution table for these data.

1 2 1 5 4 2 3 1 5 4 3 4 6 2 3 3 2 2 3 5 2 5 3 4 3
1 3 2 2 4 2 6 1 2 6 6 1 4 5 4 1 4 2 1 2 5 2 2 4 3

Solution: In this example, the variable of interest is the number of bypass surgeries performed at a hospital in a period of 24 hours. Following the discussion in Example 2.3.1, we obtain the frequency distribution table for these data shown in Table 2.3.3. A frequency distribution table in which each category is defined by a single numerical value is usually called a single‐valued frequency distribution table.

Table 2.3.3 Frequency distribution table for the hospital data.

Categories  Tally             Frequency  Cumulative  Percentage  Cumulative
                              or count   frequency               percentage
1           ///// ///         8          8           16.00       16.00
2           ///// ///// ////  14         22          28.00       44.00
3           ///// ////        9          31          18.00       62.00
4           ///// ////        9          40          18.00       80.00
5           ///// /           6          46          12.00       92.00
6           ////              4          50          8.00        100.00
Total                         50                     100.00
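
The numerical columns of Table 2.3.3 can also be reproduced in R. The following is a minimal sketch that follows the same ‘table()’, ‘cumsum()’, and ‘cbind()’ approach used earlier for categorical data; the variable names surgeries, freq, pct, and hospital.dist are illustrative choices and do not come from the text.

```r
# Hospital data from Example 2.3.3: number of bypass surgeries per day for 50 days
surgeries <- c(1,2,1,5,4,2,3,1,5,4,3,4,6,2,3,3,2,2,3,5,2,5,3,4,3,
               1,3,2,2,4,2,6,1,2,6,6,1,4,5,4,1,4,2,1,2,5,2,2,4,3)

# Count how often each distinct value occurs; the distinct values act as categories
freq <- table(surgeries)

# Percentage of the 50 observations that fall in each category
pct <- 100 * as.vector(freq) / sum(freq)

# Combine counts, running totals, and percentages into one table
hospital.dist <- cbind(Frequency      = as.vector(freq),
                       Cum.Frequency  = cumsum(as.vector(freq)),
                       Percentage     = pct,
                       Cum.Percentage = cumsum(pct))

# Label the rows with the distinct observed values (1 through 6)
rownames(hospital.dist) <- names(freq)
hospital.dist
```

The tally marks are omitted because they only restate the counts. The last row of the Cum.Frequency and Cum.Percentage columns should equal the number of observations and 100, respectively, which is a quick check that no observation was dropped.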