Industrial Data Analytics for Diagnosis and Prognosis - Yong Chen

Relationship Between Two Categorical Variables – Mosaic Plot


We can use a mosaic plot to see how the values of two categorical variables are related to each other. Figure 2.6 shows a mosaic plot of fuel.type and aspiration from the auto_spec data set, drawn with the following R code.


Figure 2.6 Mosaic plot for fuel type and aspiration.

mosaicplot(fuel.type ~ aspiration, data = auto.spec.df,
           xlab = "Fuel Type", ylab = "Aspiration",
           color = c("green", "blue"),
           main = "Mosaic Plot")

In a mosaic plot, the height of a bar represents the percentage of observations taking each value of the variable on the vertical axis, given a fixed value of the variable on the horizontal axis. For example, in Figure 2.6 the bar for turbo aspiration is much taller when the fuel type is diesel than when it is gas, which means that a higher percentage of diesel cars than of gasoline cars use turbo aspiration. The width of a bar corresponds to the frequency, or number of observations, for each value of the variable on the horizontal axis. For example, in Figure 2.6 the bars for the gas fuel type are much wider than those for the diesel fuel type, indicating that far more cars in the data set run on gasoline than on diesel.
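The quantities a mosaic plot encodes can also be computed directly as a check on what the plot shows. The sketch below is only an illustration: it uses a small made-up data frame, toy.df, whose counts are hypothetical and are not the auto_spec values. It assumes the horizontal-axis variable is placed in the rows of the table, so that the row totals correspond to bar widths and the row-wise proportions correspond to bar heights.

```r
# Toy data for illustration only; these counts are made up and are
# NOT taken from the auto_spec data set.
toy.df <- data.frame(
  fuel.type  = c(rep("diesel", 4), rep("gas", 16)),
  aspiration = c(rep("std", 1), rep("turbo", 3),
                 rep("std", 13), rep("turbo", 3))
)

# Cross-tabulate: rows are fuel type (horizontal axis),
# columns are aspiration (vertical axis).
tab <- table(toy.df$fuel.type, toy.df$aspiration)

# Bar widths follow the marginal counts of the horizontal-axis variable.
widths <- margin.table(tab, 1)

# Bar heights follow the proportions of aspiration within each fuel type
# (each row sums to 1).
heights <- prop.table(tab, 1)

print(widths)
print(heights)
```

Applied to the real data, the same two calls on table(auto.spec.df$fuel.type, auto.spec.df$aspiration) should give the numbers behind the widths and heights seen in Figure 2.6.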

