Читать книгу Advanced Analytics and Deep Learning Models - Группа авторов - Страница 41
2.3.4.3 Removing Outliers
ОглавлениеOutliers are data points or errors, which represent extreme variations in our dataset. There are techniques to detect outlier; one of them is by visualization. We can graph box plot or scatter plot and, from the patterns, draw inference.
In BHK, there are some flat whose average area of one room is larger, which appears unusual, whereas in some instances, the number of bathroom is larger than number of rooms in the house, hence affecting the result.
The scatter chart was plotted to visualize price per square feet for 2 BHK and 3 BHK properties. Here, the blue points represent the 2 BHK and red points as 3 BHK plots. Based on Figures 2.5 through 2.8 the outliers was remove from the Hebbal region using the “remove bhk outliers function”.
Figure 2.5 Bath visualization.
Figure 2.6 BHK visualization.
Figure 2.7 Scatter plot for 2 and 3 BHK flat for total square feet.
Figure 2.8 Scatter plot for 2 And 3 BHK flat for total square feet after removing outliers.