Handbook of Intelligent Computing and Optimization for Sustainable Development
3.3.1 Obtaining the Foreground Information
The proposed approach processes an input video from the dataset frame by frame. Before a frame is processed for garment identification, it is converted from the RGB color space to the HSV color space, so that pixel intensity is separated from color information. To obtain the foreground information, we use a background subtraction model in conjunction with an object detection algorithm, namely Mask R-CNN. The background subtraction model identifies the pixels associated with non-static objects in a given frame, for example when a customer picks up a garment of interest. Because the garments worn by the customers are also part of this foreground, the Mask R-CNN model is used to detect the customers and obtain the pixels associated with them alone. These pixels are then excluded from the foreground produced by the background subtraction algorithm, which ensures that only pixels associated with the garments at the store are passed to the subsequent stages of the proposed framework.
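The masking logic described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the background model is reduced to a fixed reference frame compared on the V (intensity) channel only, the threshold of 0.1 is arbitrary, and the customer mask that Mask R-CNN would produce is hard-coded. All function and variable names are hypothetical.

```python
import colorsys

def rgb_to_hsv_frame(frame):
    """Convert rows of (r, g, b) pixels in 0..255 to (h, s, v) pixels in 0..1."""
    return [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for (r, g, b) in row]
            for row in frame]

def foreground_mask(frame_hsv, background_hsv, threshold=0.1):
    """Flag pixels whose V channel differs from the reference background.

    Assumption: a single static reference frame stands in for the
    background subtraction model, and only intensity is compared.
    """
    return [[abs(p[2] - q[2]) > threshold for p, q in zip(row, bg_row)]
            for row, bg_row in zip(frame_hsv, background_hsv)]

def exclude_customers(fg_mask, customer_mask):
    """Keep foreground pixels that do not belong to a detected customer."""
    return [[fg and not cust for fg, cust in zip(fg_row, cust_row)]
            for fg_row, cust_row in zip(fg_mask, customer_mask)]

# Toy 2x3 scene: a uniform grey background.
grey = (100, 100, 100)
background = [[grey, grey, grey],
              [grey, grey, grey]]

# Current frame: a red garment being moved at (0, 1) and a customer's
# blue shirt at (1, 2).
frame = [[grey, (200, 30, 30), grey],
         [grey, grey, (30, 30, 200)]]

# Stand-in for the per-pixel customer mask an instance segmentation
# model such as Mask R-CNN would return.
customer_mask = [[False, False, False],
                 [False, False, True]]

fg = foreground_mask(rgb_to_hsv_frame(frame), rgb_to_hsv_frame(background))
garment_mask = exclude_customers(fg, customer_mask)
print(garment_mask)  # [[False, True, False], [False, False, False]]
```

Both the moved garment and the customer's shirt appear in the raw foreground, but only the garment pixel survives once the customer mask is subtracted. A practical pipeline would operate on image arrays and use an adaptive background model rather than nested lists and a fixed reference frame.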
Figure 3.1 Architecture of proposed framework.