Machine Vision Inspection Systems, Machine Learning-Based Approaches - Group of authors - Page 36
2.5.2 Challenges and Future Research Directions
Although the proposed solution achieved more than 50% accuracy, the general threshold for the tested languages, on most of the alphabet types in the Omniglot dataset, it reached that accuracy using only a small set of images. This limitation can be overcome by using handcrafted features, although designing them is time-consuming.
In the proposed capsule layers-based Siamese network model, the accuracy of within-language classification depends on two factors: the number of characters in the alphabet and the visual difference between characters. Some alphabets have visually similar characters; in such cases, classification accuracy is low even though the alphabet contains few characters. The system architecture could therefore be improved by representing the image features through transfer learning: features can be extracted from each character image using a pre-trained deep neural network, and those features can then be passed to the Siamese network.
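The transfer-learning idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `make_frozen_extractor` and `siamese_distance` are hypothetical names, and a fixed random projection stands in for a real pre-trained deep network so that the example needs nothing beyond the Python standard library. The point it shows is structural: both branches of the Siamese comparison share one frozen extractor, and only the extracted features are compared.

```python
import math
import random
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]
Features = List[float]


def make_frozen_extractor(in_dim: int, out_dim: int,
                          seed: int = 0) -> Callable[[Image], Features]:
    """Hypothetical stand-in for a pre-trained network.

    A fixed random projection followed by tanh plays the role of the frozen
    feature extractor. A real system would load pre-trained weights instead.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def extract(image: Image) -> Features:
        # Flatten the 2-D character image into a single pixel vector.
        pixels = [float(p) for row in image for p in row]
        if len(pixels) != in_dim:
            raise ValueError(f"expected {in_dim} pixels, got {len(pixels)}")
        # The weights are never updated here: the extractor stays frozen.
        return [math.tanh(sum(w * p for w, p in zip(row, pixels)))
                for row in weights]

    return extract


def siamese_distance(extract: Callable[[Image], Features],
                     image_a: Image, image_b: Image) -> float:
    """Run both images through the same extractor and compare the features.

    Sharing one extractor between the two branches is what makes the
    comparison Siamese. Euclidean distance is an assumption of this sketch.
    """
    features_a = extract(image_a)
    features_b = extract(image_b)
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(features_a, features_b)))


if __name__ == "__main__":
    extract = make_frozen_extractor(in_dim=16, out_dim=8)
    glyph_a = [[0, 1, 1, 0], [1, 0, 0, 1], [1, 1, 1, 1], [1, 0, 0, 1]]
    glyph_b = [[1, 1, 1, 0], [1, 0, 0, 1], [1, 1, 1, 0], [1, 0, 0, 1]]
    print("distance:", siamese_distance(extract, glyph_a, glyph_b))
```

In a fuller version, the feature vectors would be fed to trainable layers (for example the capsule layers of the proposed model) rather than compared directly, and a small distance would be read as "same character".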
This study can be extended by integrating the model into a complete OCR pipeline that incorporates a character segmentation and reconstruction algorithm. It is also possible to analyse the applicability of the proposed model to complex datasets such as ImageNet [40] and COCO [41] by deepening the Siamese network. Additionally, the knowledge learnt from printed character classification can be used to classify handwritten characters. Further, classification accuracy could be improved by training the network on printed characters in the initial stages and then on handwritten characters. This would allow the network to learn the defining attributes of each character, and such a dataset can be generated easily.
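The staged training suggested above (printed characters first, handwritten characters afterwards) could be organised along the following lines. This is a minimal sketch, assuming a generic training loop: `staged_batches` is a hypothetical helper, and the epoch counts, batch size and shuffling are illustrative choices rather than values from the study.

```python
import random
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def staged_batches(printed: Sequence[T], handwritten: Sequence[T],
                   printed_epochs: int, handwritten_epochs: int,
                   batch_size: int,
                   seed: int = 0) -> Iterator[Tuple[str, List[T]]]:
    """Yield (stage_name, batch) pairs: all printed epochs come first,
    followed by all handwritten epochs."""
    rng = random.Random(seed)
    stages = [
        ("printed", printed, printed_epochs),
        ("handwritten", handwritten, handwritten_epochs),
    ]
    for stage_name, samples, epochs in stages:
        for _ in range(epochs):
            # Reshuffle the samples of the current stage every epoch.
            order = list(samples)
            rng.shuffle(order)
            for start in range(0, len(order), batch_size):
                yield stage_name, order[start:start + batch_size]


if __name__ == "__main__":
    printed = ["p1", "p2", "p3"]      # placeholder sample identifiers
    handwritten = ["h1", "h2"]
    for stage, batch in staged_batches(printed, handwritten,
                                       printed_epochs=2,
                                       handwritten_epochs=1,
                                       batch_size=2):
        # A real loop would call the network's training step on each batch.
        print(stage, batch)
```

Keeping the schedule separate from the model makes it easy to vary how long the network sees printed characters before handwritten ones are introduced.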