Authors
Hansi Seitaj and Vinayak Elangovan, USA
Abstract
This research addresses the challenge of manual data extraction from product labels by combining computer vision with Natural Language Processing (NLP). We introduce an enhanced model that joins a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) into a Convolutional Recurrent Neural Network (CRNN) for reliable text recognition. The model is further refined by incorporating the Tesseract engine, which enhances its applicability to Optical Character Recognition (OCR) tasks. The methodology is augmented by NLP techniques and extended through the Open Food Facts Application Programming Interface (API) for database population and text-only label prediction. The CRNN model is trained on encoded labels and evaluated for accuracy on a dedicated test set. Importantly, our approach enables visually impaired individuals to access essential information on product labels, such as directions and ingredients. Overall, the study highlights the efficacy of deep learning and OCR in automating label extraction and recognition.
Keywords
Optical Character Recognition (OCR); Machine Vision; Machine Learning; Convolutional Recurrent Neural Network (CRNN); Natural Language Processing (NLP); Text Recognition; Text Classification; Product Labels; Deep Learning; Data Extraction