BORO-LILABOTI - Building of Robust Orders for Leveraging Isolated Letter Accumulations By Ordering Teacher Insights for Bangla Handwriting Recognition
T. Shahjahan, M. Ahsan, M. Islam, and 3 more authors
One of the key aspects of artificial intelligence is that it enables human interaction through vision and language. This has produced many useful applications, such as Optical Character Recognition (OCR). Industry has built OCR systems that recognize printed text with high accuracy. In most of these cases, the language is well resourced and poses no morphological difficulties. OCR performs poorly on morphologically rich, resource-limited text, although it has shown improvement in the context of handwritten text recognition. Morphologically rich languages, such as Bangla, Hindi, Arabic, and Hebrew, encounter class imbalance issues because of how the language is composed. In addition, low-resource languages have less data available for training, which leads to biased, non-convergent models. To address this issue, particularly for the Bangla language, we introduce an enhanced knowledge distillation-based approach: Unlocking Bias Mitigation in Bangla Handwritten Word Recognition Through Inter-Linguistic Character-Level Teacher Model Insights (BORO LILA-BOTI). This approach significantly mitigates the bias toward major classes. It comprises a Convolutional Recurrent Neural Network (CRNN) student model that gains inter-linguistic insights from languages with similar character shape patterns. To evaluate the effectiveness of our model on unseen data, we perform inter-dataset training and testing on Bangla Writing and BN-HTRD. Compared to the base (No KD) model, our approach achieves a 3.96% and 8.35% increase in the F1-micro score for minor classes on Bangla Writing and BN-HTRD, respectively. It also surpasses Super teacher LILA-BOTI, the state of the art in F1-micro score for the minor classes.
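As a rough illustration of the knowledge distillation idea mentioned in the abstract, the sketch below computes a generic temperature-softened distillation loss for a single character prediction. It is not the BORO LILA-BOTI training objective, which the abstract does not specify. The function names, the `temperature` and `alpha` parameters, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_index,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss for one prediction (illustrative only).

    Combines cross-entropy against the ground-truth class with the
    KL divergence from the softened teacher distribution to the
    softened student distribution. `alpha` weights the hard-label term.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    student_probs = softmax(student_logits)
    hard_term = -math.log(student_probs[true_index])

    # Soft-label term: KL(teacher || student) at the given temperature.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_term = sum(t * math.log(t / s)
                    for t, s in zip(teacher_soft, student_soft) if t > 0)

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term as the temperature changes.
    return alpha * hard_term + (1 - alpha) * (temperature ** 2) * soft_term


# Hypothetical logits over three character classes.
student = [1.2, 0.3, -0.5]
teacher = [2.0, 0.1, -1.0]
loss = distillation_loss(student, teacher, true_index=0)
```

In this generic form, a teacher that assigns non-trivial probability to rare classes passes that signal to the student through the soft term, which is one plausible way distillation could reduce bias toward major classes.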