首页 /研究 /Detection and Recognition of Bilingual Urdu and English Text in Natural Scene Images Using a Convolutional Neural Network–Recurrent Neural Network Combination with a Connectionist Temporal Classification Decoder
LEARNING

Detection and Recognition of Bilingual Urdu and English Text in Natural Scene Images Using a Convolutional Neural Network–Recurrent Neural Network Combination with a Connectionist Temporal Classification Decoder

Khadiza Tul Kubra, Muhammad Umair, Muhammad Zubair, Muhammad Naseem, Chan-Su Lee

发表年份
2025
引用次数
2
访问权限
开放获取

摘要

Urdu and English are widely used for visual text communications worldwide in public spaces such as signboards and navigation boards. Text in such natural scenes contains useful information for modern-era applications such as language translation for foreign visitors, robot navigation, and autonomous vehicles, highlighting the importance of extracting these texts. Previous studies focused on Urdu alone or printed text pasted manually on images and lacked sufficiently large datasets for effective model training. Herein, a pipeline for Urdu and English (bilingual) text detection and recognition in complex natural scene images is proposed. Additionally, a unilingual dataset is converted into a bilingual dataset and augmented using various techniques. For implementations, a customized convolutional neural network is used for feature extraction, a recurrent neural network (RNN) is used for feature learning, and connectionist temporal classification (CTC) is employed for text recognition. Experiments are conducted using different RNNs and hidden units, which yield satisfactory results. Ablation studies are performed on the two best models by eliminating model components. The proposed pipeline is also compared to existing text detection and recognition methods. The proposed models achieved average accuracies of 98.5% for Urdu character recognition, 97.2% for Urdu word recognition, and 99.2% for English character recognition.

关键词

Computer scienceArtificial intelligenceConvolutional neural networkPipeline (software)ConnectionismFeature (linguistics)Speech recognitionFeature extractionPattern recognition (psychology)Recurrent neural network

相关论文

查看 LEARNING 分类全部论文