Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning

Mahmoud Salhab, Marwan Elghitany, Shameed Sait, Mohammad Abusheikh, Hasan Abusheikh

Year: 2025
Citations: 3

Abstract

Automatic speech recognition (ASR) plays a vital role in human-machine interaction across a variety of applications, including conversational agents, industrial robotics, call center automation, and automated subtitling. However, building high-performing ASR models remains challenging, especially for low-resource languages like Arabic, due to the limited availability of large, labeled speech datasets, which are expensive and time-consuming to create. In this work, we utilize weakly supervised learning to train an Arabic ASR model based on the Conformer architecture. Our model is trained from scratch on 15,000 hours of weakly annotated speech data, encompassing both Modern Standard Arabic (MSA) and Dialectal Arabic (DA), thereby removing the dependency on costly manual transcriptions. Despite lacking human-verified labels, our method achieves state-of-the-art (SOTA) performance in Arabic ASR, outperforming both open and closed-source models on standard benchmarks. This demonstrates the potential of weak supervision as a scalable and cost-effective alternative to traditional supervised learning, paving the way for enhanced ASR systems in low-resource environments.

Keywords

Modern Standard Arabic, Variety (cybernetics), Arabic, Dependency (UML), Scalability, Supervised learning
