Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning
Mahmoud Salhab, Marwan Elghitany, Shameed Sait, Mohammad Abusheikh, Hasan Abusheikh
- Year: 2025
Abstract
Automatic speech recognition (ASR) plays a vital role in human-machine interaction across a variety of applications, including conversational agents, industrial robotics, call center automation, and automated subtitling. However, building high-performing ASR models remains challenging, especially for low-resource languages like Arabic, because large labeled speech datasets are scarce and are expensive and time-consuming to create. In this work, we use weakly supervised learning to train an Arabic ASR model based on the Conformer architecture. Our model is trained from scratch on 15,000 hours of weakly annotated speech data, encompassing both Modern Standard Arabic (MSA) and Dialectal Arabic (DA), thereby removing the dependency on costly manual transcriptions. Despite lacking human-verified labels, our method achieves state-of-the-art (SOTA) performance in Arabic ASR, outperforming both open-source and closed-source models on standard benchmarks. This demonstrates the potential of weak supervision as a scalable and cost-effective alternative to traditional supervised learning, paving the way for improved ASR systems in low-resource settings.