
Research on Digital Human Speech Recognition Method in High-Disturbance Industrial Environment

Pengyu Zhu, Xiaobin Li, Sun Haiyan, Zhuoyi Chen

Publication year
2025
Citations
2

Abstract

The rise of industrial robotics and speech technology has changed how humans and machines collaborate. This paper investigates the application and optimisation of digital human speech recognition technology in industrial contexts, with a particular focus on scenarios characterised by significant disturbances. Noise and dynamic working conditions in industrial factories challenge traditional speech recognition systems in terms of real-time performance, disturbance resistance, versatility, and accuracy. To address these challenges, this paper proposes a real-time digital human speech recognition system with high resistance to disturbance. The proposed system integrates a speech endpoint detection model, a speech denoising model, a speech-to-text recognition model, and a voiceprint recognition model. In addition, a target speech segment duration algorithm based on a Gaussian mixture model (GMM) is introduced to improve the real-time performance and robustness of digital human speech recognition. A speech-to-speech deep learning model that uses a conditional variational autoencoder (VAE) and adversarial training is proposed to address the significant decline in recognition rate in environments with a low signal-to-noise ratio (SNR). Experiments conducted in a mixed industrial environment of continuous and impulse noise show that the enhanced digital human speech recognition system raises the correct text recognition rate and the correct identity recognition rate by up to 65.23% and 61%, respectively, in a high-noise industrial environment. These results indicate that the industrial digital human speech system described in this paper is practical and valuable for deployment in industrial environments.
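
As an illustration of the low-SNR condition mentioned in the abstract, the following minimal sketch mixes a clean signal with combined continuous and impulse noise at a chosen SNR. It is not the authors' experimental setup: the synthetic sine "speech", the Gaussian noise parameters, the impulse probability, and the helper names `power`, `mix_at_snr`, and `measure_snr_db` are all assumptions made for this example. It relies only on the standard definition SNR(dB) = 10 log10(P_speech / P_noise).

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals
    `target_snr_db`, then add it to `speech`.

    Returns the noisy mixture and the scaled noise.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    # Noise power required to reach the target SNR.
    target_noise_power = power(speech) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measure_snr_db(speech, noise):
    """SNR in decibels between a clean signal and a noise signal."""
    return 10 * math.log10(power(speech) / power(noise))


# Hypothetical test signals (assumptions, not data from the paper).
rng = random.Random(0)
sample_rate = 16000
n = sample_rate  # one second of audio
speech = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(n)]
# Continuous noise: stationary Gaussian noise on every sample.
continuous = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Impulse noise: rare, high-amplitude spikes.
impulses = [rng.gauss(0.0, 20.0) if rng.random() < 0.001 else 0.0
            for _ in range(n)]
noise = [c + i for c, i in zip(continuous, impulses)]

mixed, scaled_noise = mix_at_snr(speech, noise, target_snr_db=-5.0)
print(round(measure_snr_db(speech, scaled_noise), 2))
```

A negative target such as -5 dB means the noise carries more power than the speech, which is the regime in which the abstract reports a significant decline in recognition rate.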

Keywords

Disturbance (geology), Computer science, Speech recognition
