HRI

Self-Supervised Voice Denoising Network for Multi-Scenario Human–Robot Interaction

Mu Li, Wenjin Xu, Chao Zeng, Ning Wang

Publication year
2025
Citations
1
Access
Open access

Abstract

Human–robot interaction (HRI) via voice command has advanced significantly in recent years, with large Vision-Language-Action (VLA) models showing particular promise. However, these systems still struggle with environmental noise during voice interaction and lack a specialized denoising network for isolating commands from multiple speakers in overlapping-speech scenarios. To overcome these challenges, we introduce a method for voice-command-based HRI in noisy environments that leverages synthetic data and a self-supervised denoising network to improve real-world applicability. Our approach focuses on improving the self-supervised network's performance on mixed-noise audio by scaling the training data. Extensive experiments show that our method outperforms existing approaches in simulation and achieves 7.5% higher accuracy than the state-of-the-art method in noisy real-world environments, enhancing voice-guided robot control.

Keywords

Noise reduction, Noise (video), Video denoising, Speech enhancement, Voice activity detection, Reduction (mathematics), Speech processing, Robot
