HRI

Self-Supervised Voice Denoising Network for Multi-Scenario Human–Robot Interaction

Mu Li, Wenjin Xu, Chao Zeng, Ning Wang

Year published: 2025
Citations: 1
Access: Open access

Abstract

Voice-command human-robot interaction (HRI) has advanced significantly in recent years, with large Vision-Language-Action (VLA) models showing particular promise. However, these systems still struggle with environmental noise during voice interaction, and they lack a specialized denoising network for isolating commands from multiple speakers when speech overlaps. To overcome these challenges, we introduce a method for voice command-based HRI in noisy environments that leverages synthetic data and a self-supervised denoising network to improve real-world applicability. Our approach focuses on improving the self-supervised network's performance in denoising mixed-noise audio by scaling the training data. Extensive experiments show that our method outperforms existing approaches in simulation and achieves 7.5% higher accuracy than the state-of-the-art method in noisy real-world environments, enhancing voice-guided robot control.

Keywords

Noise reduction, Noise (video), Video denoising, Speech enhancement, Voice activity detection, Reduction (mathematics), Speech processing, Robot
