Sound Source Localization Using Deep Learning for Human–Robot Interaction Under Intelligent Robot Environments

Hollinger Jo, Taewan Kim, Keun-Chang Kwak

Publication year: 2025
Citations: 8
Access: Open access

Abstract

In this paper, we propose a sound source localization (SSL) method using deep learning for human–robot interaction (HRI) in intelligent robot environments. The proposed method consists of three steps. The first step preprocesses the sound signal to minimize noise and reverberation in the robotic environment. Excitation source information (ESI), which contains only the original components of the sound source, is extracted from the signals captured by a microphone array mounted on a robot in order to minimize background influence. Here, the linear prediction residual is used as the ESI. The cross-correlation signal between each adjacent microphone pair is then calculated from the ESI signals. To minimize the influence of noise, the generalized cross-correlation with phase transform (GCC-PHAT) algorithm is used. In the second step, we design a single-channel, multi-input convolutional neural network that independently learns, for each adjacent microphone pair, the relationship between the calculated cross-correlation signal and the location of the sound source, using the time difference of arrival. The third step classifies the location of the sound source with the trained network. Previous studies have mostly stacked various features into multiple input channels, which makes the algorithm complex. Furthermore, multi-channel inputs may not be sufficient for the network to clearly learn the interrelationship between the microphone signals. To address this issue, only the cross-correlation signal between each adjacent microphone pair is used as the network input. The proposed method was verified on the Electronics and Telecommunications Research Institute Sound Source Localization (ETRI-SSL) database, which was acquired in a robotic environment. The experimental results showed that the proposed method performed 8.75% better than previous works.
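The preprocessing step described in the abstract (linear prediction residual as ESI, followed by GCC-PHAT between a microphone pair) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `lp_residual` and `gcc_phat`, the prediction order of 12, the autocorrelation method for the LP coefficients, and the small regularization constants are all assumptions made for the example.

```python
import numpy as np


def lp_residual(x, order=12):
    """Linear prediction residual (used here as a stand-in for the ESI).

    Assumption: LP coefficients come from the autocorrelation method;
    the abstract does not say which estimator the authors use.
    """
    x = np.asarray(x, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with a tiny ridge for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Inverse filter A(z) = 1 - sum_k a_k z^{-k}; its output is the residual.
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(x, inverse_filter)[: len(x)]


def gcc_phat(sig, ref, fs, max_tau=None):
    """GCC-PHAT between two channels.

    Returns (delay_in_seconds, cross_correlation). A positive delay means
    `sig` arrives later than `ref`.
    """
    n = len(sig) + len(ref)  # zero-pad so the correlation is linear
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so index 0 is lag -max_shift and the centre is lag 0.
    cc = np.roll(cc, max_shift)[: 2 * max_shift + 1]
    delay = (np.argmax(np.abs(cc)) - max_shift) / fs
    return delay, cc
```

For an array with several microphones, one would call `gcc_phat(lp_residual(ch_a), lp_residual(ch_b), fs)` for each adjacent pair and feed each returned cross-correlation vector to its own network input; the network architecture itself is not specified in enough detail here to sketch.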

Keywords

Acoustic source localization; Computer science; Microphone array; Robot; Reverberation; SIGNAL (programming language); Multilateration; Microphone; Cross-correlation; Noise (video)