Audio-Visual Depth and Material Estimation for Robot Navigation
Justin Wilson, Nicholas Rewkowski, Ming C. Lin
- Publication year
- 2022
- Citation count
- 3
Abstract
Reflective and textureless surfaces such as windows, mirrors, and walls are challenging for scene reconstruction because they produce depth discontinuities and holes. We propose an audio-visual method that uses reflected sound to aid depth estimation and material classification for 3D scene reconstruction in robot navigation and AR/VR applications. A mobile phone prototype emits pulsed audio while recording video, providing the input for audio-visual classification. The reflected sound and images from the video are input to our audio (EchoCNN-A) and audio-visual (EchoCNN-AV) convolutional neural networks for surface and sound source detection, depth estimation, and material classification. Inferences from these classifications enhance 3D scene reconstructions containing open spaces and reflective surfaces through depth filtering, inpainting, and placement of unmixed sound sources in the scene. Our prototype, demos, and experimental results from real-world scenes with challenging surfaces and sounds, also validated with virtual scenes, indicate high success rates for material classification, depth estimation, and closed/open surface classification, leading to considerable improvement in 3D scene reconstruction for robot navigation.
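To illustrate why reflected sound carries depth information at all, here is a minimal sketch of classical echo time-of-flight ranging. It is not the paper's method: the abstract describes learned models (EchoCNN-A and EchoCNN-AV), whereas this example uses plain cross-correlation on a synthetic signal. The function name `echo_distance`, the pulse shape, the sample rate, and the assumption that the recording starts at the moment of emission and contains only the echo are all illustrative choices.

```python
import numpy as np

SPEED_OF_SOUND_M_S = 343.0  # approximate value for dry air near 20 °C


def echo_distance(pulse, recording, sample_rate_hz):
    """Estimate the distance to a reflector from the lag of the strongest echo.

    Assumes `recording` starts at the instant the pulse is emitted and
    contains only the reflected pulse (no direct speaker-to-mic path).
    """
    # In 'valid' mode, output index k is the correlation at a lag of k samples.
    corr = np.correlate(recording, pulse, mode="valid")
    lag_samples = int(np.argmax(np.abs(corr)))
    round_trip_s = lag_samples / sample_rate_hz
    # The sound travels to the surface and back, so halve the path length.
    return SPEED_OF_SOUND_M_S * round_trip_s / 2.0


# Synthetic demo (hypothetical values): a short windowed tone burst whose
# attenuated echo arrives 280 samples after emission at 48 kHz.
sample_rate_hz = 48_000
n = np.arange(64)
pulse = np.hanning(64) * np.sin(2 * np.pi * 4_000 * n / sample_rate_hz)

recording = np.zeros(2_048)
delay_samples = 280
recording[delay_samples:delay_samples + pulse.size] += 0.3 * pulse

distance_m = echo_distance(pulse, recording, sample_rate_hz)
print(f"estimated distance: {distance_m:.3f} m")  # roughly 1 m
```

A single delay like this gives one range value and says nothing about material. Real recordings also contain the direct path, multiple reflections, and noise, which is part of the motivation for a learned classifier over the echo rather than a single correlation peak.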