Audio visual

Related papers: 20

Top Cited Papers

Where am I? Scene Recognition for Mobile Robots using Audio Features

Selina Chu, Shrikanth Narayanan, C.-C. Jay Kuo, Maja Matarić

Citations: 151 • 2006

Multimodal Sparse Transformer Network for Audio-Visual Speech Recognition

Qiya Song, Bin Sun, Shutao Li

Citations: 115 • 2022

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Jiasen Lu, Christopher M. Clark, Sang-Ho Lee, Zichen Zhang, Savya Khosla, Ryan Marten, Derek Hoiem, Aniruddha Kembhavi

Citations: 55 • 2024

Audio-visual classification and detection of human manipulation actions

Alessandro Pieropan, Giampiero Salvi, Karl Pauwels, Hedvig Kjellström

Citations: 45 • 2014

SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments

Derek Hoiem, Yan Ke, Rahul Sukthankar

Citations: 45 • 2006

Audio-Visual Cross-Attention Network for Robotic Speaker Tracking

Xinyuan Qian, Zhengdong Wang, Jiadong Wang, Guohui Guan, Haizhou Li

Citations: 37 • 2022

Audio-Visual Stimuli Change not Only Robot’s Hug Impressions but Also Its Stress-Buffering Effects

Masahiro Shiomi, Norihiro Hagita

Citations: 37 • 2019

Automatic speech recognition improved by two-layered audio-visual integration for robot audition

Takami Yoshida, Kazuhiro Nakadai, Hiroshi G. Okuno

Citations: 35 • 2009

Robot musical accompaniment: integrating audio and visual cues for real-time synchronization with a human flutist

Angelica Lim, Teruhiro Mizumoto, Louis-Kenzo Cahier, T. Otsuka, Tôru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Citations: 34 • 2010

Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop

Citations: 32 • 2019

Exploiting the Complementarity of Audio and Visual Data in Multi-speaker Tracking

Yutong Ban, Laurent Girin, Xavier Alameda-Pineda, Radu Horaud

Citations: 27 • 2017

Human-Robot Interaction in Real Environments by Audio-Visual Integration

Hyun-Don Kim, JongSuk Choi, Munsang Kim

Citations: 25 • 2007

AVOT: Audio-Visual Object Tracking of Multiple Objects for Robotics

Justin Wilson, Ming C. Lin

Citations: 22 • 2020

Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning

Maximilian Du, Olivia Y. Lee, Suraj Nair, Chelsea Finn

Citations: 21 • 2022

Layered telepresence

Mhd Yamen Saraiji, Shota Sugimoto, Charith Lasantha Fernando, Kouta Minamizawa, Susumu Tachi

Citations: 20 • 2016

AVA Active Speaker: An Audio-Visual Dataset for Active Speaker Detection

Joseph Roth, Sourish Chaudhuri, Ondřej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy, Arkadiusz Stopczynski, Cordelia Schmid, Zhonghua Xi, Caroline Pantofaru

Citations: 19 • 2020

Audio-Visual Grounding Referring Expression for Robotic Manipulation

Yefei Wang, Kaili Wang, Wang Yi, Di Guo, Huaping Liu, Fuchun Sun

Citations: 18 • 2022

Far-Field Audio-Visual Scene Perception of Multi-Party Human-Robot Interaction for Children and Adults

Antigoni Tsiami, Panagiotis P. Filntisis, Niki Efthymiou, Petros Koutras, Gerasimos Potamianos, Petros Maragos

Citations: 17 • 2018

Analyzing Liquid Pouring Sequences via Audio-Visual Neural Networks

Justin Wilson, Auston Sterling, Ming C. Lin

Citations: 17 • 2019

Audio-visual keyword spotting based on adaptive decision fusion under noisy conditions for human-robot interaction

Hong Liu, Ting Fan, Pingping Wu

Citations: 16 • 2014