
HRI

Robot to Human Interaction with multi-modal conversational engagement

Giuseppe De Simone, Luca Greco, Alessia Saggese, Mario Vento

Year: 2025
Citations: 2

Abstract

To increase the social acceptability of social robots, it is important to foster more natural and effective interactions. To this end, the robot needs Multi-Engagement capabilities: it must understand if and when people in the scene want to interact with it, and detect specifically who wants to interact. To address this challenge, this paper proposes a multimodal architecture that exploits both audio and video modalities, combining a deep-learning-based head pose estimator with an active speaker detector. Engagement is then modeled through a Behavior Tree, which takes into account the history of engagements with users and selects an appropriate robot behavior accordingly. The proposed architecture has been integrated into a social robot within a ROS framework and runs directly on board an embedded NVIDIA Jetson device. Experimental results show that it achieves an F-Score of 0.82 on the widely adopted UE-HRI dataset. Additionally, the system’s human-robot interaction capabilities were assessed through surveys collected in real-world settings, which indicated a positive impact on user engagement and interaction quality.
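The abstract does not give implementation details, so the following is only a minimal sketch of how a head pose cue and an active speaker cue could be fused through a behavior tree that keeps a per-user engagement history. Every name here (`Observation`, `Blackboard`, `decide`), the angle thresholds, and the action labels are hypothetical and chosen for illustration; they are not taken from the paper or from any ROS package.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

SUCCESS, FAILURE = "SUCCESS", "FAILURE"
MAX_YAW_DEG = 20.0    # assumed threshold, not from the paper
MAX_PITCH_DEG = 20.0  # assumed threshold, not from the paper


@dataclass
class Observation:
    person_id: str
    yaw_deg: float    # head yaw, 0 means facing the robot (assumed convention)
    pitch_deg: float  # head pitch, 0 means facing the robot
    speaking: bool    # stand-in for an active speaker detector output


@dataclass
class Blackboard:
    obs: Observation
    history: Dict[str, int] = field(default_factory=dict)  # engaged frames per person
    action: str = "idle"


class Condition:
    def __init__(self, fn: Callable[[Blackboard], bool]):
        self.fn = fn

    def tick(self, bb: Blackboard) -> str:
        return SUCCESS if self.fn(bb) else FAILURE


class Action:
    def __init__(self, fn: Callable[[Blackboard], None]):
        self.fn = fn

    def tick(self, bb: Blackboard) -> str:
        self.fn(bb)
        return SUCCESS


class Sequence:
    """Succeeds only if every child succeeds, in order."""
    def __init__(self, *children):
        self.children = children

    def tick(self, bb: Blackboard) -> str:
        for child in self.children:
            if child.tick(bb) == FAILURE:
                return FAILURE
        return SUCCESS


class Selector:
    """Succeeds as soon as one child succeeds."""
    def __init__(self, *children):
        self.children = children

    def tick(self, bb: Blackboard) -> str:
        for child in self.children:
            if child.tick(bb) == SUCCESS:
                return SUCCESS
        return FAILURE


def facing(bb: Blackboard) -> bool:
    return abs(bb.obs.yaw_deg) <= MAX_YAW_DEG and abs(bb.obs.pitch_deg) <= MAX_PITCH_DEG


def was_engaged(bb: Blackboard) -> bool:
    return bb.history.get(bb.obs.person_id, 0) > 0


def mark_engaged(bb: Blackboard) -> None:
    pid = bb.obs.person_id
    bb.history[pid] = bb.history.get(pid, 0) + 1
    bb.action = "greet" if bb.history[pid] == 1 else "continue"


def hold(bb: Blackboard) -> None:
    bb.action = "hold"  # user still looks at the robot but is silent


def reset(bb: Blackboard) -> None:
    bb.history[bb.obs.person_id] = 0
    bb.action = "idle"


TREE = Selector(
    Sequence(Condition(facing), Condition(lambda bb: bb.obs.speaking), Action(mark_engaged)),
    Sequence(Condition(was_engaged), Condition(facing), Action(hold)),
    Action(reset),
)


def decide(obs: Observation, history: Dict[str, int]) -> str:
    bb = Blackboard(obs=obs, history=history)
    TREE.tick(bb)
    return bb.action
```

In this sketch, a person who faces the robot and speaks is greeted on the first frame and continued with afterwards; a previously engaged person who falls silent but keeps looking is held; anyone else resets to idle. A real system would replace the boolean cues with detector outputs and would likely smooth them over time.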

Keywords

Human–robot interaction, Modal, Computer science, Human–computer interaction, Robot, Artificial intelligence

