Multimodal Reinforcement Learning for Robots Collaborating with Humans

Afagh Mehri Shervedani, Siyu Li, Natawut Monaikul, Bahareh Abbasi, Miloš Žefran, Barbara Di Eugenio

Publication year: 2025
Citations: 1
Access: Open access

Abstract

Robot assistants for older adults and people with disabilities need to perform collaborative tasks with users effectively. The core component of these systems is an interaction manager, which observes and assesses the task and infers the state and intent of the human so that the robot can choose the best course of action. Because data in this domain are sparse, the policy for such multimodal systems is often crafted by hand; as the complexity of interactions grows, this process does not scale. This paper proposes a reinforcement learning (RL) approach to automatically generate the multimodal policy of the robot. Our system focuses on a realistic scenario in which a robot assists a user in locating objects within a home environment and must manage multimodal signals, including language and physical actions, to select the best action. In contrast to traditional dialog systems, our agent is trained with a simulator that uses human data and can deal with multiple modalities. We use a simple high-level reward function that needs no fine-tuning, and we enforce some preconditions to speed up the training process. A human study evaluating the system in a real-world setting demonstrates promising results, indicating high usability and effective task completion. This RL-based approach offers a scalable and interpretable alternative for designing interaction managers in multimodal human-robot collaborations.
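
To make the two training ingredients named in the abstract concrete (a simple high-level reward and preconditions that restrict the agent's choices), the following is a minimal illustrative sketch and not the authors' system. Everything in it is an assumption introduced for illustration: the toy task, the names `LOCATIONS`, `allowed_actions`, `step`, `train`, and `greedy_rollout`, the reward values, and the use of tabular Q-learning with preconditions implemented as an action mask. The abstract does not specify the learning algorithm, state representation, or the form of the preconditions.

```python
import random

# Hypothetical toy task (not from the paper): the user names a target location
# (standing in for the "language" signal); the robot must go there and hand over.
LOCATIONS = ["cabinet", "drawer", "shelf"]
ACTIONS = [f"goto_{loc}" for loc in LOCATIONS] + ["hand_over"]


def allowed_actions(state):
    """Assumed precondition: hand_over is only legal once the robot is somewhere."""
    _, robot_at = state
    if robot_at is None:
        return [a for a in ACTIONS if a != "hand_over"]
    return list(ACTIONS)


def step(state, action):
    """Return (next_state, reward, done) under an assumed high-level reward."""
    target, robot_at = state
    if action == "hand_over":
        # +1 for completing the task, -1 for handing over at the wrong place.
        return state, (1.0 if robot_at == target else -1.0), True
    # Moving costs a little, so shorter interactions are preferred.
    return (target, action[len("goto_"):]), -0.1, False


def train(episodes=3000, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning where exploration and bootstrapping use only legal actions."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated value, default 0.0
    for _ in range(episodes):
        state = (rng.choice(LOCATIONS), None)
        for _ in range(10):  # cap on episode length
            legal = allowed_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(legal)
            else:
                action = max(legal, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in allowed_actions(nxt))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_rollout(q, target, max_steps=10):
    """Follow the learned greedy policy from the start state and list the actions."""
    state, taken = (target, None), []
    for _ in range(max_steps):
        action = max(allowed_actions(state), key=lambda a: q.get((state, a), 0.0))
        taken.append(action)
        state, _, done = step(state, action)
        if done:
            break
    return taken


if __name__ == "__main__":
    q_table = train()
    for loc in LOCATIONS:
        print(loc, greedy_rollout(q_table, loc))
```

In this sketch the mask shrinks the set of actions the agent has to explore, which is one plausible reading of how preconditions could speed up training. The reward encodes only task success, failure, and a small per-step cost.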

Keywords

Reinforcement learning; Robotics; Artificial intelligence; Robot; Human–computer interaction; Computer science
