Effective Multimodal Reinforcement Learning with Modality Alignment and Importance Enhancement
Jinming Ma, Feng Wu, Yingfeng Chen, Xianpeng Ji, Yu Ding
- Year
- 2023
- Access
- Open access
Abstract
Many real-world applications require an agent to make robust and deliberate decisions with multimodal information (e.g., robots with multi-sensory inputs). However, it is very challenging to train the agent via reinforcement learning (RL) due to the heterogeneity and dynamic importance of different modalities. Specifically, we observe that these issues make conventional RL methods difficult to learn a useful state representation in the end-to-end training with multimodal information. To address this, we propose a novel multimodal RL approach that can do multimodal alignment and importance enhancement according to their similarity and importance in terms of RL tasks respectively. By doing so, we are able to learn an effective state representation and consequentially improve the RL training process. We test our approach on several multimodal RL domains, showing that it outperforms state-of-the-art methods in terms of learning speed and policy quality.
Keywords
Related papers
Parallel Differentiable Reachability for Learning and Planning with Certified Neural Dynamics and Controllers
Keyi Shen, Glen Chou
2026
Artificial Intelligence enhanced smart welding islands: Foundation models revolutionizing manufacturing
Xiwei Wu, Wei Wu, Qiqi Chen +6 more
Robotics and Computer-Integrated Manufacturing · 2026
A deep reinforcement learning and a dynamic graph neural network-based scheduling agent to control a multi-task robot
Hedi Boukamcha, Anas Neumann, Monia Rekik +3 more
Robotics and Computer-Integrated Manufacturing · 2026
LLM Agent-driven Automated DFA Assessment with Fine-tuning and AAS-based RAG
Jiaxin Liu, Xiaofeng Zhou, Suyang Yu +5 more
Robotics and Computer-Integrated Manufacturing · 2026