SeRO: Self-Supervised Reinforcement Learning for Recovery from Out-of-Distribution Situations
Chan Kim, Jaekyung Cho, Christophe Bobda, Seung-Woo Seo, Seong-Woo Kim
- Year
- 2023
- Access
- Open access
Abstract
Robotic agents trained using reinforcement learning have the problem of taking unreliable actions in an out-of-distribution (OOD) state. Agents can easily become OOD in real-world environments because it is almost impossible for them to visit and learn the entire state space during training. Unfortunately, unreliable actions do not ensure that agents perform their original tasks successfully. Therefore, agents should be able to recognize whether they are in OOD states and learn how to return to the learned state distribution rather than continue to take unreliable actions. In this study, we propose a novel method for retraining agents to recover from OOD situations in a self-supervised manner when they fall into OOD states. Our in-depth experimental results demonstrate that our method substantially improves the agent's ability to recover from OOD situations in terms of sample efficiency and restoration of the performance for the original tasks. Moreover, we show that our method can retrain the agent to recover from OOD situations even when in-distribution states are difficult to visit through exploration.
Keywords
Related papers
Parallel Differentiable Reachability for Learning and Planning with Certified Neural Dynamics and Controllers
Keyi Shen, Glen Chou
2026
A deep reinforcement learning and a dynamic graph neural network-based scheduling agent to control a multi-task robot
Hedi Boukamcha, Anas Neumann, Monia Rekik +3 more
Robotics and Computer-Integrated Manufacturing · 2026
Artificial Intelligence enhanced smart welding islands: Foundation models revolutionizing manufacturing
Xiwei Wu, Wei Wu, Qiqi Chen +6 more
Robotics and Computer-Integrated Manufacturing · 2026
LLM Agent-driven Automated DFA Assessment with Fine-tuning and AAS-based RAG
Jiaxin Liu, Xiaofeng Zhou, Suyang Yu +5 more
Robotics and Computer-Integrated Manufacturing · 2026