Physics-informed reward shaped reinforcement learning control of a robot manipulator
Raouf Fareh, Tanjulee Siddique, Kheireddine Choutri, Dmitry V. Dylov
- Year: 2025
- Citations: 4
Abstract
Reinforcement Learning (RL)-based control plays a pivotal role in developing adaptive robotic systems capable of optimal decision-making in dynamic and uncertain environments. Because it learns directly from interaction, RL is particularly effective for complex control tasks where traditional model-based approaches often fall short. However, conventional RL algorithms typically converge slowly and may lack stability guarantees, especially in continuous control applications. To address these limitations, this paper proposes a novel RL-based control framework for a 4-DOF robotic manipulator, leveraging an improved Physics-Informed Deep Deterministic Policy Gradient (PI-DDPG) agent for precise trajectory tracking. The proposed agent integrates Physics-Informed Neural Networks (PINNs) into the RL architecture, allowing it to exploit prior knowledge of the system's dynamics to significantly accelerate convergence. Furthermore, a Lyapunov-based reward shaping strategy is introduced to enhance the stability and reliability of the learning process without compromising optimality. An adaptive noise generation technique is also proposed to dynamically regulate exploration, improving the agent's ability to discover effective control actions. The effectiveness of the PI-DDPG framework is validated through both simulation and real-world experiments. Results show a 20–30% improvement in tracking accuracy and a threefold reduction in convergence time compared to baseline RL approaches, demonstrating the potential of physics-informed reinforcement learning for high-performance and robust robotic control.
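The abstract does not give the form of the Lyapunov-based shaping term. One common way to shape a reward without changing the optimal policy is potential-based shaping, and the sketch below uses that technique with the potential set to the negative of a quadratic Lyapunov candidate built from the joint tracking error. The function names, the gain `k`, the discount `gamma`, and the quadratic form of `V` are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def lyapunov_candidate(error: Sequence[float],
                       error_rate: Sequence[float],
                       k: float = 1.0) -> float:
    """Assumed quadratic candidate V = 0.5 * (k * |e|^2 + |e_dot|^2).

    V is non-negative and equals zero only when both the tracking error
    and its rate are zero.
    """
    pos_term = sum(e * e for e in error)
    vel_term = sum(d * d for d in error_rate)
    return 0.5 * (k * pos_term + vel_term)


def shaped_reward(base_reward: float,
                  v_now: float,
                  v_next: float,
                  gamma: float = 0.99) -> float:
    """Potential-based shaping with potential Phi(s) = -V(s).

    The shaping term is gamma * Phi(s') - Phi(s) = V(s) - gamma * V(s'),
    so transitions that decrease V earn a bonus and transitions that
    increase V are penalised.
    """
    return base_reward + (v_now - gamma * v_next)


# Hypothetical 4-joint example: the tracking error halves in one step.
e_now = [0.2, -0.1, 0.0, 0.0]
e_next = [0.1, -0.05, 0.0, 0.0]
zero_rate = [0.0, 0.0, 0.0, 0.0]

v_now = lyapunov_candidate(e_now, zero_rate)
v_next = lyapunov_candidate(e_next, zero_rate)
reward = shaped_reward(base_reward=-1.0, v_now=v_now, v_next=v_next)
```

In this sketch the shaped reward is larger than the base reward whenever the candidate `V` decreases along the transition, which is the sense in which the shaping term encourages stable tracking behaviour.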