Effects of Sampling and Prediction Horizon in Reinforcement Learning
Pavel Osinenko, Dmitrii Dobriborsci
- Publication year: 2021
- Citations: 8
- Access: Open access
Abstract
Plain reinforcement learning (RL) may be prone to loss of convergence, constraint violation, unexpected performance, etc. Commonly, RL agents undergo extensive learning stages to achieve acceptable functionality. This is in contrast to classical control algorithms, which are typically model-based. One direction of research is the fusion of RL with such algorithms, especially model-predictive control (MPC). This, however, introduces new hyper-parameters related to the prediction horizon. Furthermore, RL is usually concerned with Markov decision processes, but most real environments are not time-discrete. The actual physical setting of RL consists of a digital agent and a time-continuous dynamical system. There is thus, in fact, yet another hyper-parameter: the agent sampling time. In this paper, we investigate the effects of prediction horizon and sampling time on two hybrid RL-MPC agents in a case study of mobile robot parking, which is in turn a canonical control problem. We benchmark the agents against a simple variant of MPC. The sampling time showed a kind of “sweet spot” behavior, whereas the RL agents demonstrated merits at shorter horizons.
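To make the two hyper-parameters concrete, the following is a minimal sketch of a digital agent acting on a time-continuous system. It is not the agents studied in the paper: it assumes a unicycle-type robot model, a small hypothetical discrete action set, a hand-picked quadratic stage cost, and a brute-force MPC that enumerates all action sequences over the horizon. All function and variable names are illustrative assumptions.

```python
import itertools
import math

# Hypothetical discrete action set: (linear velocity, angular velocity).
CANDIDATE_ACTIONS = [(v, w) for v in (-1.0, 0.0, 1.0) for w in (-1.0, 0.0, 1.0)]


def unicycle_rhs(state, action):
    """Continuous-time kinematics (assumed model): state (x, y, theta)."""
    _, _, theta = state
    v, omega = action
    return (v * math.cos(theta), v * math.sin(theta), omega)


def hold_and_integrate(state, action, sampling_time, substeps=20):
    """Advance the continuous system over one sampling period.

    The action is held constant (zero-order hold) while the dynamics are
    integrated with explicit Euler substeps finer than the sampling time.
    """
    h = sampling_time / substeps
    for _ in range(substeps):
        rate = unicycle_rhs(state, action)
        state = tuple(s + h * r for s, r in zip(state, rate))
    return state


def stage_cost(state, action):
    """Assumed quadratic cost penalizing distance to the origin and effort."""
    x, y, theta = state
    v, omega = action
    return x * x + y * y + 0.1 * theta * theta + 0.01 * (v * v + omega * omega)


def mpc_action(state, sampling_time, horizon):
    """Brute-force MPC: try every action sequence of length `horizon`
    and return the first action of the cheapest predicted rollout."""
    best_cost, best_first = math.inf, None
    for sequence in itertools.product(CANDIDATE_ACTIONS, repeat=horizon):
        predicted, cost = state, 0.0
        for action in sequence:
            predicted = hold_and_integrate(predicted, action, sampling_time)
            cost += stage_cost(predicted, action)
        if cost < best_cost:
            best_cost, best_first = cost, sequence[0]
    return best_first


def simulate(state, sampling_time, horizon, duration):
    """Closed loop: the agent acts once per sampling period."""
    total = 0.0
    for _ in range(round(duration / sampling_time)):
        action = mpc_action(state, sampling_time, horizon)
        state = hold_and_integrate(state, action, sampling_time)
        total += stage_cost(state, action) * sampling_time
    return state, total


if __name__ == "__main__":
    final_state, accumulated = simulate((1.0, 0.0, 0.0), 0.1, 2, 1.5)
    print(final_state, accumulated)
```

Calling `simulate` with different `sampling_time` and `horizon` values gives a crude way to explore how the two settings interact in this toy setup. The enumeration grows as 9^horizon, so the sketch is only practical for very short horizons.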