Global structure of policy search spaces for reinforcement learning
Belinda Stapelberg, Katherine M. Malan
- Publication year
- 2019
- Citations
- 3
Abstract
Reinforcement learning is gaining prominence in the machine learning community. It dates back over three decades in areas such as cybernetics and psychology, but has more recently been applied widely in robotics, game playing and control systems. There are many approaches to reinforcement learning, most of which are based on the Markov decision process model. The goal of reinforcement learning is to learn the best strategy (referred to as a policy in reinforcement learning) of an agent interacting with its environment in order to reach a specified goal. Recently, evolutionary computation has been shown to be of benefit to reinforcement learning in some limited scenarios. Many studies have shown that the performance of evolutionary computation algorithms is influenced by the structure of the fitness landscapes of the problem being optimised. In this paper we investigate the global structure of the policy search spaces of simple reinforcement learning problems. The aim is to highlight structural characteristics that could influence the performance of evolutionary algorithms in a reinforcement learning context. Results indicate that the problems we investigated are characterised by enormous plateaus that form unimodal structures, resulting in a kind of needle-in-a-haystack global structure.