Reinforcement Learning with Dynamic Covering of State-Action Space: Partitioning Q-Learning
Rémi Munos, Jocelyn Patinel
- Publication year
- 1994
- Citation count
- 12
Abstract
This paper presents a reinforcement learning algorithm, "Partitioning Q-learning", designed to generate adaptive behavior in a reactive system with local perception operating in a complex and changing environment. The algorithm combines two dynamics: learning dynamics, based on the Q-learning and Bucket Brigade algorithms, and structural dynamics, which model the acquisition of expert knowledge. Together, these dynamics aim to counter the combinatorial explosion in the number of qualities to be estimated by dividing the state-action space into a minimal number of homogeneous regions, using the formalism of Classifier Systems. The algorithm is applied to the simulation of a reactive robot that tries to cut weeds while avoiding plants in a cultivated field. / The article presents a reinforcement learning algorithm, "Q-learning", and draws the link between Q-learning and classifier systems. The algorithm is used to simulate a reactive robot in a cultivated field.
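The idea of estimating one quality per region rather than per individual state can be illustrated with a minimal sketch. This is not the paper's Partitioning Q-learning: it uses the standard tabular Q-learning update over a fixed, hand-chosen partition, whereas the abstract describes regions that are adapted dynamically. The names `region_of` and `q_update`, the grid-cell partition, the action names, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict


def region_of(state, cell_size=2):
    """Hypothetical fixed partition: map an integer grid position to a square cell."""
    x, y = state
    return (x // cell_size, y // cell_size)


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Standard Q-learning update, applied to regions instead of raw states."""
    region = region_of(state)
    next_region = region_of(next_state)
    # Best estimated quality reachable from the region the agent lands in.
    best_next = max(q[(next_region, a)] for a in actions)
    # Move the current estimate toward the one-step bootstrapped target.
    q[(region, action)] += alpha * (reward + gamma * best_next - q[(region, action)])
    return q[(region, action)]


# Illustrative action set for a weed-cutting robot (assumed, not from the paper).
actions = ["left", "right", "cut"]
q = defaultdict(float)  # every quality starts at 0.0

# One rewarded transition: cutting at (3, 1) and ending up at (4, 1).
value = q_update(q, (3, 1), "cut", 1.0, (4, 1), actions)
```

Because all states inside a cell share one entry, the table grows with the number of regions rather than the number of states, which is the saving the abstract refers to.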