Reinforcement Learning with Dynamic Covering of State-Action Space: Partitioning Q-Learning
Rémi Munos, Jocelyn Patinel
- Year: 1994
- Citations: 12
Abstract
This paper presents a reinforcement learning algorithm, "Partitioning Q-learning", designed to generate adaptive behavior in a reactive system with local perception operating in a complex and changing environment. The algorithm combines two dynamics: a learning dynamics based on the Q-learning and Bucket Brigade algorithms, and a structural dynamics that models the acquisition of expert knowledge. Together, these two dynamics are intended to address the combinatorial explosion in the number of qualities to be estimated by dividing the state-action space into a minimal number of homogeneous regions, using the formalism of Classifier Systems. The algorithm is applied to the simulation of a reactive robot that tries to cut weeds and avoid plants in a cultivated field.

Summary (translated from French): The article presents a reinforcement learning algorithm, "Q-learning", and establishes the link between Q-learning and classifier systems. The algorithm is used to simulate a reactive robot in a cultivated field.
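To make the idea of estimating one quality per region concrete, the following is a minimal sketch and not the paper's algorithm. It applies the standard Q-learning update to a fixed, hand-written partition of the state-action space, with regions written as classifier-style ternary conditions (`0`, `1`, and `#` as a wildcard). The bit encoding, the names `matches`, `region_of`, and `q_update`, and the values of `alpha` and `gamma` are all assumptions made for illustration. The Bucket Brigade component and the structural dynamics that would create or split regions are omitted.

```python
from collections import defaultdict

def matches(condition: str, bits: str) -> bool:
    """Classifier-style match: '#' is a wildcard, '0'/'1' must match exactly."""
    return len(condition) == len(bits) and all(
        c in ("#", b) for c, b in zip(condition, bits)
    )

def region_of(regions, bits):
    """Return the first region whose condition covers the state-action bit string."""
    for cond in regions:
        if matches(cond, bits):
            return cond
    raise KeyError(bits)

def q_update(q, regions, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One standard Q-learning step in which each region shares one quality estimate."""
    key = region_of(regions, state + action)
    # Greedy value of the next state, looked up through the same partition.
    best_next = max(q[region_of(regions, next_state + a)] for a in actions)
    q[key] += alpha * (reward + gamma * best_next - q[key])
    return q[key]

# Hypothetical partition over 2 state bits followed by 1 action bit.
regions = ["1#0", "1#1", "0##"]
q = defaultdict(float)
value = q_update(q, regions, state="10", action="0", reward=1.0,
                 next_state="01", actions=["0", "1"])
```

With three regions in place of eight individual state-action pairs, fewer qualities need to be estimated. The cost is that every pair inside a region is treated as equivalent, which is why the choice of homogeneous regions matters.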