Two steps natural actor critic learning for underwater cable tracking
Andrés El-Fakdi, Marc Carreras, Enric Galceran
- Year
- 2010
- Citations
- 7
Abstract
This paper proposes a field application of a high-level Reinforcement Learning (RL) control system for solving the action selection problem of an autonomous robot in a cable tracking task. The underwater vehicle ICTINEU <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">AUV</sup> learns to perform a visual based cable tracking task in a two step learning process. First, a policy is computed by means of simulation where a hydrodynamic model of the vehicle simulates the cable following task. Once the simulated results are accurate enough, in a second step, the learned-in-simulation policy is transferred to the vehicle where the learning procedure continues in a real environment, improving the initial policy. The natural actor-critic (NAC) algorithm has been selected to solve the problem in both steps. This algorithm aims to take advantage of policy gradient and value function techniques for fast convergence. Actor's policy gradient gives convergence guarantees under function approximation and partial observability while critic's value function reduces variance of the estimates update improving the convergence process.
Keywords
Related papers
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002