Two steps natural actor critic learning for underwater cable tracking

Andrés El-Fakdi, Marc Carreras, Enric Galceran

Publication year
2010
Citations
7

Abstract

This paper proposes a field application of a high-level Reinforcement Learning (RL) control system for solving the action selection problem of an autonomous robot in a cable tracking task. The underwater vehicle ICTINEU AUV learns to perform a visual-based cable tracking task in a two-step learning process. First, a policy is computed in simulation, where a hydrodynamic model of the vehicle is used to simulate the cable-following task. Once the simulated results are accurate enough, the policy learned in simulation is transferred to the vehicle in a second step, where learning continues in a real environment and improves the initial policy. The natural actor-critic (NAC) algorithm has been selected to solve the problem in both steps. This algorithm aims to combine the advantages of policy gradient and value function techniques for fast convergence: the actor's policy gradient gives convergence guarantees under function approximation and partial observability, while the critic's value function reduces the variance of the estimate updates, which improves convergence.
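The two-step idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-dimensional "cable offset" dynamics stand in for the hydrodynamic model, the `gain` and `noise` values are invented, the policy is a single-parameter Gaussian, and the update is a simple episodic natural actor-critic variant (a least-squares fit of episode returns against summed log-policy gradients), which may differ from the NAC formulation the authors used. The names `rollout`, `enac_step`, `train`, and `evaluate` are hypothetical.

```python
import numpy as np

def rollout(theta, gain, noise, rng, sigma=0.2, steps=20, gamma=0.95, explore=True):
    """Run one episode; return (discounted return, discounted sum of grad log pi)."""
    x = rng.uniform(-1.0, 1.0)  # toy state: normalised lateral offset of the cable
    ret, psi = 0.0, 0.0
    for t in range(steps):
        mean = theta * x
        a = mean + (sigma * rng.standard_normal() if explore else 0.0)
        # Gradient of log N(a; theta * x, sigma^2) with respect to theta.
        psi += gamma**t * (a - mean) * x / sigma**2
        # Assumed linear dynamics; the offset is clipped to a bounded "image frame".
        x = float(np.clip(x + gain * a + noise * rng.standard_normal(), -2.0, 2.0))
        ret += gamma**t * (-(x**2))  # reward penalises distance from the cable
    return ret, psi

def enac_step(theta, gain, noise, rng, episodes=20, alpha=0.1):
    """One episodic natural actor-critic update via least squares."""
    data = [rollout(theta, gain, noise, rng) for _ in range(episodes)]
    returns = np.array([d[0] for d in data])
    psis = np.array([d[1] for d in data])
    # Fit returns ~ psi * w + baseline; w is the natural-gradient estimate.
    A = np.column_stack([psis, np.ones(episodes)])
    (w, _baseline), *_ = np.linalg.lstsq(A, returns, rcond=None)
    return theta + alpha * w

def train(theta, gain, noise, iterations, rng):
    for _ in range(iterations):
        theta = enac_step(theta, gain, noise, rng)
    return theta

def evaluate(theta, gain, noise, episodes=200, seed=123):
    """Average return of the mean policy (no exploration) on a fixed seed."""
    rng = np.random.default_rng(seed)
    return float(np.mean([rollout(theta, gain, noise, rng, explore=False)[0]
                          for _ in range(episodes)]))

rng = np.random.default_rng(0)
# Step 1: learn in the (assumed) simulation model.
theta_sim = train(0.0, gain=0.5, noise=0.02, iterations=60, rng=rng)
# Step 2: continue learning on the "real" system, modelled here as mismatched dynamics.
theta_real = train(theta_sim, gain=0.35, noise=0.05, iterations=10, rng=rng)

print("theta after simulation:", theta_sim)
print("theta after transfer:  ", theta_real)
print("return on 'real' system, untrained:", evaluate(0.0, 0.35, 0.05))
print("return on 'real' system, two-step: ", evaluate(theta_real, 0.35, 0.05))
```

The second call to `train` starts from `theta_sim` rather than from zero, which is the point of the two-step scheme: the costly exploration happens in simulation, and only a short refinement phase runs on the mismatched system.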

Keywords

Reinforcement learning, Observability, Convergence (economics), Computer science, Bellman equation, Task (project management), Function (biology), Process (computing), Underwater, Tracking (education)
