LEARNING

Reinforcement Learning via Recurrent Convolutional Neural Networks

Tanmay Shankar, Santosha K. Dwivedy, Prithwijit Guha

Publication year
2016
Citations
6

Abstract

Deep Reinforcement Learning has enabled the learning of policies for complex tasks in partially observable environments, without explicitly learning the underlying model of the tasks. While such model-free methods achieve considerable performance, they often ignore the structure of the task. We present a more natural representation of the solutions to Reinforcement Learning (RL) problems, within three Recurrent Convolutional Neural Network (RCNN) architectures, to better exploit this inherent structure. The forward passes of the three RCNNs execute an efficient Value Iteration, propagate beliefs of state in partially observable environments, and choose optimal actions, respectively. Applying back-propagation to these RCNNs allows the system to explicitly learn the Transition Model and Reward Function associated with the underlying MDP, serving as an elegant alternative to classical model-based RL. We evaluate the proposed algorithms in simulation, considering a robot planning problem. We demonstrate the capability of our framework to reduce the cost of re-planning, learn accurate MDP models, and finally re-plan with learned models to achieve near-optimal policies.
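
The idea that a recurrent convolutional forward pass can carry out Value Iteration can be illustrated with a minimal sketch. This is not the authors' architecture. It assumes a small 2D grid world, a translation-invariant transition model expressed as one 3x3 kernel per action, zero padding at the borders (probability mass that leaves the grid contributes no value), and a state-only reward map. The names `correlate2d`, `value_iteration`, `kernels`, and `reward` are hypothetical and chosen only for illustration.

```python
def correlate2d(grid, kernel):
    """Zero-padded 2D cross-correlation of a grid with a 3x3 kernel.

    kernel[di + 1][dj + 1] is read as the probability of moving from
    cell (i, j) to cell (i + di, j + dj). This is an assumed convention.
    """
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        total += kernel[di + 1][dj + 1] * grid[ni][nj]
            out[i][j] = total
    return out


def value_iteration(reward, kernels, gamma=0.9, iterations=50):
    """Repeat one 'convolve, max over actions, add reward' step."""
    h, w = len(reward), len(reward[0])
    value = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        # Convolution stage: expected next-state value, one map per action.
        expected = [correlate2d(value, k) for k in kernels]
        # Max over the action channel, then add the reward map.
        value = [
            [reward[i][j] + gamma * max(e[i][j] for e in expected)
             for j in range(w)]
            for i in range(h)
        ]
    return value


# Assumed toy problem: 5x5 grid, four deterministic moves, one rewarded cell.
UP = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
DOWN = [[0, 0, 0], [0, 0, 0], [0, 1, 0]]
LEFT = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
RIGHT = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
kernels = [UP, DOWN, LEFT, RIGHT]

reward = [[0.0] * 5 for _ in range(5)]
reward[2][2] = 1.0

values = value_iteration(reward, kernels)
```

In this sketch the recurrence is the loop that feeds the value map back into the same convolution, and the maximum over the per-action maps plays the role of a pooling step across action channels. Under these assumptions the computed values are highest at the rewarded cell and fall off with distance from it.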

Keywords

Reinforcement learning, Computer science, Bellman equation, Exploit, Artificial intelligence, Function (biology), Representation (politics), Task (project management), Observable, Convolutional neural network
