HRI

The Online Coupon-Collector Problem and Its Application to Lifelong Reinforcement Learning

Emma Brunskill, Lihong Li

Publication year: 2015
Access: Open access

Abstract

Transferring knowledge across a sequence of related tasks is an important challenge in reinforcement learning (RL). Despite much encouraging empirical evidence, there has been little theoretical analysis. In this paper, we study a class of lifelong RL problems: the agent solves a sequence of tasks modeled as finite Markov decision processes (MDPs), each of which is from a finite set of MDPs with the same state/action sets and different transition/reward functions. Motivated by the need for cross-task exploration in lifelong learning, we formulate a novel online coupon-collector problem and give an optimal algorithm. This allows us to develop a new lifelong RL algorithm, whose overall sample complexity in a sequence of tasks is much smaller than single-task learning, even if the sequence of tasks is generated by an adversary. Benefits of the algorithm are demonstrated in simulated problems, including a recently introduced human-robot interaction problem.

Keywords

cs.LG, cs.AI
