
Inverse Reinforcement Learning Under Noisy Observations

Shervin Shahryari, Prashant Doshi

Publication year
2017
Access
Open access

Abstract

We consider the problem of performing inverse reinforcement learning when the learner cannot perfectly observe the expert's trajectory. Instead, the learner receives a noisy continuous-time observation of the trajectory. The problem has wide-ranging applications; the specific application we consider is a scenario in which the learner seeks to penetrate a perimeter patrolled by a robot. The learner's field of view is limited, so it cannot observe the patroller's complete trajectory. Instead, we allow the learner to listen to the sound of the expert's movement, which it can also use to estimate the expert's state and action through an observation model. We treat the expert's state and action as hidden data and present an algorithm based on expectation maximization and the maximum entropy principle to solve the resulting non-linear, non-convex problem. Related work considers discrete-time observations and an observation model that does not include actions. In contrast, our technique takes expectations over both the state and the action of the expert, which enables learning even in the presence of extreme noise and supports broader applications.
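To make the idea of treating the expert's state and action as hidden data more concrete, the following is a minimal sketch of a single expectation step, not the authors' algorithm. It assumes a discrete toy problem, a hypothetical observation model `likelihood` giving P(o | s, a) for one heard sound, and a hypothetical `prior` over state-action pairs; all names and numbers are illustrative. The posterior over (state, action) is used to compute expected feature counts, the quantity a maximum entropy method would typically match in its maximization step.

```python
def posterior_over_state_action(prior, likelihood):
    """One E-step update: P(s, a | o) is proportional to P(o | s, a) * P(s, a)."""
    unnorm = {sa: prior[sa] * likelihood[sa] for sa in prior}
    z = sum(unnorm.values())
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {sa: w / z for sa, w in unnorm.items()}


def expected_feature_counts(posteriors, features):
    """Sum over time steps of E[phi(s, a)] under each step's posterior."""
    dim = len(next(iter(features.values())))
    counts = [0.0] * dim
    for post in posteriors:
        for sa, p in post.items():
            for k, f in enumerate(features[sa]):
                counts[k] += p * f
    return counts


# Toy setup (all values are made up): two patrol cells, two actions.
prior = {("cell0", "forward"): 0.25, ("cell0", "turn"): 0.25,
         ("cell1", "forward"): 0.25, ("cell1", "turn"): 0.25}
# Hypothetical sound-based observation model P(o | s, a) for one observation.
likelihood = {("cell0", "forward"): 0.6, ("cell0", "turn"): 0.1,
              ("cell1", "forward"): 0.2, ("cell1", "turn"): 0.1}
# Hypothetical features: [moved forward, turned].
features = {("cell0", "forward"): [1.0, 0.0], ("cell0", "turn"): [0.0, 1.0],
            ("cell1", "forward"): [1.0, 0.0], ("cell1", "turn"): [0.0, 1.0]}

post = posterior_over_state_action(prior, likelihood)
counts = expected_feature_counts([post], features)
```

Because the expectation runs over both state and action, a noisy sound that only weakly identifies the cell still contributes usable information about which action was likely taken. A full method would also need a transition model and an iterated maximization step, neither of which is specified in the abstract.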

Keywords

cs.RO, cs.AI, cs.LG
