Inverse Reinforcement Learning Under Noisy Observations
Shervin Shahryari, Prashant Doshi
- 发表年份
- 2017
- 访问权限
- 开放获取
摘要
We consider the problem of performing inverse reinforcement learning when the trajectory of the expert is not perfectly observed by the learner. Instead, a noisy continuous-time observation of the trajectory is provided to the learner. This problem exhibits wide-ranging applications and the specific application we consider here is the scenario in which the learner seeks to penetrate a perimeter patrolled by a robot. The learner's field of view is limited due to which it cannot observe the patroller's complete trajectory. Instead, we allow the learner to listen to the expert's movement sound, which it can also use to estimate the expert's state and action using an observation model. We treat the expert's state and action as hidden data and present an algorithm based on expectation maximization and maximum entropy principle to solve the non-linear, non-convex problem. Related work considers discrete-time observations and an observation model that does not include actions. In contrast, our technique takes expectations over both state and action of the expert, enabling learning even in the presence of extreme noise and broader applications.
关键词
相关论文
面向学习与规划的并行可微可达性:具有认证神经动力学与控制器的系统
Keyi Shen, Glen Chou
2026
人工智能增强的智能焊接岛:基础模型革新制造业
Xiwei Wu, Wei Wu, Qiqi Chen 等 9 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于深度强化学习和动态图神经网络的多任务机器人调度代理
Hedi Boukamcha, Anas Neumann, Monia Rekik 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于微调与AAS增强检索的LLM驱动自动化DFA评估
Jiaxin Liu, Xiaofeng Zhou, Suyang Yu 等 8 位作者
Robotics and Computer-Integrated Manufacturing · 2026