Dynamic Regret Convergence Analysis and an Adaptive Regularization Algorithm for On-Policy Robot Imitation Learning
Jonathan N. Lee, Michael Laskey, Ajay Kumar Tanwani, Anil Aswani, Ken Goldberg
- 发表年份
- 2018
- 访问权限
- 开放获取
摘要
On-policy imitation learning algorithms such as DAgger evolve a robot control policy by executing it, measuring performance (loss), obtaining corrective feedback from a supervisor, and generating the next policy. As the loss between iterations can vary unpredictably, a fundamental question is under what conditions this process will eventually achieve a converged policy. If one assumes the underlying trajectory distribution is static (stationary), it is possible to prove convergence for DAgger. However, in more realistic models for robotics, the underlying trajectory distribution is dynamic because it is a function of the policy. Recent results show it is possible to prove convergence of DAgger when a regularity condition on the rate of change of the trajectory distributions is satisfied. In this article, we reframe this result using dynamic regret theory from the field of online optimization and show that dynamic regret can be applied to any on-policy algorithm to analyze its convergence and optimality. These results inspire a new algorithm, Adaptive On-Policy Regularization (AOR), that ensures the conditions for convergence. We present simulation results with cart-pole balancing and locomotion benchmarks that suggest AOR can significantly decrease dynamic regret and chattering as the robot learns. To our knowledge, this the first application of dynamic regret theory to imitation learning.
关键词
相关论文
基于非线性滑模模型预测控制与自适应跟随转向及动静态约束的六轮独立驱动/四轮独立转向无人地面车辆轨迹跟踪控制
Shengyang Lu, Guanpeng Chen, Lijing Zhao 等 5 位作者
Robotics and Autonomous Systems · 2026
仿生水下机器人:材料、设计、控制与应用进展
Dilip Muchhala, Pramod Kumar Maurya, Adarsh Raut 等 6 位作者
Robotics and Autonomous Systems · 2026
刚柔混合连杆人形机器人的建模与控制
Zewen He, Taiki Ishigaki, Ko Yamamoto
Robotics and Autonomous Systems · 2026
人-外骨骼-助行器系统的人工推动自适应协调控制
Xinhao Zhang, Chen Yang, Chaobin Zou 等 7 位作者
Robotics and Autonomous Systems · 2026