Path integral learning of multidimensional movement trajectories

João André, Cristina P. Santos, Lino Costa

Published: 2013
Citations: 3
Access: Open access

Abstract

This paper explores the use of path integral methods, in particular several variants of the recent Path Integral Policy Improvement (PI2) algorithm, for learning parametrized policies for multidimensional movement. We rely on Dynamic Movement Primitives (DMPs) to encode discrete and rhythmic trajectories, and apply the PI2-CMA and PIBB methods to learn optimal policy parameters under different cost functions that inherently encode movement objectives. Additionally, we merge these two variants and propose the PIBB-CMA algorithm, comparing all of them with the vanilla version of PI2. From the results obtained, we conclude that PIBB-CMA surpasses all other methods in terms of convergence speed and final cost, which motivates its application to more complex robotic problems.
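As a rough illustration of the kind of update the abstract refers to, the sketch below combines black-box, cost-weighted averaging of sampled parameters with a weighted covariance update. It is not the authors' implementation: the function name `pibb_cma_update`, the eliteness constant `h`, the sample count, and the toy quadratic cost are all assumptions made for this example, and no DMP is involved (the parameter vector simply stands in for policy parameters).

```python
import numpy as np

def pibb_cma_update(mean, cov, cost_fn, n_samples=20, h=10.0, rng=None):
    """One iteration of a black-box, cost-weighted averaging update (sketch only)."""
    rng = np.random.default_rng() if rng is None else rng
    # Explore: sample perturbed parameter vectors around the current mean.
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    costs = np.array([cost_fn(theta) for theta in samples])
    # Map costs to weights so that low cost gets high weight.
    spread = costs.max() - costs.min()
    if spread < 1e-12:
        weights = np.full(n_samples, 1.0 / n_samples)
    else:
        weights = np.exp(-h * (costs - costs.min()) / spread)
        weights /= weights.sum()
    # Update the mean by weighted averaging of the samples.
    new_mean = weights @ samples
    # Adapt the covariance: weighted outer products about the OLD mean.
    diffs = samples - mean
    new_cov = (weights[:, None] * diffs).T @ diffs
    # Small jitter (an assumption) keeps the matrix usable for sampling.
    new_cov += 1e-8 * np.eye(len(mean))
    return new_mean, new_cov

# Toy usage: a hypothetical quadratic cost standing in for a movement objective.
target = np.array([1.0, -2.0, 0.5])
cost = lambda theta: float(np.sum((theta - target) ** 2))

rng = np.random.default_rng(0)
mean, cov = np.zeros(3), np.eye(3)
initial_cost = cost(mean)
for _ in range(40):
    mean, cov = pibb_cma_update(mean, cov, cost, rng=rng)
final_cost = cost(mean)
```

Keeping the covariance fixed instead of replacing it each iteration would give a constant-exploration variant of the same loop; which of the paper's methods each choice corresponds to should be checked against the paper itself.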

Keywords

Merge (version control); Computer science; Convergence (economics); Path (computing); ENCODE; Path integral formulation; Artificial intelligence; Movement (music); Mathematical optimization; Mathematics
