LEARNING

A Dynamical System-based Approach to Modeling Stable Robot Control Policies via Imitation Learning

Khansari Zadeh, Seyed Mohammad

Publication year
2012
Citations
18
Access
Open access

Abstract

Despite tremendous advances in robotics, we are still amazed by the proficiency with which humans perform movements. Even new waves of robotic systems still rely heavily on hardcoded motions with a limited ability to react autonomously and robustly to a dynamically changing environment. This thesis focuses on providing possible mechanisms to push the level of adaptivity, reactivity, and robustness of robotic systems closer to that of human movements. Specifically, it aims at developing these mechanisms for a subclass of robot motions called “reaching movements”, i.e. movements in space that stop at a given target (also referred to as episodic motions, discrete motions, or point-to-point motions). These reaching movements can then be used as building blocks to form more advanced robot tasks. To achieve the level of proficiency described above, this thesis seeks to derive control policies that: 1) resemble human motions, 2) guarantee the accomplishment of the task (if the target is reachable), and 3) can instantly adapt to changes in dynamic environments. To avoid manually hardcoding robot motions, this thesis exploits the power of machine learning techniques and takes an Imitation Learning (IL) approach to build a generic model of robot movements from a few examples provided by an expert. To achieve the required level of robustness and reactivity, the perspective adopted in this thesis is that a reaching movement can be described with a nonlinear Dynamical System (DS). When building an estimate of a DS from demonstrations, two key problems need to be addressed: the problem of generating motions that best resemble the demonstrations (the “how-to-imitate” problem), and most importantly, the problem of ensuring the accomplishment of the task, i.e. reaching the target (the “stability” problem).
Although there are numerous well-established approaches in robotics that could answer each of these problems separately, tackling both problems simultaneously is challenging and has not yet been extensively studied. This thesis first tackles this joint problem by introducing an iterative method to build an estimate of autonomous nonlinear DS that are formulated as a mixture of Gaussian functions. This method minimizes the number of Gaussian functions required for achieving both local asymptotic stability at the target and accuracy in following demonstrations. We then extend this formulation and provide sufficient conditions to ensure global asymptotic stability of autonomous DS at the target. In this approach, an estimate of the underlying DS is built by solving a constrained optimization problem, where the metric of accuracy and the stability conditions are formulated as the optimization objective and constraints, respectively. In addition to ensuring convergence of all motions to the target within the local or global stability regions, these approaches offer an inherent adaptability and robustness to changes in dynamic environments. This thesis further extends the previous approaches and ensures global asymptotic stability of DS-based motions at the target independently of the choice of the regression technique. Therefore, it offers the possibility to choose the most appropriate regression technique based on the requirements of the task at hand without compromising DS stability. This approach also provides the possibility of online learning and of combining two or more regression methods to model more advanced robot tasks, and can be applied to estimate motions that are represented with both autonomous and non-autonomous DS.
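As an illustration of the idea of a Gaussian-weighted DS with a stability guarantee, the following is a minimal sketch and not the thesis's algorithm. It assumes a velocity field of the form x_dot = sum_k h_k(x) A_k (x - target), where the h_k are normalized Gaussian weights, and it assumes one commonly used sufficient condition for global asymptotic stability: the symmetric part of every A_k is negative definite. The abstract does not state the exact conditions, so this choice is an assumption. All names (`gaussian_weights`, `velocity`, `satisfies_stability_condition`, `rollout`) and all parameter values are hypothetical and hand-picked rather than learned from demonstrations.

```python
import numpy as np

def gaussian_weights(x, centers, widths):
    """Normalized Gaussian activations h_k(x); they are non-negative and sum to 1."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    w = np.exp(-0.5 * d2 / widths ** 2)
    s = w.sum()
    # Fall back to uniform weights if every activation underflows to zero.
    return w / s if s > 0 else np.full(len(w), 1.0 / len(w))

def velocity(x, target, A, centers, widths):
    """Assumed model: x_dot = sum_k h_k(x) * A_k @ (x - target)."""
    h = gaussian_weights(x, centers, widths)
    return np.einsum("k,kij,j->i", h, A, x - target)

def satisfies_stability_condition(A):
    """Assumed sufficient condition: A_k + A_k^T is negative definite for every k."""
    return all(np.linalg.eigvalsh(Ak + Ak.T).max() < 0 for Ak in A)

def rollout(x0, target, A, centers, widths, dt=0.01, steps=3000):
    """Forward-Euler integration of the DS from the start point x0."""
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(steps):
        x = x + dt * velocity(x, target, A, centers, widths)
        path.append(x.copy())
    return np.array(path)

# Hand-picked illustrative parameters (not learned from data).
A = np.array([[[-1.0, 2.0], [-2.0, -1.0]],
              [[-2.0, 0.0], [1.0, -0.5]]])
centers = np.array([[1.0, 1.0], [-1.0, 0.0]])
widths = np.array([1.0, 1.0])
target = np.array([0.5, -0.5])

stable = satisfies_stability_condition(A)
path = rollout([3.0, -2.0], target, A, centers, widths)
dist = np.linalg.norm(path - target, axis=1)
```

Under this assumed condition, V(x) = ||x - target||^2 acts as a Lyapunov function, because the weights are a convex combination and each term contributes a negative rate of change. In the rollout, the distance to the target shrinks at every step and the motion converges to the target from the chosen start point. In the approach described above, the matrices would instead be obtained by optimizing an accuracy metric subject to the stability constraints.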
Additionally, this thesis proposes a reformulation of robot motion modeling that allows encoding a considerably wider set of tasks, ranging from reaching movements to agile robot movements that require hitting a given target with a specific speed and direction. This approach is validated on the challenging task of playing minigolf. Finally, the last part o

Keywords

Robustness (evolution), Adaptability, Obstacle, Computer science, Imitation, Artificial intelligence, Political science, Psychology, Management
