Poster Abstract: Learning from Demonstrations with Temporal Logics

Aniruddh G. Puranic, Jyotirmoy V. Deshmukh, Stefanos Nikolaidis

Year of publication: 2022
Citations: 3
Access: Open access

Abstract

Learning-from-demonstrations (LfD) is a popular paradigm for obtaining effective robot control policies for complex tasks via reinforcement learning, without the need to explicitly design reward functions. However, it is susceptible to imperfections in the demonstrations and raises concerns about the safety and interpretability of the learned control policies. To address these issues, we propose to use Signal Temporal Logic (STL) to express high-level robotic tasks and to use its quantitative semantics to evaluate and rank the quality of demonstrations. Temporal logic-based specifications allow us to create non-Markovian rewards and can also define interesting causal dependencies between tasks, such as sequential task specifications. We present our completed work, which proposed the LfD-STL framework; it learns from demonstrations, even suboptimal or imperfect ones, together with STL specifications to infer rewards for reinforcement learning tasks. We have validated our approach in various experimental setups, showing that our method outperforms prior LfD methods. We then discuss future directions for tackling the problem of explainability and interpretability in such learning-based systems.
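
To illustrate what ranking demonstrations by STL quantitative semantics can look like, the following is a minimal sketch and not the LfD-STL implementation. It assumes the standard robustness semantics for "always" (worst-case margin over the trace), "eventually" (best-case margin), and conjunction (minimum). The signal names `obstacle_dist` and `goal_dist`, the thresholds 0.5 and 0.1, and the three demonstrations are hypothetical values chosen for illustration.

```python
from typing import Dict, List, Sequence

# A demonstration is a set of named, discretely sampled signals (hypothetical layout).
Demo = Dict[str, Sequence[float]]


def rob_always_gt(signal: Sequence[float], threshold: float) -> float:
    """Robustness of G(x > threshold): the worst-case margin over the trace."""
    return min(x - threshold for x in signal)


def rob_eventually_lt(signal: Sequence[float], threshold: float) -> float:
    """Robustness of F(x < threshold): the best-case margin over the trace."""
    return max(threshold - x for x in signal)


def robustness(demo: Demo) -> float:
    """Robustness of G(obstacle_dist > 0.5) AND F(goal_dist < 0.1).

    Conjunction takes the minimum of the sub-formula robustness values.
    A positive result means the trace satisfies the specification.
    """
    safe = rob_always_gt(demo["obstacle_dist"], 0.5)
    reach = rob_eventually_lt(demo["goal_dist"], 0.1)
    return min(safe, reach)


def rank_demonstrations(demos: List[Demo]) -> List[int]:
    """Return demonstration indices sorted from most to least robust."""
    return sorted(range(len(demos)), key=lambda i: robustness(demos[i]), reverse=True)


# Hypothetical demonstrations, three time steps each.
demos: List[Demo] = [
    # Stays safe but never reaches the goal.
    {"obstacle_dist": [1.0, 1.0, 1.0], "goal_dist": [2.0, 1.5, 1.0]},
    # Stays safe and reaches the goal.
    {"obstacle_dist": [1.0, 0.9, 0.8], "goal_dist": [2.0, 1.0, 0.0]},
    # Reaches the goal but passes too close to the obstacle.
    {"obstacle_dist": [1.0, 0.4, 0.9], "goal_dist": [2.0, 1.0, 0.05]},
]

scores = [robustness(d) for d in demos]
ranking = rank_demonstrations(demos)
```

In this sketch only the second demonstration receives a positive robustness score, so it is ranked first; the other two violate the specification by different margins and are ordered accordingly. How such scores are turned into rewards for reinforcement learning is specific to the framework and is not shown here.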

Keywords

Interpretability, Computer science, Reinforcement learning, Artificial intelligence, Semantics (computer science), Task (project management), Temporal logic, Machine learning, Rank (graph theory), Control (management)
