
Poster Abstract: Learning from Demonstrations with Temporal Logics

Aniruddh G. Puranic, Jyotirmoy V. Deshmukh, Stefanos Nikolaidis

Publication year
2022
Citations
3
Access
Open access

Abstract

Learning-from-demonstrations (LfD) is a popular paradigm for obtaining effective robot control policies for complex tasks via reinforcement learning, without the need to explicitly design reward functions. However, it is susceptible to imperfections in the demonstrations and raises concerns about the safety and interpretability of the learned control policies. To address these issues, we propose using Signal Temporal Logic (STL) to express high-level robotic tasks and using its quantitative semantics to evaluate and rank the quality of demonstrations. Temporal logic-based specifications allow us to create non-Markovian rewards and can also express causal dependencies between tasks, such as sequential task specifications. We present our completed work, which proposed the LfD-STL framework: it infers rewards for reinforcement learning tasks from STL specifications and demonstrations, even when those demonstrations are suboptimal or imperfect. We have validated our approach in several experimental setups, showing that our method outperforms prior LfD methods. We then discuss future directions for tackling explainability and interpretability in such learning-based systems.
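To make the idea of ranking demonstrations by STL quantitative semantics more concrete, the following is a minimal sketch, not the LfD-STL implementation. It assumes a hypothetical 2D reach-avoid specification, "eventually get within `goal_radius` of the goal AND always stay more than `safe_radius` from the obstacle", and uses the standard robustness semantics in which "always" is a minimum over the trace, "eventually" is a maximum, and conjunction is a minimum. All function names, parameters, and default values are illustrative assumptions; the abstract does not specify how the framework turns rankings into rewards, so that step is omitted.

```python
import math

def always(values):
    """Robustness of G(phi) over a finite trace: the worst-case value."""
    return min(values)

def eventually(values):
    """Robustness of F(phi) over a finite trace: the best-case value."""
    return max(values)

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def robustness(traj, goal, obstacle, goal_radius=0.5, safe_radius=1.0):
    """Robustness of the assumed spec:
    F(dist(x, goal) < goal_radius) AND G(dist(x, obstacle) > safe_radius).
    Positive means satisfied, negative means violated; the magnitude
    indicates how strongly."""
    reach = eventually([goal_radius - dist(p, goal) for p in traj])
    avoid = always([dist(p, obstacle) - safe_radius for p in traj])
    return min(reach, avoid)  # conjunction takes the minimum

def rank_demonstrations(demos, goal, obstacle):
    """Return (robustness, index) pairs sorted from best to worst."""
    scored = [(robustness(d, goal, obstacle), i) for i, d in enumerate(demos)]
    return sorted(scored, reverse=True)
```

Under these assumptions, a demonstration that detours around the obstacle and reaches the goal scores positive, while one that passes through the obstacle or stops short of the goal scores negative, so imperfect demonstrations are ranked lower rather than discarded.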

Keywords

Interpretability, Computer science, Reinforcement learning, Artificial intelligence, Semantics (computer science), Task (project management), Temporal logic, Machine learning, Rank (graph theory), Control (management)
