首页 /研究 /Interpretable Policy Specification and Synthesis through Natural Language and RL

LEARNING

Interpretable Policy Specification and Synthesis through Natural Language and RL

Pradyumna Tambwekar, Andrew Silva, Nakul Gopalan, Matthew Gombolay

发表年份: 2021
引用次数: 4

摘要

Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: These models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with a 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.

关键词

Computer scienceReinforcement learningInterpretabilityNatural languageArtificial intelligenceLeverage (statistics)Machine learningInitializationUSableNatural language understanding

Interpretable Policy Specification and Synthesis through Natural Language and RL

摘要

关键词

相关论文

Statistical Learning Theory

Artificial intelligence: a modern approach

Applied Nonlinear Control

A new optimizer using particle swarm theory