Home /Research /Interpretable Policy Specification and Synthesis through Natural Language and RL

LEARNING

Interpretable Policy Specification and Synthesis through Natural Language and RL

Pradyumna Tambwekar, Andrew Silva, Nakul Gopalan, Matthew Gombolay

Year: 2021
Citations: 4

Abstract

Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: These models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with a 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.

Keywords

Computer scienceReinforcement learningInterpretabilityNatural languageArtificial intelligenceLeverage (statistics)Machine learningInitializationUSableNatural language understanding

Interpretable Policy Specification and Synthesis through Natural Language and RL

Abstract

Keywords

Related papers

Statistical Learning Theory

Artificial intelligence: a modern approach

Applied Nonlinear Control

A new optimizer using particle swarm theory