Maximum Causal Entropy Specification Inference from Demonstrations
Marcell Vazquez-Chanlatte, Sanjit A. Seshia
- Year
- 2019
- Access
- Open access
Abstract
In many settings (e.g., robotics) demonstrations provide a natural way to specify tasks; however, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the tasks, such as rewards or policies, can be safely composed and/or do not explicitly capture history dependencies. Motivated by this deficit, recent works have proposed learning Boolean task specifications, a class of Boolean non-Markovian rewards which admit well-defined composition and explicitly handle historical dependencies. This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posteriori probability of a specification given a multi-set of demonstrations. The key algorithmic insight is to leverage the extensive literature and tooling on reduced ordered binary decision diagrams to efficiently encode a time unrolled Markov Decision Process. This enables transforming a naive exponential time algorithm into a polynomial time algorithm.
Keywords
Related papers
The Organization of Behavior
D. O. Hebb
2005
Fractional Brownian Motions, Fractional Noises and Applications
Benoît B. Mandelbrot, John W. Van Ness
1968
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions
Laith Alzubaidi, Jinglan Zhang, Amjad J. Humaidi +7 more
2021
A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar +7 more
2018