Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback

Guangliang Li, Bo He, Randy Gómez, Keisuke Nakamura

Year
2018
Citations
19

Abstract

Programming robots to perform tasks in the real world is difficult because of the world's richness and uncertainty. For robots and agents to be more useful, they must be able to learn quickly from ordinary people via natural interactions. In this paper, we investigate how an agent can learn from demonstration and from positive and negative evaluative feedback provided by a human teacher. Specifically, we propose a model-based method, IRL-TAMER, that combines learning from demonstration via inverse reinforcement learning (IRL) with learning from human reward via the TAMER framework. We tested our method in the Grid World domain and compared it with the TAMER framework using different discount factors on human reward. Our results suggest that although an agent learning via IRL can learn a useful value function indicating which states are good based on the demonstration, it cannot obtain an effective policy for navigating to the goal state from a single demonstration. However, learning from demonstration can reduce the amount of human reward needed to obtain an optimal policy, especially the amount of negative feedback. That is, learning from demonstration can give the agent a jump-start in learning from human reward and reduce the number of mistakes (incorrect actions). Furthermore, our results show that learning from demonstration is useful for the agent's learning from human reward only when the discount factor is small, i.e., when learning from myopic human reward.
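The interplay the abstract describes (a prior taken from a demonstration, human evaluative feedback, and a discount factor on human reward) can be illustrated with a small sketch. This is not the paper's implementation: the 3x3 grid, the function names, the learning rate, and in particular `seed_from_demonstration`, which only gives demonstrated state-action pairs a small bonus as a crude stand-in for IRL, are all assumptions made for illustration.

```python
# Illustrative sketch only; every name and constant here is hypothetical.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SIZE = 3
GOAL = (2, 2)
STATES = [(r, c) for r in range(SIZE) for c in range(SIZE)]


def step(state, action):
    """Deterministic grid move; walking into a wall leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    nr, nc = state[0] + dr, state[1] + dc
    return (nr, nc) if 0 <= nr < SIZE and 0 <= nc < SIZE else state


def seed_from_demonstration(demo, bonus=0.5):
    """Assumed stand-in for IRL: demonstrated (state, action) pairs get a prior reward."""
    return {(s, a): bonus for s, a in demo}


def update_from_feedback(H, state, action, signal, lr=0.5):
    """Move the predicted human reward H(s, a) toward the feedback signal (+1 or -1)."""
    old = H.get((state, action), 0.0)
    H[(state, action)] = old + lr * (signal - old)


def plan(H, gamma, sweeps=50):
    """Value iteration over predicted human reward, discounted by gamma."""
    V = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s == GOAL:
                continue  # the goal is terminal, so its value stays 0
            V[s] = max(H.get((s, a), 0.0) + gamma * V[step(s, a)] for a in ACTIONS)
    return V


def greedy_action(H, V, state, gamma):
    """Pick the action maximising H(s, a) + gamma * V(s'); gamma = 0 is fully myopic."""
    return max(ACTIONS, key=lambda a: H.get((state, a), 0.0) + gamma * V[step(state, a)])
```

With `gamma = 0` the agent acts only on the immediate predicted human reward, which corresponds to the myopic setting the abstract refers to; larger values of `gamma` make it plan over future predicted reward. Negative feedback passed to `update_from_feedback` lowers the predicted reward for an action, so the agent stops choosing it in that state.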

Keywords

Reinforcement learning, Computer science, Artificial intelligence, Function (biology), Robot, Domain (mathematical analysis), Active learning (machine learning), Machine learning, Human–computer interaction, Mathematics
