A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans

Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, Matthew E. Taylor

Year of publication: 2016
Citations: 33

Abstract

As robots become pervasive in human environments, it is important to enable users to effectively convey new skills without programming. Most existing work on Interactive Reinforcement Learning focuses on interpreting and incorporating non-expert human feedback to speed up learning; we aim to design a better representation of the learning agent that is able to elicit more natural and effective communication between the human trainer and the learner, while treating human feedback as discrete communication that depends probabilistically on the trainer's target policy. This work entails a user study where participants train a virtual agent to accomplish tasks by giving reward and/or punishment in a variety of simulated environments. We present results from 60 participants to show how a learner can ground natural language commands and adapt its action execution speed to learn more efficiently from human trainers. The agent's action execution speed can be successfully modulated to encourage more explicit feedback from a human trainer in areas of the state space where there is high uncertainty. Our results show that our novel adaptive speed agent dominates different fixed speed agents on several measures of performance. Additionally, we investigate the impact of instructions on user performance and user preference in training conditions.
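The idea of tying action execution speed to uncertainty can be illustrated with a minimal sketch. Nothing below is taken from the paper: the abstract does not say how uncertainty is measured or which speeds were used, so the normalized-entropy measure, the choice to slow down (rather than speed up) when uncertain, the function names `policy_entropy` and `action_delay`, and the delay bounds are all assumptions made for illustration.

```python
import math


def policy_entropy(action_probs):
    """Shannon entropy (in nats) of a probability distribution over actions."""
    return -sum(p * math.log(p) for p in action_probs if p > 0)


def action_delay(action_probs, min_delay=0.5, max_delay=2.0):
    """Hypothetical mapping from uncertainty to a pause (seconds) before acting.

    Assumption: the agent waits longer in states where its action
    distribution is closer to uniform, giving the trainer more time
    to respond. The delay bounds are illustrative, not from the paper.
    """
    n = len(action_probs)
    if n < 2:
        return min_delay  # a single action carries no uncertainty
    # Normalize entropy to [0, 1] by dividing by its maximum, log(n).
    normalized = policy_entropy(action_probs) / math.log(n)
    return min_delay + (max_delay - min_delay) * normalized
```

Under these assumptions, a state where the agent is certain of its action gets the shortest pause, and a state where all actions look equally likely gets the longest.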

Keywords

Computer science, Reinforcement learning, Trainer, Human–computer interaction, Task (project management), Action (physics), Artificial intelligence, Robot