An Efficient Unified Approach Using Demonstrations for Inverse Reinforcement Learning

Maxwell Hwang, Wei‐Cheng Jiang, Yu-Jen Chen, Kao‐Shing Hwang, Yi-Chia Tseng

Year of publication: 2019
Citations: 5

Abstract

A reinforcement learning (RL) agent is always equipped with a designed reward function that corrects its policy toward optimal decision making through interaction with an environment. However, it is difficult to design a reward function appropriate for complex RL problems. To address this difficulty, inverse RL (IRL) was introduced as an efficient way to design a reward function based on input from knowledgeable experts. In IRL, experts provide demonstrations so that agents can imitate the demonstrated behavior. Incorrect demonstrations also have merit, however: some of them resemble correct ones, and they give agents clues about which behaviors to avoid. This article introduces an IRL method that uses two types of demonstrations, correct and incorrect, in the function approximation of a reward function. Given the clues from these two opposite kinds of demonstration, agents can iteratively approximate a reward function that guides them to behave like the expert's correct demonstrations and prevents them from repeating the mistakes the expert made. The incorrect demonstrations give agents guidelines for avoiding erroneous motions in the initial phase of learning. The proposed method is validated on two simulated tasks, a labyrinth and a robot soccer game. The simulation results show that the method generates an appropriate reward function and accomplishes apprenticeship learning with an efficient learning time in IRL.
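To make the idea of learning from two opposite kinds of demonstration concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not specify. It assumes one-hot state features, a linear reward, and a single non-iterative update that sets the reward weights to the difference between the discounted feature expectations of the correct and the incorrect demonstrations. The function names, the discount factor of 0.9, and the five-state corridor are all hypothetical choices made for illustration.

```python
import math


def feature_expectations(trajectories, n_states, gamma=0.9):
    """Average discounted visit count of each state over the trajectories.

    Assumes one-hot state features, so the result has one entry per state.
    """
    mu = [0.0] * n_states
    for traj in trajectories:
        for t, state in enumerate(traj):
            mu[state] += gamma ** t
    return [value / len(trajectories) for value in mu]


def reward_from_demos(correct, incorrect, n_states, gamma=0.9):
    """Illustrative linear reward: raise states favored by correct
    demonstrations and lower states favored by incorrect ones."""
    mu_good = feature_expectations(correct, n_states, gamma)
    mu_bad = feature_expectations(incorrect, n_states, gamma)
    weights = [g - b for g, b in zip(mu_good, mu_bad)]
    norm = math.sqrt(sum(w * w for w in weights))
    # Normalize to unit length so the reward scale stays bounded.
    return [w / norm for w in weights] if norm > 0 else weights


# Hypothetical 5-state corridor: the goal is state 4.
correct_demos = [[0, 1, 2, 3, 4]]    # expert walks straight to the goal
incorrect_demos = [[0, 1, 0, 1, 0]]  # expert oscillates near the start
reward = reward_from_demos(correct_demos, incorrect_demos, n_states=5)
```

In this toy setting the states that only the incorrect demonstration lingers in receive negative reward, while the states on the path to the goal receive positive reward. An iterative method such as the one the abstract describes would refine the estimate by repeatedly comparing the agent's own behavior against both sets of demonstrations.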

Keywords

Computer science; Function (biology); Reinforcement learning; Artificial intelligence; Robot; Machine learning
