Merging Reinforcement Learning and Inverse Reinforcement Learning via Auxiliary Reward System
Wadhah Zeyad Tareq Tareq, Mehmet Fatih Amasyalı
- Year: 2022
- Citations: 6
Abstract
In recent years, learning from demonstration has become one of the most promising methods in robotics and interactive systems. In learning from demonstration, an agent learns by observing an expert, which can be a pre-trained agent or a human. The main problem with learning from demonstrations is the difference between the reward representation in the demonstrations and in the actual environment. While the demonstrations are being constructed, it is easy to add new rewards to enhance the agent’s performance; it is not easy to do so in an actual environment. This work builds on our previous work to solve this problem. In that work, the agent used Reinforcement Learning algorithms to learn how to play video games from demonstrations, and it was supplied with an external reward to compensate for missing rewards in hard-exploration environments. In this work, Inverse Reinforcement Learning is used to extract the external rewards from the demonstrations and make them available during the interaction period. The results show that inverse learning enables the agent to interact with the environment after pre-training. Furthermore, the agent’s performance becomes more stable.
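The idea of recovering a reward from demonstrations and reusing it during interaction can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the Inverse Reinforcement Learning algorithm, so the code below assumes a linear reward over hand-made features and a single feature-matching step, which is a crude simplification. The grid states, the `features` function, the mixing weight `beta`, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, int]      # hypothetical grid position (row, col)
Trajectory = List[State]


def features(state: State) -> List[float]:
    """Hypothetical features: row and column scaled to [0, 1] on a 5x5 grid."""
    row, col = state
    return [row / 4.0, col / 4.0]


def feature_expectation(trajs: Sequence[Trajectory], gamma: float = 0.9) -> List[float]:
    """Average discounted feature counts over a set of trajectories."""
    total = [0.0, 0.0]
    for traj in trajs:
        for t, state in enumerate(traj):
            for i, f in enumerate(features(state)):
                total[i] += (gamma ** t) * f
    return [x / len(trajs) for x in total]


def fit_reward_weights(expert: Sequence[Trajectory],
                       baseline: Sequence[Trajectory],
                       gamma: float = 0.9) -> List[float]:
    """One feature-matching step (an assumption, not the paper's method):
    point the weights from the baseline's feature expectation toward the
    expert's, then normalize to unit length."""
    mu_expert = feature_expectation(expert, gamma)
    mu_baseline = feature_expectation(baseline, gamma)
    diff = [e - b for e, b in zip(mu_expert, mu_baseline)]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff] if norm > 0 else diff


def make_auxiliary_reward(weights: List[float]) -> Callable[[State], float]:
    """Build a reward function r_aux(s) = w . phi(s) from the fitted weights."""
    def aux(state: State) -> float:
        return sum(w * f for w, f in zip(weights, features(state)))
    return aux


def shaped_reward(env_reward: float, state: State,
                  aux: Callable[[State], float], beta: float = 0.5) -> float:
    """Combine the environment reward with the learned auxiliary reward."""
    return env_reward + beta * aux(state)


# Toy data: the expert walks the diagonal, the baseline wanders near the start.
expert_demos = [[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]]
baseline_rollouts = [[(0, 0), (0, 0), (0, 1), (0, 0), (1, 0)]]

weights = fit_reward_weights(expert_demos, baseline_rollouts)
aux = make_auxiliary_reward(weights)
```

During interaction, an agent would call `shaped_reward(r, s, aux)` in place of the raw environment reward `r`, so the signal extracted from the demonstrations stays available even where the environment's own reward is sparse.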