Local-utopia policy selection for multi-objective reinforcement learning
Simone Parisi, Alexander Blank, Tobias Viernickel, Jan Peters
- Year
- 2016
- Citations
- 2
Abstract
Many real-world applications are characterized by multiple conflicting objectives. In such problems, optimality is replaced by Pareto optimality and the goal is to find the Pareto frontier, a set of solutions representing different compromises among the objectives. Despite recent advances in multi-objective optimization, the selection, given the Pareto frontier, of a Pareto-optimal policy is still an important problem, prominent in practical applications such as economics and robotics. In this paper, we present a versatile approach for selecting a policy from the Pareto frontier according to user-defined preferences. Exploiting a novel scalarization function and heuristics, our approach provides an easy-to-use and effective method for Pareto-optimal policy selection. Furthermore, the scalarization is applicable in multiple-policy learning strategies for approximating Pareto frontiers. To show the simplicity and effectiveness of our algorithm, we evaluate it on two problems and compare it to classical multi-objective reinforcement learning approaches.
Keywords
Related papers
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002