LEARNING

Maximum Entropy Reinforcement Learning with Evolution Strategies

Longxiang Shi, Shijian Li, Zheng Qian, Longbing Cao, Yang Long, Gang Pan

Publication year
2020
Citations
6

Abstract

Evolution strategies (ES) have recently attracted attention for solving challenging tasks with low computational cost and high scalability. However, it is well known that ES-based reinforcement learning (RL) methods suffer from low stability: without careful design, they are sensitive to local optima and unstable during learning. There is therefore a pressing need to improve the stability of ES methods on RL problems. In this paper, we propose a simple yet efficient ES method that stabilizes learning. Specifically, we propose a framework that incorporates maximum entropy reinforcement learning into evolution strategies, and we derive an efficient entropy calculation method for linear policies. Based on this framework, we further present a practical algorithm, maximum entropy evolution policy search, which is efficient and stable for policy search in continuous control. Our algorithm shows high stability across different random seeds and achieves performance comparable to some existing derivative-free RL methods on several well-known MuJoCo robotic control benchmark tasks.
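
To make the idea in the abstract concrete, the following is a minimal sketch of how an entropy bonus might be combined with an ES update for a linear policy. It is not the paper's algorithm: the abstract does not give the entropy derivation or the update rule, so everything here is an assumption made for illustration. In particular, the sketch assumes isotropic Gaussian noise on the weight matrix W, under which the action a = (W + sigma * E) s is Gaussian with covariance sigma^2 * ||s||^2 * I and its entropy has a closed form. The names action_entropy, max_ent_return, es_step, the step callback, and the coefficient alpha are all hypothetical.

```python
import numpy as np

def action_entropy(state, sigma, action_dim):
    """Entropy of a = (W + sigma * E) @ state with E_ij ~ N(0, 1).

    Assumption of this sketch: the noise term sigma * E @ state is Gaussian
    with covariance sigma^2 * ||state||^2 * I, so the entropy is closed-form.
    """
    var = sigma ** 2 * float(state @ state) + 1e-12  # guard against log(0)
    return 0.5 * action_dim * np.log(2.0 * np.pi * np.e * var)

def max_ent_return(W, step, s0, sigma, alpha, horizon=20):
    """Sum of reward + alpha * entropy along one rollout of the mean policy.

    `step(state, action) -> (next_state, reward)` is a hypothetical
    environment function supplied by the caller.
    """
    s, total = s0, 0.0
    for _ in range(horizon):
        a = W @ s  # linear policy, no bias term
        total += alpha * action_entropy(s, sigma, W.shape[0])
        s, r = step(s, a)
        total += r
    return total

def es_step(W, objective, sigma=0.1, lr=0.02, n_pairs=8, rng=None):
    """One ES update using antithetic (mirrored) Gaussian perturbations."""
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(W)
    for _ in range(n_pairs):
        eps = rng.standard_normal(W.shape)
        # Finite-difference estimate along the random direction eps.
        diff = objective(W + sigma * eps) - objective(W - sigma * eps)
        grad += diff * eps
    grad /= 2.0 * n_pairs * sigma
    return W + lr * grad  # gradient ascent on the objective
```

In use, one would pass something like `lambda W: max_ent_return(W, step, s0, sigma, alpha)` as the objective to es_step, with alpha trading off reward against entropy. Setting alpha to zero recovers a plain ES update on the return.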

Keywords

Reinforcement learning, Computer science, Scalability, Benchmark (surveying), Entropy (arrow of time), Computation, Mathematical optimization, Principle of maximum entropy, Stability (learning theory), Evolutionary computation
