Federated Reinforcement Learning for Controlling Multiple Rotary Inverted Pendulums in Edge Computing Environments

Hyun-Kyo Lim, Ju-Bong Kim, Chan-Myung Kim, Gyu-Young Hwang, Ho-Bin Choi, Youn‐Hee Han

Publication year: 2020
Citations: 22

Abstract

Reinforcement learning has recently been studied in various fields and has also been used to optimally control real devices (e.g., robotic arms). In this paper, we aim to let multiple reinforcement learning agents each learn an optimal control policy on their own devices, which are of the same type but have slightly different dynamics. For such devices, there is no guarantee that an agent that interacts with only one device and learns its optimal control policy will also control another device well. Therefore, we may need to apply independent reinforcement learning to each device individually, which is time-consuming. To solve this problem, we propose a new federated reinforcement learning architecture in which each agent, working on its own independent device, shares its learning experience with the other agents and transfers the parameters of a mature policy model to them. We incorporate the Actor-Critic PPO algorithm into each agent in the proposed collaborative architecture and propose an efficient procedure for gradient sharing and model transfer. We also use edge computing to solve the network problems that occur when training multiple real devices at the same time. Using multiple rotary inverted pendulum devices, we demonstrate that the proposed federated reinforcement learning scheme can effectively facilitate the learning process for multiple devices, and that learning can be faster when more agents are involved.
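The two mechanisms named in the abstract, gradient sharing and model transfer, can be illustrated with a minimal sketch. The abstract does not say how gradients are combined or how a "mature" policy is chosen, so everything here is an assumption made for illustration: the element-wise averaging rule, the plain gradient-descent step (standing in for the PPO update), the score-based selection, and all function names.

```python
from typing import List

Vector = List[float]


def average_gradients(gradients: List[Vector]) -> Vector:
    """Element-wise mean of the gradients reported by all agents (assumed rule)."""
    n = len(gradients)
    return [sum(column) / n for column in zip(*gradients)]


def apply_gradient(params: Vector, gradient: Vector, lr: float) -> Vector:
    """One plain gradient-descent step, a stand-in for the PPO optimizer step."""
    return [p - lr * g for p, g in zip(params, gradient)]


def transfer_model(agent_params: List[Vector], scores: List[float]) -> List[Vector]:
    """Copy the parameters of the best-scoring agent to every agent (assumed rule)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [list(agent_params[best]) for _ in agent_params]


# Three hypothetical agents, each holding a 2-parameter policy.
params = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
local_grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Gradient sharing: every agent applies the same averaged gradient.
shared = average_gradients(local_grads)
params = [apply_gradient(p, shared, lr=0.5) for p in params]

# Model transfer: the agent with the highest (hypothetical) score is copied.
transferred = transfer_model([[1.0], [2.0], [3.0]], scores=[10.0, 50.0, 30.0])
```

In this toy setup each agent only computes gradients locally, and an aggregation step combines them; where that aggregation runs (for example, on an edge node) is left open in the sketch.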

Keywords

Reinforcement learning, Computer science, Inverted pendulum, Enhanced Data Rates for GSM Evolution, Process (computing), Edge device, Scheme (mathematics), Control (management), Architecture, Distributed computing
