A Portable Accelerator of Proximal Policy Optimization for Robots

Weiyi Zhang, Yancao Jiang, Fasih Ud Din Farrukh, Chun Zhang, Xiang Xie

Year of publication: 2021
Citations: 11

Abstract

Reinforcement learning has great potential to solve robotic control tasks in different environments. Proximal policy optimization (PPO) is one of the most efficient reinforcement learning algorithms; it uses three neural networks during training and inference. However, practical applications of reinforcement learning algorithms in robots are limited by the computational complexity of the neural networks. This work proposes an accelerator for PPO based on high-level synthesis (HLS) that accelerates both training and inference. The accelerator is implemented on an Ultra96-v2 FPGA board running the PYNQ system and is suitable for different robotic applications. Implemented on the programmable logic (PL), it achieves a 16.8x improvement in training and a 12.4x improvement in inference compared with the processing system (PS). It also runs at a speed similar to that of a PC while reducing power consumption by 93.3%.
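For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective from the standard PPO-Clip formulation, which is the quantity a PPO training step maximizes. It is an illustration only, not the accelerator's HLS implementation: the function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for this example.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO-Clip (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the previous policy (hypothetical inputs for illustration).
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the previous policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage; the clipping only takes effect once the new policy moves away from the old one.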

Keywords

Reinforcement learning, Computer science, Field-programmable gate array, Inference, Robot, Artificial neural network, Process (computing), Artificial intelligence, Reduction (mathematics), Embedded system
