A Portable Accelerator of Proximal Policy Optimization for Robots
Weiyi Zhang, Yancao Jiang, Fasih Ud Din Farrukh, Chun Zhang, Xiang Xie
- Publication year: 2021
- Citations: 11
Abstract
Reinforcement learning has great potential for solving robotic control tasks in different environments. Proximal policy optimization (PPO) is one of the most efficient reinforcement learning algorithms, and it uses three neural networks during training and inference. However, practical applications of reinforcement learning algorithms in robots are limited by the computational complexity of these neural networks. This work proposes an accelerator for PPO, based on high-level synthesis (HLS), that accelerates both training and inference. The proposed design is implemented on an Ultra96-v2 FPGA board running the Pynq system and is suitable for different robotic applications. Running on the programmable logic (PL), the accelerator achieves a 16.8x speedup in training and a 12.4x speedup in inference compared with the processing system (PS). Meanwhile, it runs at a speed similar to that of a PC while achieving a 93.3% power reduction.
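For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective from the standard PPO formulation. It is not the accelerator's HLS implementation, which the abstract does not describe. The function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range; 0.2 is an assumed, commonly used value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

With a positive advantage, the clipping caps the benefit of moving the policy far from the old one; with a negative advantage, the unclipped term is kept whenever it is the more pessimistic one.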