An Energy-Efficient Deep Reinforcement Learning FPGA Accelerator for Online Fast Adaptation with Selective Mixed-precision Re-training
Wooyoung Jo, Juhyoung Lee, Seunghyun Park, Hoi‐Jun Yoo
- Year: 2021
- Citations: 3
Abstract
Recently, deep reinforcement learning (DRL) has shown human-level performance in sequential decision-making problems such as game playing and robot control [1]. In particular, DRL lets edge devices adapt autonomously to unknown environments, because it learns from its own interaction with the environment rather than from a pre-collected dataset. Fig. 1 shows the basic components of a DRL system: a DRL agent, a replay buffer, and the environment. Unlike traditional deep learning, which requires labeled data, DRL training uses experiences stored in the replay buffer. These experiences are generated by repeated interaction between the DRL agent and the environment. This trial-and-error training enables the agent to adapt to sudden environmental changes.
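The interaction between the three components named in the abstract (agent, replay buffer, environment) can be illustrated with a minimal Python sketch. It is not the authors' accelerator or training algorithm: `ToyEnvironment`, `ReplayBuffer`, `collect_experiences`, the corridor task, and the random placeholder policy are all hypothetical names and choices made for illustration only.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest experience once capacity is reached.
        self.storage = deque(maxlen=capacity)

    def add(self, experience):
        self.storage.append(experience)

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement from the stored experiences.
        return rng.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


class ToyEnvironment:
    """Hypothetical 1-D corridor: start at 0, reward 1.0 on reaching `goal`."""

    def __init__(self, goal=3):
        self.goal = goal
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped at 0.
        self.position = max(0, self.position + action)
        done = self.position == self.goal
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def collect_experiences(env, buffer, num_steps, rng):
    """Fill the buffer by trial and error using a random placeholder policy."""
    state = env.reset()
    for _ in range(num_steps):
        action = rng.choice([-1, 1])
        next_state, reward, done = env.step(action)
        buffer.add((state, action, reward, next_state, done))
        # Start a new episode when the goal is reached.
        state = env.reset() if done else next_state


rng = random.Random(0)
buffer = ReplayBuffer(capacity=100)
collect_experiences(ToyEnvironment(), buffer, num_steps=250, rng=rng)

# A training step would draw a minibatch like this instead of labeled data.
batch = buffer.sample(batch_size=8, rng=rng)
```

In a real DRL system the random policy would be replaced by the agent's network, and each sampled minibatch would drive a gradient update. Because new experiences keep displacing old ones, the buffer contents follow changes in the environment.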