Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing

Maciej Wielgosz, Michał Karwatowski

Year of publication: 2019
Citations: 28
Access: Open access

Abstract

With Internet of things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system matters more for Quality of Service than the quality of the answer. In this paper, we propose a methodology, that is, a set of predefined steps for mapping neural models to hardware, especially field-programmable gate arrays (FPGAs), with the main focus on latency reduction. A multi-objective covariance matrix adaptation evolution strategy (MO-CMA-ES) was employed along with custom scores for sparsity, bit-width of the representation, and quality of the model. Furthermore, we created a framework that enables mapping of neural models to FPGAs. The proposed solution is validated using three case studies, with a Xilinx Zynq UltraScale+ MPSoC XCZU15EG as the platform. The results report the compression ratios achieved by quantization and pruning in different scenarios, with and without retraining. Using our publicly available framework, we achieved a latency of 210 ns for a single processing step of a model composed of two long short-term memory (LSTM) layers and a single dense layer.
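The abstract names sparsity and bit-width as quantities that are scored during compression. As a rough illustration only, the sketch below shows one way such quantities could be computed for a flat list of weights. It is not the paper's framework and does not implement MO-CMA-ES. The function names, the magnitude-threshold pruning rule, the signed fixed-point format, and the 32-bit baseline are all assumptions made for this example.

```python
from typing import List


def prune_by_magnitude(weights: List[float], threshold: float) -> List[float]:
    """Zero every weight whose absolute value is below the threshold (assumed pruning rule)."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def quantize_fixed_point(weights: List[float], total_bits: int, frac_bits: int) -> List[float]:
    """Round each weight to a signed fixed-point grid and saturate on overflow.

    Uses Python's built-in round(), which rounds exact halves to the nearest even integer.
    """
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    quantized = []
    for w in weights:
        q = round(w * scale)
        q = max(q_min, min(q_max, q))  # saturate to the representable range
        quantized.append(q / scale)
    return quantized


def sparsity(weights: List[float]) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


def compression_ratio(weights: List[float], total_bits: int, baseline_bits: int = 32) -> float:
    """Baseline storage divided by storage of the nonzero quantized weights.

    Simplifying assumption: the cost of storing the positions of nonzero weights is ignored.
    """
    nonzero = sum(1 for w in weights if w != 0.0)
    if nonzero == 0:
        return float("inf")
    return (len(weights) * baseline_bits) / (nonzero * total_bits)


# Hypothetical weights, chosen only to exercise the functions.
example_weights = [0.75, -0.02, 0.3, 0.01, -1.5, 0.0, 0.126, -0.26]
pruned = prune_by_magnitude(example_weights, threshold=0.05)
compressed = quantize_fixed_point(pruned, total_bits=8, frac_bits=4)
print(sparsity(compressed), compression_ratio(compressed, total_bits=8))
```

In a search such as the one the abstract describes, scores of this kind would be traded off against model quality; how the paper defines and combines its scores is not stated in the abstract.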

Keywords

Computer science, Field-programmable gate array, Latency (audio), Virtex, Artificial intelligence, Embedded system, Computer engineering, Computer architecture, Real-time computing
