
Memory-Computing Decoupling: A DNN Multitasking Accelerator With Adaptive Data Arrangement

Chuxi Li, Xiaoya Fan, Xiaoti Wu, Yang Zhao, Miao Wang, Meng Zhang, Shengbing Zhang

Publication year
2022
Citations
3

Abstract

Multiple deep neural networks (DNNs) are increasingly used in real-world intelligent applications, such as intelligent robotics and autonomous vehicles, to collectively complete complicated tasks on edge devices. Because each layer of the subtasks prefers a distinct dataflow due to the heterogeneity in shape and scale of the network layers, a variable dataflow approach on DNN accelerators is urgently required. On DNN accelerators that enable multiple dataflows, however, we identify a dimension mismatch between parallel processing under the dataflow approach and the linear arrangement of data in memory. When multiple DNN tasks share partial features or weights, the issue is further exacerbated. During processing, this mismatch causes a sluggish data supply from both off-chip and on-chip memory. Consequently, the overall throughput, performance, and energy efficiency suffer, since DNN models are sensitive to data density. In this work, we reveal the mechanism behind this data dimension mismatch and present a series of metrics that quantify its influence on system performance. On this foundation, we offer a framework that tracks the data tensor dimension conversion and employs a flexible data arrangement over multi-DNN computation to adapt to dataflow variability. An accelerator architecture named data arrangement multi-DNN accelerator (DARMA), which features a data arrangement and distribution circuit and hierarchical memory for data dimension conversion, is also presented. Since the mismatch is mitigated, the suggested accelerator outperforms current accelerators in terms of bandwidth and processing unit utilization. Through tests on VR/AR, MLPerf, and other multitask applications, the evaluation results show that the proposed architecture provides both energy efficiency and throughput improvements.
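The dimension mismatch described in the abstract can be illustrated with a toy model. The sketch below is not the DARMA mechanism or any of the paper's metrics; it is a minimal illustration, assuming a row-major linear memory, a hypothetical burst size, and a dataflow that reads one element per lane along a single tensor dimension. The names `strides` and `bursts_touched` and all parameter values are invented for this example.

```python
def strides(shape, order):
    """Element strides for a linear layout; `order` lists dims outermost-first."""
    result = {}
    step = 1
    for dim in reversed(order):  # innermost dimension has stride 1
        result[dim] = step
        step *= shape[dim]
    return result


def bursts_touched(shape, order, parallel_dim, lanes, burst_words):
    """Count distinct memory bursts needed to feed `lanes` parallel units.

    Assumption: each lane reads one element, and consecutive lanes step
    along `parallel_dim` starting from address 0.
    """
    stride = strides(shape, order)[parallel_dim]
    addresses = [lane * stride for lane in range(lanes)]
    return len({addr // burst_words for addr in addresses})


# Hypothetical feature map: 64 channels, 32 x 32 spatial size.
shape = {"C": 64, "H": 32, "W": 32}

# Dataflow parallelizes over channels (C) with 8 lanes; bursts hold 8 words.
chw = bursts_touched(shape, ("C", "H", "W"), "C", lanes=8, burst_words=8)
hwc = bursts_touched(shape, ("H", "W", "C"), "C", lanes=8, burst_words=8)

print(chw, hwc)
```

Under these assumptions, the channel-outermost layout scatters the eight operands across eight separate bursts, while the channel-innermost layout serves them from a single burst. A dataflow that parallelizes over a different dimension would favor a different arrangement, which is the kind of variability an adaptive data arrangement aims to absorb.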

Keywords

Dataflow; Computer science; Artificial neural network; Dataflow architecture; Computation; Computer architecture; Parallel computing; Artificial intelligence; Computer engineering
