CNN inference: VLSI architecture for convolution layer for 1.2 TOPS

Mihir Mody, Manu Mathew, Shyam Jagannathan, A.J. Redfern, Jason Jones, Thorsten Lorenzen

Year
2017
Citations
6

Abstract

Deep learning techniques such as convolutional neural networks (CNNs) are increasingly popular for image classification, with usage spanning automotive, industrial, medical, and robotics applications. A typical CNN consists of multiple layers of convolution, non-linearity, and spatial pooling, plus a fully connected layer, with 2D convolutions constituting more than 95% of the overall computation. In this paper, we propose a novel systolic, fully pipelined architecture for the convolution layer that scales to high performance at very low area. The architecture is based on two techniques, a vector outer product and an intelligent data feeder, which enable three levels of parallelism (across data values, outputs, and inputs) along with pipelining of the compute elements with data movement. The proposed architecture is scalable to a processing throughput of 64, 256, 512, or 1024 multiply-accumulate (MAC) operations per cycle. It can run at clock rates of up to 600 MHz in a low-power 28 nm CMOS process node, enabling a performance of 1.2 tera-operations per second (TOPS).
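To make the abstract's two technical claims concrete, the sketch below illustrates (a) how a convolution can be accumulated as a sequence of vector outer products and (b) the peak-throughput arithmetic. It is an illustration only, not the paper's datapath: the function names, the data layout, and the convention that one MAC counts as two operations are assumptions introduced here. Under that convention, 1024 MACs per cycle at 600 MHz comes to about 1.23 TOPS, which is consistent with the 1.2 TOPS figure quoted above.

```python
# Illustrative sketch only; names and data layout are hypothetical, not from the paper.

def peak_tops(macs_per_cycle, clock_hz, ops_per_mac=2):
    """Peak tera-operations per second.

    Assumption: one MAC counts as two operations (one multiply, one add).
    """
    return macs_per_cycle * clock_hz * ops_per_mac / 1e12


def conv2d_outer_product(inputs, weights):
    """Stride-1 'valid' 2D convolution (cross-correlation form, as in CNNs).

    inputs[c][y][x]       : input feature maps
    weights[o][c][ky][kx] : one kernel per (output, input) channel pair

    For each (input channel, kernel tap) the update is an outer product of
      - a weight vector across output channels, and
      - a data vector across output pixel positions,
    accumulated into an (outputs x pixels) array.
    """
    n_in, h, w = len(inputs), len(inputs[0]), len(inputs[0][0])
    n_out = len(weights)
    kh, kw = len(weights[0][0]), len(weights[0][0][0])
    oh, ow = h - kh + 1, w - kw + 1

    acc = [[0] * (oh * ow) for _ in range(n_out)]
    for c in range(n_in):
        for ky in range(kh):
            for kx in range(kw):
                w_vec = [weights[o][c][ky][kx] for o in range(n_out)]
                x_vec = [inputs[c][y + ky][x + kx]
                         for y in range(oh) for x in range(ow)]
                # Rank-1 (outer product) update of the accumulator array.
                for o, wv in enumerate(w_vec):
                    row = acc[o]
                    for p, xv in enumerate(x_vec):
                        row[p] += wv * xv

    # Reshape each flat accumulator row back into an (oh x ow) output map.
    return [[acc[o][y * ow:(y + 1) * ow] for y in range(oh)]
            for o in range(n_out)]


if __name__ == "__main__":
    for macs in (64, 256, 512, 1024):
        print(macs, "MAC/cycle ->", peak_tops(macs, 600e6), "TOPS")

    image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # 1 input channel, 3x3
    kernels = [[[[1, 0], [0, 1]]],                # output channel 0
               [[[1, 1], [1, 1]]]]                # output channel 1
    print(conv2d_outer_product(image, kernels))
```

In this formulation every element of the weight vector is multiplied with every element of the data vector, so each fetched value is reused across a whole row or column of multipliers. That reuse is the general appeal of outer-product datapaths, although the abstract does not give the details of the proposed design.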

Keywords

Computer science, Convolution (computer science), Convolutional neural network, Parallel computing, Scalability, Very-large-scale integration, Throughput, Artificial intelligence, Node (physics), Deep learning
