PERCEPTION

A hardware accelerator to support deep learning processor units in real-time image processing

Edoardo Cittadini, Mauro Marinoni, Giorgio Buttazzo

Year
2025
Citations
9

Abstract

Deep neural networks are becoming crucial in many cyber–physical systems involving complex perceptual tasks. For embedded systems that require real-time interaction with dynamic environments, such as autonomous robots and drones, it is of paramount importance that such algorithms are executed efficiently onboard, on properly designed hardware accelerators, to meet the required performance specifications. In particular, some neural network architectures for object detection and tracking, such as You Only Look Once (YOLO), include computationally heavy stages that must be executed before and after model inference. Such stages are typically not incorporated in traditional accelerators and are instead executed on general-purpose processors, which introduces a bottleneck in the overall processing pipeline. To overcome this problem, this paper presents a general-purpose accelerator on a field-programmable gate array (FPGA) that runs the pre-processing and post-processing operations typically required by vision tasks. The proposed solution has been tested in combination with a YOLO object detector accelerated on an Advanced Micro Devices (AMD) Xilinx Kria KR260 board mounting an UltraScale+ multiprocessor system-on-chip, achieving a significant improvement in both timing performance and power consumption and enabling onboard visual processing on drones. The proposed solution speeds up the traditional object detection process by a factor of 4.4, allowing the full processing pipeline to run at 60 frames per second (fps), versus the 13.6 fps reachable without the accelerator. As a result, this work enables the use of high-speed cameras for developing more reactive systems that can respond to incoming events with lower latency.
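The reported speedup can be checked directly from the two frame rates quoted in the abstract. The short sketch below is illustrative only: the helper name `frame_period_ms` is hypothetical, and the per-frame times assume that frames are processed one after another with no overlap between pipeline stages, which the abstract does not state.

```python
def frame_period_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Frame rates taken from the abstract.
baseline_fps = 13.6     # full pipeline without the proposed accelerator
accelerated_fps = 60.0  # full pipeline with the proposed accelerator

# Throughput ratio between the two configurations.
speedup = accelerated_fps / baseline_fps

# Per-frame time, assuming strictly sequential frame processing (an assumption).
baseline_ms = frame_period_ms(baseline_fps)
accelerated_ms = frame_period_ms(accelerated_fps)

print(f"speedup: {speedup:.1f}x")
print(f"per-frame time: {baseline_ms:.1f} ms -> {accelerated_ms:.1f} ms")
```

Under that assumption, the ratio of the two frame rates is about 4.4, matching the stated factor, and the time per frame drops from roughly 73.5 ms to roughly 16.7 ms.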

Keywords

Computer science, Hardware acceleration, Image processing, Computer hardware, Deep learning, Artificial intelligence, Embedded system, Computer architecture, Image (mathematics), Field-programmable gate array
