Speeding up Deep Neural Networks on the Jetson TX1

Markus Eisenbach, Ronny Stricker, Daniel Seichter, Alexander Vorndran, Tim Wengefeld, Horst-Michael Groß

Year: 2018
Citations: 3
Access: Open access

Abstract

In recent years, Deep Learning (DL) has achieved new top performance in almost all computer vision tasks that are important for automotive and robotic applications. In these applications, both space and power are limited resources. Therefore, there is a need to run DL approaches on a small and power-efficient device, such as the NVIDIA Jetson TX1 with a powerful onboard GPU. In this paper, we analyze the Jetson's suitability by benchmarking the run-time of DL operations in comparison to a high-performance GPU. As an example, we port a top-performing DL-based person detector to this platform. We explain the steps necessary to significantly speed up this approach on the device.

Keywords

Benchmarking, Computer science, Automotive industry, Artificial neural network, Port (circuit theory), Power (physics), Deep learning, Detector, Artificial intelligence, Embedded system
