Model Pruning Techniques for Boosting the Inference Efficiency on Embedded Systems

Zhongpeng Zhang

Year: 2021
Citations: 7

Abstract

Deep learning is a class of machine learning methods that uses multi-layer neural networks to extract features from input data in order to make accurate predictions on new data. As deep neural networks have grown rapidly, models have become increasingly accurate, but their parameter size and computational intensity have also become a heavy burden for inference, especially on embedded devices such as cellphones, robots, and autonomous vehicles. To resolve this tension, researchers have proposed a variety of pruning techniques that shrink the model and reduce its storage requirements. This paper summarizes these techniques and compares them along several dimensions. We first describe the differences among state-of-the-art pruning schemes, and then compare them directly in terms of accuracy and parameter size. Finally, we discuss the reasons for the differences in performance, in order to help software and hardware designers choose the best option for their specific use cases.

Keywords

Computer science, Pruning, Artificial intelligence, Boosting (machine learning), Inference, Machine learning, Computation, Deep learning, Artificial neural network, Deep neural networks
