Model Pruning Techniques for Boosting the Inference Efficiency on Embedded Systems
Zhongpeng Zhang
- Year: 2021
- Citations: 7
Abstract
Deep learning is a class of machine learning methods that uses multi-layer neural networks to extract features from input data in order to make accurate predictions on new data. As deep neural networks have grown rapidly, models have become more and more accurate, but their parameter size and computational intensity have also become a heavy burden for inference, especially on embedded devices such as cellphones, robots, and autonomous vehicles. To resolve this tension, researchers have proposed a variety of pruning techniques that shrink the model and reduce its storage cost. In this paper we summarize these techniques and compare them along several dimensions. We first describe the differences among state-of-the-art pruning schemes, and then compare them directly in terms of accuracy and parameter size. Finally, we discuss the reasons for the differing performance results, in order to suggest to software and hardware designers the best option for their specific use cases.