Model Pruning Techniques for Boosting the Inference Efficiency on Embedded Systems

Zhongpeng Zhang

Year of publication: 2021
Citations: 7

Abstract

Deep learning is a family of machine learning methods that uses multi-layer neural networks to extract features from input data and make accurate predictions on new data. As deep neural networks have grown rapidly, models have become more and more accurate, but their parameter size and computational intensity have also become a heavy burden for inference, especially on embedded devices such as cellphones, robots, and autonomous vehicles. To resolve this tension, researchers have proposed a variety of pruning techniques that shrink the model and ease the storage burden. This paper summarizes these techniques and compares them along several dimensions. We first describe the differences among state-of-the-art pruning schemes, and then compare them directly in terms of accuracy and parameter size. Finally, we discuss the reasons for the differences in performance, to help software and hardware designers choose the best option for their specific use cases.

Keywords

Computer science, Pruning, Artificial intelligence, Boosting (machine learning), Inference, Machine learning, Computation, Deep learning, Artificial neural network, Deep neural networks
