Pyramid Transformer: A Multi-size Object Detection Model with Limited Device Requirements for the Nursing Robot
Jiazheng Li, Jiexin Xie, Jiaxin Wang, Yujian Wen, Shijie Guo
- Publication year
- 2022
- Citations
- 2
Abstract
Multi-size object detection is a technical difficulty that impedes the development of intelligent nursing robots. To address this problem, this paper proposes a Pyramid Transformer model to detect objects of different sizes in nursing scenarios. Pyramid Transformer consists of three parts: a Transformer Module, a Pyramid Structure, and a Convolution Module. The Transformer Module improves large-object detection through its Multi-head Attention mechanism, and the Pyramid Structure lets the model make predictions from feature maps of different sizes, which benefits the detection of small objects. The Convolution Module is employed to reduce hardware requirements, allowing Pyramid Transformer to be implemented and run on a single graphics card. Experiments show that the mean average precision reaches 72.7%, an improvement over other models. This shows that the proposed Pyramid Transformer model is practical and effective for object detection on the nursing robot. The dataset is available at https://github.com/NotFar1997/NSI-dataset.
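The abstract does not say how the Convolution Module lowers hardware requirements. One plausible reading, offered here only as an assumption, is that convolutional downsampling shrinks the feature map before attention is applied, so the self-attention score matrix (which grows with the square of the token count) stays small. The sketch below illustrates that arithmetic for a feature pyramid. The input size, the strides, the number of pyramid levels, and the helper names `pyramid_shapes`, `attention_cost`, and `summarize` are all hypothetical and are not taken from the paper.

```python
# Illustrative arithmetic only. The input resolution, strides, and number of
# pyramid levels are assumptions; the abstract does not specify any of them.


def pyramid_shapes(height, width, strides=(8, 16, 32)):
    """Return the (h, w) feature-map size at each assumed pyramid level."""
    return [(height // s, width // s) for s in strides]


def attention_cost(h, w):
    """Entries in one self-attention score matrix over h * w tokens."""
    tokens = h * w
    return tokens * tokens


def summarize(height=512, width=512, strides=(8, 16, 32)):
    """Collect per-level map size, token count, and attention matrix size."""
    rows = []
    for stride, (h, w) in zip(strides, pyramid_shapes(height, width, strides)):
        rows.append(
            {
                "stride": stride,
                "map": (h, w),
                "tokens": h * w,
                "attn_entries": attention_cost(h, w),
            }
        )
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

Under these assumed numbers, each doubling of the stride cuts the token count by a factor of 4 and the attention matrix by a factor of 16, which is the kind of saving that could let a Transformer-based detector fit on a single graphics card.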