Transformer Network for Multi-Person Tracking and Re-Identification in Unconstrained Environment
Hamza Mukhtar, Muhammad Usman Ghani Khan
- Year
- 2023
- Access
- Open access
Abstract
Multi-object tracking (MOT) has profound applications in a variety of fields, including surveillance, sports analytics, self-driving, and cooperative robotics. Despite considerable advancements, existing MOT methodologies tend to falter when faced with non-uniform movements, occlusions, and appearance-reappearance scenarios of the objects. Recognizing this inadequacy, we put forward an integrated MOT method that not only marries object detection and identity linkage within a singular, end-to-end trainable framework but also equips the model with the ability to maintain object identity links over long periods of time. Our proposed model, named STMMOT, is built around four key modules: 1) candidate proposal generation, which generates object proposals via a vision-transformer encoder-decoder architecture that detects the object from each frame in the video; 2) scale variant pyramid, a progressive pyramid structure to learn the self-scale and cross-scale similarities in multi-scale feature maps; 3) spatio-temporal memory encoder, extracting the essential information from the memory associated with each object under tracking; and 4) spatio-temporal memory decoder, simultaneously resolving the tasks of object detection and identity association for MOT. Our system leverages a robust spatio-temporal memory module that retains extensive historical observations and effectively encodes them using an attention-based aggregator. The uniqueness of STMMOT lies in representing objects as dynamic query embeddings that are updated continuously, which enables the prediction of object states with attention mechanisms and eradicates the need for post-processing.
Keywords
Related papers
How to Relieve Distribution Shifts in Semantic Segmentation for Off-Road Environments
Ji-Hoon Hwang, Daeyoung Kim, Hyung-Suk Yoon +2 more
2026
Uncertainty-guided evolvable recognition framework for industrial robots via prototype-based fuzzy inference and evidence fusion
Yanrun Zhou, Zihao Lei, Guangrui Wen +4 more
Robotics and Computer-Integrated Manufacturing · 2026
Point cloud registration for non-destructive, high-resolution coating thickness measurement from 3D scans
Simon Duenser, Ivo Aschwanden, Raamadaas Krishnadas +2 more
Robotics and Computer-Integrated Manufacturing · 2026
Toward the intelligent robotics era: Multimodal flexible haptic sensors for advanced perception systems
Sili Ding, Feng Xu, Jie Chen +3 more
Progress in Materials Science · 2026