
PERCEPTION

Temporal Pyramid Alignment and Adaptive Fusion of Event Stream and Image Frame for Keypoint Detection and Tracking in Autonomous Driving

P. Shi, Chee‐Onn Chow, Wei Ru Wong

Year: 2025
Citations: 2

Abstract

This paper addresses the alignment and fusion challenges that arise when combining event cameras with RGB cameras. For alignment, we adopt a Temporal Pyramid Alignment mechanism that synchronizes event streams and RGB frames at multiple temporal scales. For fusion, we design an adaptive fusion module that dynamically adjusts the contribution of each modality according to scene complexity and feature quality; a gating network computes the fusion weights from both the relative importance of each modality and its noise characteristics. A Cross-Modal Feature Compensation module is integrated into the framework to enhance information utilization. The framework also incorporates a Dynamic Inference Path Selection mechanism, guided by input complexity, to optimize computational resource allocation, and a dynamic noise suppression mechanism to improve the robustness of feature extraction. On the DSEC dataset, the proposed method achieves 36.9% mAP and a 40.1% tracking success rate (SR), surpassing existing approaches by 1.8% mAP and 1.6% SR while maintaining real-time efficiency at 13.1 FPS; it is particularly effective in extreme lighting and fast-motion scenarios. This work provides a solution for applications in autonomous driving, robotics, and augmented reality, where robust multimodal perception under dynamic conditions is critical.
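The abstract does not specify the gating network's architecture, so the following is only a minimal sketch of the general idea of gated adaptive fusion, not the authors' implementation. It assumes a single linear gate followed by a softmax over the two modalities, and it uses per-sample feature variance as a stand-in for "noise characteristics". The names `gated_fusion`, `w_gate`, and `b_gate` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def gated_fusion(event_feat, frame_feat, w_gate, b_gate):
    """Fuse two (N, C) feature arrays with per-sample modality weights.

    Assumption: the gate sees both feature vectors plus a crude noise
    proxy (per-sample feature variance) for each modality.
    """
    # (N, 2): one noise proxy per modality
    noise = np.stack([event_feat.var(axis=1), frame_feat.var(axis=1)], axis=1)
    # (N, 2C + 2): gate input
    gate_in = np.concatenate([event_feat, frame_feat, noise], axis=1)
    # (N, 2): one logit per modality, normalized so the weights sum to 1
    weights = softmax(gate_in @ w_gate + b_gate, axis=1)
    # Weighted sum; the (N, 1) weight columns broadcast across channels
    fused = weights[:, :1] * event_feat + weights[:, 1:] * frame_feat
    return fused, weights


# Usage with random placeholder features (not real DSEC data)
rng = np.random.default_rng(0)
n, c = 4, 8
event_feat = rng.normal(size=(n, c))
frame_feat = rng.normal(size=(n, c))
w_gate = rng.normal(scale=0.1, size=(2 * c + 2, 2))
b_gate = np.zeros(2)
fused, weights = gated_fusion(event_feat, frame_feat, w_gate, b_gate)
```

Because the weights are a softmax output, each sample's fused feature is a convex combination of the two modalities, so a degraded modality can be down-weighted without being discarded outright.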

Keywords

Computer vision, Pyramid (geometry), Frame (networking), Artificial intelligence, Computer science, Tracking (education), Event (particle physics), Fusion, Image (mathematics), Mathematics

