EventTransAct: A video transformer-based framework for Event-camera based action recognition
Tristan de Blegiers, Ishan Rajendrakumar Dave, Adeel Yousaf, Mubarak Shah
- Year
- 2023
- Access
- Open access
Abstract
Recognizing and comprehending human actions and gestures is a crucial perception requirement for robots to interact with humans and carry out tasks in diverse domains, including service robotics, healthcare, and manufacturing. Event cameras, with their ability to capture fast-moving objects at a high temporal resolution, offer new opportunities compared to standard action recognition in RGB videos. However, previous research on event camera action recognition has primarily focused on sensor-specific network architectures and image encoding, which may not be suitable for new sensors and limit the use of recent advancements in transformer-based architectures. In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame and then utilizes a temporal self-attention mechanism. In order to better adopt the VTN for the sparse and fine-grained nature of event data, we design Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations. Proposed $\mathcal{L}_{EC}$ promotes learning fine-grained spatial cues in the spatial backbone of VTN by contrasting temporally misaligned frames. We evaluate our method on real-world action recognition of N-EPIC Kitchens dataset, and achieve state-of-the-art results on both protocols - testing in seen kitchen (\textbf{74.9\%} accuracy) and testing in unseen kitchens (\textbf{42.43\% and 46.66\% Accuracy}). Our approach also takes less computation time compared to competitive prior approaches, which demonstrates the potential of our framework \textit{EventTransAct} for real-world applications of event-camera based action recognition. Project Page: \url{https://tristandb8.github.io/EventTransAct_webpage/}
Keywords
Related papers
How to Relieve Distribution Shifts in Semantic Segmentation for Off-Road Environments
Ji-Hoon Hwang, Daeyoung Kim, Hyung-Suk Yoon +2 more
2026
Uncertainty-guided evolvable recognition framework for industrial robots via prototype-based fuzzy inference and evidence fusion
Yanrun Zhou, Zihao Lei, Guangrui Wen +4 more
Robotics and Computer-Integrated Manufacturing · 2026
Point cloud registration for non-destructive, high-resolution coating thickness measurement from 3D scans
Simon Duenser, Ivo Aschwanden, Raamadaas Krishnadas +2 more
Robotics and Computer-Integrated Manufacturing · 2026
Toward the intelligent robotics era: Multimodal flexible haptic sensors for advanced perception systems
Sili Ding, Feng Xu, Jie Chen +3 more
Progress in Materials Science · 2026