Order Matters: On Parameter-Efficient Image-to-Video Probing for Recognizing Nearly Symmetric Actions
Thinesh Thiyakesan Ponbagavathi, Alina Roitberg
- 发表年份
- 2025
- 访问权限
- 开放获取
摘要
Fine-grained understanding of human actions is essential for safe and intuitive human--robot interaction. We study the challenge of recognizing nearly symmetric actions, such as picking up vs. placing down a tool or opening vs. closing a drawer. These actions are common in close human-robot collaboration, yet they are rare and largely overlooked in mainstream vision frameworks. Pretrained vision foundation models (VFMs) are often adapted using probing, valued in robotics for its efficiency and low data needs, or parameter-efficient fine-tuning (PEFT), which adds temporal modeling through adapters or prompts. However, our analysis shows that probing is permutation-invariant and blind to frame order, while PEFT is prone to overfitting on smaller HRI datasets, and less practical in real-world robotics due to compute constraints. To address this, we introduce STEP (Self-attentive Temporal Embedding Probing), a lightweight extension to probing that models temporal order via frame-wise positional encodings, a global CLS token, and a simplified attention block. Compared to conventional probing, STEP improves accuracy by 4--10% on nearly symmetric actions and 6--15% overall across action recognition benchmarks in human-robot-interaction, industrial assembly, and driver assistance. Beyond probing, STEP surpasses heavier PEFT methods and even outperforms fully fine-tuned models on all three benchmarks, establishing a new state-of-the-art. Code and models will be made publicly available: https://github.com/th-nesh/STEP.
关键词
相关论文
工业5.0中人机协作的多模态感知、互认知与具身执行综述与展望
Kai Ding, Qingyuan Mao, Yaqian Zhang 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
代理式人机协作:通过记忆实现上下文对齐
Jiahui Si, Wenchao Li, Xi Chen 等 7 位作者
Robotics and Computer-Integrated Manufacturing · 2026
迈向以人为中心的制造:人机协作装配中不确定性下的任务规划
Yingchao You, Ze Ji, Changyun Wei
Robotics and Computer-Integrated Manufacturing · 2026
自适应物理信息Transformer结合高斯过程残差补偿用于人机协作中的逆动力学建模
Rui Qian, Xi Zhang, Dongpeng Li 等 5 位作者
Robotics and Computer-Integrated Manufacturing · 2026