Enhancing robotic skill acquisition with multimodal sensory data: A novel dataset for kitchen tasks

Ruochen Ren, Zhipeng Wang, Chaoyun Yang, Jiahang Liu, Rong Jiang, Yanmin Zhou, Shuo Jiang, Bin He

Year: 2025
Citations: 5
Access: Open access

Abstract

The advent of large language models has transformed human-robot interaction by enabling robots to execute tasks from natural language commands. However, these models depend primarily on unimodal data, which limits their ability to integrate diverse and essential environmental, physiological, and physical data. To address the limitations of current unimodal datasets, this paper investigates novel and comprehensive multimodal data collection methodologies that capture the complexity of human interaction in real-world kitchen environments. Data were collected from 20 adults using 17 different kitchen tools in dynamic scenarios, including human tactile information, EMG signals, audio data, whole-body movement, and eye-tracking data. The dataset comprises 680 segments (~11 hours) spanning seven modalities and includes 56,000 detailed annotations. This paper bridges the gap between real-world multimodal data and embodied AI, paving the way for a new benchmark in utility and repeatability for robotic skill learning.

Keywords

Sensory system; Computer science; Dreyfus model of skill acquisition; Human–computer interaction; Artificial intelligence; Cognitive psychology; Psychology
