Rethinking Cooking State Recognition with Vision Transformers
Akib Mohammed Khan, Alif Ashrafee, Reeshoon Sayera, Shahriar Ivan, Sabbir Ahmed
- 发表年份
- 2022
- 访问权限
- 开放获取
摘要
To ensure proper knowledge representation of the kitchen environment, it is vital for kitchen robots to recognize the states of the food items that are being cooked. Although the domain of object detection and recognition has been extensively studied, the task of object state classification has remained relatively unexplored. The high intra-class similarity of ingredients during different states of cooking makes the task even more challenging. Researchers have proposed adopting Deep Learning based strategies in recent times, however, they are yet to achieve high performance. In this study, we utilized the self-attention mechanism of the Vision Transformer (ViT) architecture for the Cooking State Recognition task. The proposed approach encapsulates the globally salient features from images, while also exploiting the weights learned from a larger dataset. This global attention allows the model to withstand the similarities between samples of different cooking objects, while the employment of transfer learning helps to overcome the lack of inductive bias by utilizing pretrained weights. To improve recognition accuracy, several augmentation techniques have been employed as well. Evaluation of our proposed framework on the `Cooking State Recognition Challenge Dataset' has achieved an accuracy of 94.3%, which significantly outperforms the state-of-the-art.
关键词
相关论文
如何缓解越野环境中语义分割的分布偏移
Ji-Hoon Hwang, Daeyoung Kim, Hyung-Suk Yoon 等 5 位作者
2026
基于原型模糊推理与证据融合的不确定性引导工业机器人可进化识别框架
Yanrun Zhou, Zihao Lei, Guangrui Wen 等 7 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于点云配准的非破坏性高分辨率涂层厚度三维扫描测量
Simon Duenser, Ivo Aschwanden, Raamadaas Krishnadas 等 5 位作者
Robotics and Computer-Integrated Manufacturing · 2026
迈向智能机器人时代:用于高级感知系统的多模态柔性触觉传感器
Sili Ding, Feng Xu, Jie Chen 等 6 位作者
Progress in Materials Science · 2026