首页 /研究 /Lightweight Keyword Spotting with Inter-Domain Interaction and Attention for Real-Time Voice-Controlled Robotics
OTHER

Lightweight Keyword Spotting with Inter-Domain Interaction and Attention for Real-Time Voice-Controlled Robotics

Hien Pham, Thuy Phuong Vu, Huong Thi Nguyen, Minhhuy Le

发表年份
2025
引用次数
1
访问权限
开放获取

摘要

This study introduces a novel lightweight Keyword Spotting (KWS) model optimized for deployment on resource-constrained microcontrollers, with potential applications in robotic control and end-effector operations. The proposed model employs inter-domain interaction to effectively extract features from both Mel-frequency cepstral coefficients (MFCCs) and temporal audio characteristics, complemented by an attention mechanism to prioritize relevant audio segments for enhanced keyword detection. Achieving a 93.70% accuracy on the Google Command v2-12 commands dataset, the model outperforms existing benchmarks. It also demonstrates remarkable efficiency in inference speed (0.359 seconds) and resource utilization (34.9KB peak RAM and 98.7KB flash memory), offering a 3x faster inference time and reduced memory footprint compared to the DS-CNN-S model. These attributes make it particularly suitable for real-time voice command applications in low-power robotic systems, enabling intuitive and responsive control of robotic arms, end-effectors, and navigation systems. In this work, however, the KWS model is demonstrated in a simple non-destructive testing system for controlling sensor movement. This research lays the groundwork for advancing voice-activated robotic technologies on resource-limited hardware platforms.

关键词

Keyword spottingSpottingComputer scienceRoboticsDomain (mathematical analysis)Artificial intelligenceSpeech recognitionTime domainHuman–computer interactionComputer vision

相关论文

查看 OTHER 分类全部论文