
Automatic Audio Event Recognition Schemes for Context-Aware Audio Computing Devices

Shivam Soni, Sudipta Dey, M. Sabarimalai Manikandan

Publication year
2019
Citations
14

Abstract

Automatic audio event recognition (AER) plays a major role in designing and building intelligent location-aware and context-aware applications, including audio surveillance, audio indexing and content retrieval, highlight extraction, drone and robotic navigation, machine health monitoring, audio-aware voice processing services, and urban sound pollution monitoring. In this paper, we present AER schemes that use Mel-frequency cepstral coefficients (MFCC) and machine classifiers such as multi-class support vector machines (MC-SVM), fully connected feed-forward neural networks (FCFFNNs), and one-dimensional convolutional neural networks (1D-CNNs). The schemes automatically recognize seven sound classes: aircraft, construction, music, nature (wind and rain), speech, vehicle, and train. For this study, we created a large-scale audio database for both training and testing. The performance of the three AER schemes is evaluated under different audio frame sizes (100 ms, 250 ms, and 500 ms) using a wide variety of sounds recorded with different kinds of recording devices. Results show that, for an audio frame size of 250 ms, the FCFFNN and 1D-CNN based AER schemes achieved F1-scores of 95.72% and 96.34%, respectively, whereas the MC-SVM based AER scheme achieved an F1-score of 85.84%. At the same frame size, the 1D-CNN based scheme had a class-wise accuracy greater than 84%, whereas the FCFFNN based scheme had a class-wise accuracy greater than 80%. The computational analysis shows that the 1D-CNN based scheme has a shorter prediction time than the FCFFNN based AER scheme.
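To make the frame sizes concrete, the following is a minimal sketch of how a recording might be cut into fixed-length frames before feature extraction. It is not taken from the paper: the 16 kHz sampling rate, the non-overlapping framing, the dropping of a trailing partial frame, and the helper name `split_into_frames` are all assumptions chosen for illustration.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms):
    """Split a sequence of samples into non-overlapping frames of frame_ms milliseconds.

    Hypothetical helper: trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# One second of placeholder (silent) audio at an assumed 16 kHz sampling rate.
SAMPLE_RATE_HZ = 16_000
signal = [0.0] * SAMPLE_RATE_HZ

# The three frame sizes evaluated in the abstract.
for frame_ms in (100, 250, 500):
    frames = split_into_frames(signal, SAMPLE_RATE_HZ, frame_ms)
    print(frame_ms, "ms:", len(frames), "frames of", len(frames[0]), "samples")
```

Under these assumptions, each frame would then be passed to an MFCC extractor and the resulting feature vector to a classifier. Shorter frames give more decisions per second but less audio context per decision.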

Keywords

Computer science, Mel-frequency cepstrum, Speech recognition, Convolutional neural network, Support vector machine, Audio mining, Context (archaeology), Event (particle physics), Frame (networking), Artificial intelligence
