Automatic Audio Event Recognition Schemes for Context-Aware Audio Computing Devices

Shivam Soni, Sudipta Dey, M. Sabarimalai Manikandan

Year of publication: 2019
Citations: 14

Abstract

Automatic audio event recognition (AER) plays a major role in designing and building intelligent location-aware and context-aware applications, including audio surveillance, audio indexing and content retrieval, highlight extraction, drone and robotic navigation, machine health monitoring, audio-aware voice processing services, and urban sound pollution monitoring. In this paper, we present AER schemes using Mel-frequency cepstral coefficients (MFCC) and machine classifiers, namely multi-class support vector machines (MC-SVM), fully connected feed-forward neural networks (FCFFNNs), and one-dimensional convolutional neural networks (1D-CNNs), that are capable of automatically recognizing seven sound classes: aircraft, construction, music, nature (wind and rain), speech, vehicle, and train. In this study, we created a large-scale audio database for both training and testing purposes. The performance of the three AER schemes is evaluated under different audio frame sizes (100 ms, 250 ms, and 500 ms) using a wide variety of sounds recorded with different kinds of recording devices. Results show that the FCFFNN and 1D-CNN based AER schemes achieved F1-scores of 95.72% and 96.34%, respectively, for an audio frame size of 250 ms, whereas the MC-SVM based AER scheme achieved an F1-score of 85.84%. For the same frame size of 250 ms, the 1D-CNN based AER scheme had a class-wise accuracy greater than 84%, whereas the FCFFNN based scheme had a class-wise accuracy greater than 80%. The computational analysis shows that the 1D-CNN based scheme has a shorter prediction time than the FCFFNN based AER scheme.
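The framing and scoring steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 16 kHz sample rate, the non-overlapping frames, the dropped trailing partial frame, and the use of a macro-averaged F1-score are all assumptions made here for illustration, and the helper names `split_into_frames` and `macro_f1` are hypothetical. MFCC extraction and the classifiers themselves are omitted.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], sample_rate: int,
                      frame_ms: int) -> List[List[float]]:
    """Split a signal into non-overlapping frames of frame_ms milliseconds.

    Assumption of this sketch: a trailing partial frame is dropped.
    """
    frame_len = sample_rate * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    return [list(samples[i * frame_len:(i + 1) * frame_len])
            for i in range(n_frames)]


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1-scores over classes in y_true.

    Assumption of this sketch: the abstract does not say how the F1-score
    is averaged across the seven classes; macro averaging is used here.
    """
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    one_second = [0.0] * 16000  # 1 s of silence at an assumed 16 kHz
    for frame_ms in (100, 250, 500):  # frame sizes named in the abstract
        frames = split_into_frames(one_second, 16000, frame_ms)
        print(frame_ms, "ms ->", len(frames), "frames")
```

In a full pipeline, each frame would be converted to an MFCC feature vector and passed to a classifier, and a score such as `macro_f1` would then be computed over the per-frame predictions.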

Keywords

Computer science; Mel-frequency cepstrum; Speech recognition; Convolutional neural network; Support vector machine; Audio mining; Context (archaeology); Event (particle physics); Frame (networking); Artificial intelligence
