Robust Sound Classification for Surveillance using Time Frequency Audio Features

Submitted by grigby1 on Fri, 12/11/2020 - 2:43pm

Title	Robust Sound Classification for Surveillance using Time Frequency Audio Features
Publication Type	Conference Paper
Year of Publication	2019
Authors	Hassan, S. U., Khan, M. Zeeshan, Khan, M. U. Ghani, Saleem, S.
Conference Name	2019 International Conference on Communication Technologies (ComTech)
Date Published	March 2019
Publisher	IEEE
ISBN Number	978-1-5386-5106-3
Keywords	Acoustic signal processing, ambiguous behaviour, audio events, audio signal processing, auditory classification, convolution, convolution neural network, convolutional neural nets, ESC-50 datasets, feature extraction, learning (artificial intelligence), Mel frequency cepstral coefficient, Mel Spectrogram, Metrics, MFCC, Neural networks, outlier, perception, pubcrawl, resilience, Resiliency, robust sound classification, Scalability, security, security alerts, security of data, self-generated dataset, signal based features, signal characteristics, signal classification, Sound, sound data, Spectrogram, surveillance, Time Frequency Analysis, time frequency audio features, time series, time-based features, Time-frequency Analysis, unusual activity, Videos
Abstract	Over the years, technology has reshaped the world's perception of security concerns. To tackle security problems, we propose a system capable of detecting security alerts. The system covers audio events that occur as outliers against a background of unusual activity. This ambiguous behaviour can be handled by auditory classification. In this paper, we discuss two techniques for extracting features from sound data: time-based and signal-based features. The first technique preserves the time-series nature of sound, while the second focuses on signal characteristics. A convolutional neural network is applied to categorize the sounds. Because the major aim of the research is security challenges, we generated surveillance-related data in addition to using available datasets such as UrbanSound 8k and ESC-50. We achieved 94.6% accuracy for the proposed methodology on the self-generated dataset. The improved accuracy on the locally prepared dataset demonstrates the novelty of the research.
URL	https://ieeexplore.ieee.org/document/8737801
DOI	10.1109/COMTECH.2019.8737801
Citation Key	hassan_robust_2019

Groups:

Science of Security VO