Robust Sound Classification for Surveillance using Time Frequency Audio Features

Title: Robust Sound Classification for Surveillance using Time Frequency Audio Features
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Hassan, S. U., Khan, M. Zeeshan, Khan, M. U. Ghani, Saleem, S.
Conference Name: 2019 International Conference on Communication Technologies (ComTech)
Date Published: March 2019
Publisher: IEEE
ISBN Number: 978-1-5386-5106-3
Keywords: Acoustic signal processing, ambiguous behaviour, audio events, audio signal processing, auditory classification, convolution, convolution neural network, convolutional neural nets, ESC-50 datasets, feature extraction, learning (artificial intelligence), Mel frequency cepstral coefficient, Mel Spectrogram, Metrics, MFCC, Neural networks, outlier, perception, pubcrawl, resilience, Resiliency, robust sound classification, Scalability, security, security alerts, security of data, self-generated dataset, signal based features, signal characteristics, signal classification, Sound, sound data, Spectrogram, surveillance, Time Frequency Analysis, time frequency audio features, time series, time-based features, Time-frequency Analysis, unusual activity, Videos
Abstract

Over the years, technology has reshaped how the world perceives security concerns. To tackle security problems, we propose a system capable of detecting security alerts. The system covers audio events that occur as outliers against a background of unusual activity. This ambiguous behaviour can be handled by auditory classification. In this paper, we discuss two techniques for extracting features from sound data: time-based and signal-based features. The first technique preserves the time-series nature of sound, while the second focuses on signal characteristics. A convolutional neural network is applied to categorize the sounds. Because the main aim of the research is security challenges, we generated surveillance-related data in addition to using available datasets such as UrbanSound8K and ESC-50. The proposed methodology achieves 94.6% accuracy on the self-generated dataset. The improved accuracy on the locally prepared dataset demonstrates the novelty of the research.
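
The abstract does not say how the features are computed, but the keywords list Mel Spectrogram and MFCC. As an illustration only, the following NumPy sketch shows one common way to derive both from a waveform. It is not the authors' implementation: the function names, frame size, hop length, number of mel bands, and the synthetic test tone are all assumptions chosen for the example, and the abstract does not state which feature corresponds to which of the two techniques.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) mel-scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters whose centres are evenly spaced on the mel scale.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    # Slice the signal into overlapping frames; the frame axis keeps time order.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    # Small offset avoids log(0); result has shape (n_mels, n_frames).
    return np.log(mel + 1e-10).T

def mfcc(log_mel, n_mfcc=13):
    # Unnormalised DCT-II along the mel axis; libraries often apply
    # an additional orthonormal scaling that is omitted here.
    n_mels = log_mel.shape[0]
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return basis @ log_mel

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

log_mel = log_mel_spectrogram(tone, sr)   # 2-D "image" a CNN could take as input
coeffs = mfcc(log_mel)                    # compact cepstral summary per frame
print(log_mel.shape, coeffs.shape)
```

Either array can be treated as a single-channel image and passed to a convolutional classifier. The network architecture itself is not described in the abstract, so it is left out of this sketch.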

URL: https://ieeexplore.ieee.org/document/8737801
DOI: 10.1109/COMTECH.2019.8737801
Citation Key: hassan_robust_2019