Robust Sound Classification for Surveillance using Time Frequency Audio Features
Title | Robust Sound Classification for Surveillance using Time Frequency Audio Features |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Hassan, S. U., Khan, M. Zeeshan, Khan, M. U. Ghani, Saleem, S. |
Conference Name | 2019 International Conference on Communication Technologies (ComTech) |
Date Published | March 2019 |
Publisher | IEEE |
ISBN Number | 978-1-5386-5106-3 |
Keywords | Acoustic signal processing, ambiguous behaviour, audio events, audio signal processing, auditory classification, convolution, convolution neural network, convolutional neural nets, ESC-50 datasets, feature extraction, learning (artificial intelligence), Mel frequency cepstral coefficient, Mel Spectrogram, Metrics, MFCC, Neural networks, outlier, perception, pubcrawl, resilience, Resiliency, robust sound classification, Scalability, security, security alerts, security of data, self-generated dataset, signal based features, signal characteristics, signal classification, Sound, sound data, Spectrogram, surveillance, Time Frequency Analysis, time frequency audio features, time series, time-based features, Time-frequency Analysis, unusual activity, Videos |
Abstract | Over the years, technology has reshaped how the world perceives security concerns. To address security problems, we propose a system capable of detecting security alerts. The system treats audio events that occur as outliers against the background as indicators of unusual activity; such ambiguous behaviour can be handled through auditory classification. In this paper, we discuss two techniques for extracting features from sound data: time-based features and signal-based features. The first technique preserves the time-series nature of sound, while the second focuses on signal characteristics. A convolutional neural network is applied to categorize the sounds. Since the main aim of this research is to address security challenges, we generated surveillance-related data in addition to using available datasets such as UrbanSound8K and ESC-50. The proposed methodology achieves 94.6% accuracy on the self-generated dataset. The improved accuracy on this locally prepared dataset demonstrates the novelty of the research. |
URL | https://ieeexplore.ieee.org/document/8737801 |
DOI | 10.1109/COMTECH.2019.8737801 |
Citation Key | hassan_robust_2019 |
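The signal-based features the abstract names (Mel spectrogram and MFCC) can be sketched with NumPy alone. This is an illustrative pipeline, not the paper's actual implementation: the frame size, hop length, filter count, and the 440 Hz test tone below are assumed parameters chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def log_mel_spectrogram(y, sr, n_fft=512, hop=256, n_mels=40):
    # Short-time Fourier transform: windowed frames -> power spectra.
    window = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(y[s:s + n_fft] * window)) ** 2
              for s in range(0, len(y) - n_fft + 1, hop)]
    power = np.array(frames).T                     # (n_fft//2 + 1, n_frames)
    mel = mel_filterbank(n_mels, n_fft, sr) @ power
    return np.log(mel + 1e-10)

def mfcc(log_mel, n_mfcc=13):
    # DCT-II over the mel axis, keeping the lowest n_mfcc coefficients.
    n_mels = log_mel.shape[0]
    n = np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * n + 1) / (2 * n_mels)))
    return basis @ log_mel

sr = 16000
t = np.linspace(0.0, 1.0, sr, endpoint=False)
y = np.sin(2 * np.pi * 440.0 * t)                  # 1 s synthetic 440 Hz tone
S = log_mel_spectrogram(y, sr)                     # (40 mel bands, 61 frames)
M = mfcc(S)                                        # (13 coefficients, 61 frames)
```

Either representation yields a 2-D time-frequency image that can be fed to a CNN for classification, which matches the two-branch feature strategy the abstract outlines.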