Biblio

Filters: Keyword is STFT
2022-01-25
Saleem, Summra, Dilawari, Aniqa, Khan, Usman Ghani.  2021.  Spoofed Voice Detection using Dense Features of STFT and MDCT Spectrograms. 2021 International Conference on Artificial Intelligence (ICAI). :56–61.
Attesting audio signals to recognize voice forgery is a challenging task. In this research work, a deep convolutional neural network (CNN) is used to detect audio manipulations, namely pitch-shifted and amplitude-varied signals. Short-time Fourier transform (STFT) and modified discrete cosine transform (MDCT) features are chosen for audio processing, and their plotted patterns are fed to the CNN. Experimental results on the TIMIT dataset show that the model can successfully distinguish tampered signals, facilitating audio authentication. The proposed CNN architecture distinguishes pitch-shifted spoofed voices with an accuracy of 97.55% and amplitude-varied spoofed voices with an accuracy of 98.85%.
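As an illustration of the STFT features mentioned in this abstract, the following is a minimal sketch of a magnitude spectrogram computed with NumPy only. The frame length, hop size, Hann window, 16 kHz sampling rate, and the test tone are assumptions chosen for the example; they are not taken from the paper, and the function name `stft_magnitude` is hypothetical.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Return a magnitude spectrogram: rows are frequency bins, columns are frames.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real FFT per frame; transpose so frequency runs along axis 0.
    return np.abs(np.fft.rfft(frames, axis=1)).T

fs = 16000                            # assumed sampling rate in Hz
t = np.arange(fs) / fs                # one second of samples
tone = np.sin(2 * np.pi * 1000 * t)   # synthetic 1 kHz test tone

spec = stft_magnitude(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * fs / 256         # bin spacing is fs / frame_len
```

A 2-D array like `spec` (often log-scaled first) is the kind of plotted pattern that could be passed to an image-style CNN; how the paper scales or renders its spectrograms is not stated in the abstract.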
2015-05-01
Guang Hua, Goh, J., Thing, V.L.L.  2014.  A Dynamic Matching Algorithm for Audio Timestamp Identification Using the ENF Criterion. IEEE Transactions on Information Forensics and Security. 9:1045–1055.

The electric network frequency (ENF) criterion is a recently developed technique for audio timestamp identification, which involves matching the extracted ENF signal against reference data. For nearly a decade, the conventional matching criterion has been based on the minimum mean squared error (MMSE) or the maximum correlation coefficient. However, the corresponding performance is highly limited by low signal-to-noise ratio, short recording durations, frequency resolution problems, and so on. This paper presents a threshold-based dynamic matching algorithm (DMA), which is capable of autocorrecting the noise-affected frequency estimates. The threshold is chosen according to the frequency resolution determined by the short-time Fourier transform (STFT) window size. A penalty coefficient is introduced to monitor the autocorrection process and finally determine the estimated timestamp. It is then shown that the DMA generalizes the conventional MMSE method. By considering the mainlobe width in the STFT caused by limited frequency resolution, the DMA achieves improved identification accuracy and robustness against higher levels of noise and the offset problem. Synthetic performance analysis and practical experimental results are provided to illustrate the advantages of the DMA.
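The conventional MMSE matching that this abstract takes as its baseline can be sketched as a sliding comparison of the extracted ENF sequence against the reference. This is not the DMA: the paper's threshold rule and penalty coefficient are not given in the abstract and are not reproduced here. The function names, the synthetic random-walk reference, and all numeric values below are assumptions made for illustration.

```python
import numpy as np

def stft_frequency_resolution(fs, window_len):
    """Spacing between STFT frequency bins in Hz for a window of window_len samples."""
    return fs / window_len

def mmse_match(extracted, reference):
    """Return (offset, errors): the offset into reference with the lowest mean squared error."""
    extracted = np.asarray(extracted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = len(extracted)
    # Mean squared error at every candidate alignment of the short sequence.
    errors = np.array([
        np.mean((reference[k:k + n] - extracted) ** 2)
        for k in range(len(reference) - n + 1)
    ])
    return int(np.argmin(errors)), errors

rng = np.random.default_rng(0)
# Synthetic reference ENF: a slow random walk around a nominal 50 Hz (assumed).
reference = 50 + np.cumsum(0.002 * rng.standard_normal(2000))
true_offset = 700
# "Extracted" ENF: a segment of the reference plus small estimation noise.
extracted = reference[true_offset:true_offset + 120] + 0.0005 * rng.standard_normal(120)

offset, errors = mmse_match(extracted, reference)
resolution_hz = stft_frequency_resolution(fs=1000, window_len=4000)
```

With a longer STFT window the bin spacing returned by `stft_frequency_resolution` shrinks, which is the quantity the abstract says the DMA threshold is tied to.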