Biblio

Filters: Keyword is MFCC
2021-07-08
Hou, Dai, Han, Hao, Novak, Ed.  2020.  TAES: Two-factor Authentication with End-to-End Security against VoIP Phishing. 2020 IEEE/ACM Symposium on Edge Computing (SEC). :340–345.
Given the current state of communication technology, abuse of VoIP has fueled the rise of telecommunications fraud, and an end-to-end mechanism for authenticating a caller's identity is urgently needed. This paper proposes an end-to-end, two-factor identity authentication mechanism to address telecommunications fraud. Our first technique uses the Hermes data-over-voice algorithm to transmit a certificate across an unknown voice channel, thereby authenticating the caller's phone number. Our second technique uses voice-print recognition, based on a Gaussian mixture model with a universal background model, to build a model of the speaker and verify the caller's voice, confirming the speaker's identity. Our solution is implemented on the Android platform, and we evaluate both transmission efficiency and speaker recognition. Experiments on Android phones show that the error rate of transmitting the signature certificate over the voice channel stays within 3.247%, so the certificate-signature verification mechanism is feasible. The accuracy of the voice-print recognition is 72%, making it effective as a supporting reference for identity authentication.
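To make the voice-print step concrete, here is a minimal sketch of GMM-based speaker verification over MFCC features, assuming librosa and scikit-learn; the file names, mixture size, and threshold are illustrative assumptions, not the authors' TAES implementation.

```python
# Minimal sketch of GMM-based voice-print verification (not the authors'
# TAES implementation): enroll a speaker from MFCC features and accept or
# reject a test utterance by comparing log-likelihoods against a background
# model. File paths and the threshold below are illustrative assumptions.
import librosa
import numpy as np
from sklearn.mixture import GaussianMixture

def mfcc_features(path, sr=16000, n_mfcc=20):
    """Load audio and return frame-level MFCCs, shape (frames, n_mfcc)."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

# Background model trained on many speakers; speaker model on enrollment audio.
ubm = GaussianMixture(n_components=64, covariance_type="diag", random_state=0)
ubm.fit(np.vstack([mfcc_features(p) for p in ["bg1.wav", "bg2.wav"]]))

speaker = GaussianMixture(n_components=64, covariance_type="diag", random_state=0)
speaker.fit(mfcc_features("enroll_alice.wav"))

# Verify: accept if the speaker model explains the test audio better than
# the background model by some margin (the threshold is a tunable guess).
test = mfcc_features("incoming_call.wav")
llr = speaker.score(test) - ubm.score(test)   # mean log-likelihood ratio
print("accept" if llr > 0.5 else "reject")
```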
2020-12-11
Hassan, S. U., Khan, M. Zeeshan, Khan, M. U. Ghani, Saleem, S..  2019.  Robust Sound Classification for Surveillance using Time Frequency Audio Features. 2019 International Conference on Communication Technologies (ComTech). :13–18.
Over the years, technology has reshaped how the world perceives security concerns. To tackle security problems, we propose a system capable of detecting security alerts: audio events that occur as outliers against a background of usual activity. Such ambiguous behaviour can be handled through auditory classification. In this paper, we discuss two techniques for extracting features from sound data: time-based and signal-based features. The first preserves the time-series nature of the sound, while the second focuses on signal characteristics. A convolutional neural network is applied for sound categorization. Because the major aim of this research is security, we generated surveillance-related data in addition to using available datasets such as UrbanSound8K and ESC-50. We achieved 94.6% accuracy with the proposed methodology on the self-generated dataset. The improved accuracy on this locally prepared dataset demonstrates the novelty of the research.
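As an illustration of the classification step, the following is a hedged sketch of a small convolutional network over MFCC spectrograms in PyTorch; the layer sizes, input shape, and class count are assumptions, not the paper's architecture.

```python
# Illustrative sketch (assumed layout, not the paper's architecture):
# a small CNN that classifies MFCC "images" into sound classes such as
# the ten categories of UrbanSound8K.
import torch
import torch.nn as nn

class SoundCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),  # fixed size regardless of clip length
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):                  # x: (batch, 1, n_mfcc, frames)
        return self.classifier(self.features(x).flatten(1))

# One training step on a dummy batch of 40-coefficient MFCC spectrograms.
model, loss_fn = SoundCNN(), nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(8, 1, 40, 173), torch.randint(0, 10, (8,))
opt.zero_grad(); loss_fn(model(x), y).backward(); opt.step()
```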
2020-08-03
Liu, Fuxiang, Jiang, Qi.  2019.  Research on Recognition of Criminal Suspects Based on Foot Sounds. 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC). :1347–1351.
This paper makes two main contributions. First, by analyzing frequency-domain and Mel-domain features, we can distinguish footstep events from non-footstep events. Second, we compared the frequency-domain footstep sound signals of the same person under different experimental conditions, finding that almost all of the peak and trough frequencies in the main frequency band correspond one-to-one. For two different people, however, even under the same experimental conditions, the peak and trough frequencies in the main frequency band of their footstep sound signals rarely coincide. This property of footstep sound signals can therefore be used to identify different people.
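The peak/trough comparison the authors describe can be sketched as follows; the band limits and prominence threshold are illustrative guesses, not values from the paper.

```python
# Hedged sketch of the peak/trough idea from the abstract: take the
# magnitude spectrum of a footstep recording, restrict it to a "main"
# frequency band, and record peak and trough frequencies as a signature.
# The band limits and prominence below are illustrative assumptions.
import numpy as np
from scipy.signal import find_peaks

def footstep_signature(samples, sr, band=(20.0, 500.0), prominence=0.05):
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sr)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    mag = spectrum[mask] / spectrum[mask].max()
    peaks, _ = find_peaks(mag, prominence=prominence)
    troughs, _ = find_peaks(-mag, prominence=prominence)
    return freqs[mask][peaks], freqs[mask][troughs]

# Two recordings of the same person should yield near one-to-one matching
# peak/trough frequencies; recordings of different people generally should not.
sr = 8000
t = np.arange(sr) / sr
step = np.sin(2 * np.pi * 60 * t) + 0.5 * np.sin(2 * np.pi * 180 * t)
print(footstep_signature(step, sr))
```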
2019-01-16
Kreuk, F., Adi, Y., Cisse, M., Keshet, J..  2018.  Fooling End-To-End Speaker Verification With Adversarial Examples. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :1962–1966.
Automatic speaker verification systems are increasingly used as the primary means to authenticate customers. Recently, it has been proposed to train speaker verification systems using end-to-end deep neural models. In this paper, we show that such systems are vulnerable to adversarial example attacks. Adversarial examples are generated by adding carefully crafted noise to original speaker examples so that the two are almost indistinguishable to a human listener; yet the generated waveforms, which sound like speaker A, can fool such a system into treating them as if they were uttered by speaker B. We present white-box attacks on a deep end-to-end network trained on either YOHO or NTIMIT. We also present two black-box attacks. In the first, we generate adversarial examples with a system trained on NTIMIT and attack a system trained on YOHO. In the second, we generate the adversarial examples with a system trained using Mel-spectrum features and attack a system trained using MFCCs. Our results show that one can significantly decrease the accuracy of a target system even when the adversarial examples are generated with a different system, potentially using different features.
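The attack idea can be illustrated with a minimal FGSM-style perturbation in PyTorch; the verifier below is a hypothetical stand-in for an end-to-end speaker-verification network, not the model from the paper.

```python
# Minimal FGSM-style sketch of a gradient-based adversarial attack on a
# speaker verifier. The network here is a hypothetical stand-in; only the
# attack pattern (perturb the waveform along the loss gradient) is the point.
import torch
import torch.nn as nn

# Stand-in verifier: maps a raw waveform to a same-speaker score in (0, 1).
model = nn.Sequential(nn.Conv1d(1, 8, 64, stride=16), nn.ReLU(),
                      nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                      nn.Linear(8, 1), nn.Sigmoid())

def fgsm_attack(waveform, target_label, epsilon=1e-3):
    """Perturb `waveform` so the verifier's score moves toward `target_label`."""
    x = waveform.clone().requires_grad_(True)
    loss = nn.functional.binary_cross_entropy(model(x), target_label)
    loss.backward()
    # Step against the gradient: push the score toward the target label
    # while keeping the perturbation small and nearly inaudible.
    return (x - epsilon * x.grad.sign()).detach()

# Waveform of speaker A, pushed toward being accepted as speaker B.
wave = torch.randn(1, 1, 16000)     # 1 s at 16 kHz (dummy audio)
target = torch.ones(1, 1)           # "accepted" label
adversarial = fgsm_attack(wave, target)
print(model(wave).item(), model(adversarial).item())
```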