Biblio
``Style transfer'' among images has recently emerged as a very active research topic, fuelled by the power of convolution neural networks (CNNs), and has become fast a very popular technology in social media. This paper investigates the analogous problem in the audio domain: How to transfer the style of a reference audio signal to a target audio content? We propose a flexible framework for the task, which uses a sound texture model to extract statistics characterizing the reference audio style, followed by an optimization-based audio texture synthesis to modify the target content. In contrast to mainstream optimization-based visual transfer method, the proposed process is initialized by the target content instead of random noise and the optimized loss is only about texture, not structure. These differences proved key for audio style transfer in our experiments. In order to extract features of interest, we investigate different architectures, whether pre-trained on other tasks, as done in image style transfer, or engineered based on the human auditory system. Experimental results on different types of audio signal confirm the potential of the proposed approach.
Biometrics is attracting increasing attention in privacy and security concerned issues, such as access control and remote financial transaction. However, advanced forgery and spoofing techniques are threatening the reliability of conventional biometric modalities. This has been motivating our investigation of a novel yet promising modality transient evoked otoacoustic emission (TEOAE), which is an acoustic response generated from cochlea after a click stimulus. Unlike conventional modalities that are easily accessible or captured, TEOAE is naturally immune to replay and falsification attacks as a physiological outcome from human auditory system. In this paper, we resort to wavelet analysis to derive the time-frequency representation of such nonstationary signal, which reveals individual uniqueness and long-term reproducibility. A machine learning technique linear discriminant analysis is subsequently utilized to reduce intrasubject variability and further capture intersubject differentiation features. Considering practical application, we also introduce a complete framework of the biometric system in both verification and identification modes. Comparative experiments on a TEOAE data set of biometric setting show the merits of the proposed method. Performance is further improved with fusion of information from both ears.