Bibliography
Most anti-collusion audio fingerprinting schemes aim to identify colluders from illegally redistributed audio copies; however, the loss caused by the redistributed versions is unavoidable. In this letter, a novel fingerprinting scheme is proposed to remove the incentive for collusion attacks. The audio signal is transformed to the frequency domain with the Fourier transform, and the frequency-domain coefficients are reversed to different degrees according to the fingerprint sequence. Unlike other fingerprinting schemes, the proposed method deliberately over-modifies the coefficients of the host media so that the quality of any colluded version degrades significantly, while the imperceptibility of each marked copy is well preserved. Experiments show that the colluded audio cannot be reused because of its poor quality. In addition, the proposed method also resists other common attacks. The copyright risks and losses caused by illegal redistribution are thus effectively avoided, which is significant for protecting the copyright of audio.
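To make the embedding idea concrete, the following is a minimal sketch of the kind of operation the abstract describes: per frame, a band of frequency-domain coefficients is "reversed" to a degree chosen by the corresponding fingerprint bit. The frame length, band limits, and reversal factors are illustrative assumptions, not values from the paper.

```python
import numpy as np

def embed_fingerprint(signal, bits, frame_len=2048, band=(200, 800),
                      degrees=(-0.5, -1.0)):
    """Toy embedding: for each frame, reverse a band of FFT coefficients
    to a degree selected by the corresponding fingerprint bit (0 or 1).
    All parameter values here are assumptions for illustration only."""
    out = signal.astype(np.float64).copy()
    lo, hi = band
    n_frames = min(len(bits), len(out) // frame_len)
    for i in range(n_frames):
        frame = out[i * frame_len:(i + 1) * frame_len]
        spectrum = np.fft.rfft(frame)
        spectrum[lo:hi] *= degrees[bits[i]]          # bit-dependent reversal
        out[i * frame_len:(i + 1) * frame_len] = np.fft.irfft(spectrum, n=frame_len)
    return out
```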
Voice user interfaces offer intuitive interaction with our devices, but usability and audio quality could be further improved if multiple devices collaborated to provide a distributed voice user interface. To ensure that users' voices are not shared with unauthorized devices, however, an access management system is needed that adapts to users' needs. Prior work has demonstrated that combining audio fingerprinting with fuzzy cryptography yields robust pairing of devices without sharing the information they record. However, the robustness of these systems rests partly on the long recordings required to obtain the fingerprint. This paper analyzes methods for robustly generating acoustic fingerprints within short time windows, enabling responsive pairing of devices as the acoustic scene changes; the methods can also be integrated into other typical speech processing tools.
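As a rough illustration of the pairing primitive the abstract refers to (not the paper's actual protocol), the sketch below uses a fuzzy-commitment-style construction with a simple repetition code: one device hides a random key behind its binary audio fingerprint, and a second device recovers the key only if its own fingerprint of the same acoustic scene is sufficiently similar. The function names and the repetition code are illustrative assumptions.

```python
import numpy as np

def fuzzy_commit(fingerprint, key_bits, rep=5):
    """Device A: hide key_bits behind its binary fingerprint. Each key bit
    is repeated `rep` times as a toy error-correcting code (assumption)."""
    codeword = np.repeat(key_bits, rep)
    return np.bitwise_xor(fingerprint[:codeword.size], codeword)

def fuzzy_open(commitment, fingerprint, rep=5):
    """Device B: recover the key; bit differences between the two devices'
    fingerprints are corrected by majority vote within each repetition."""
    noisy = np.bitwise_xor(fingerprint[:commitment.size], commitment)
    return (noisy.reshape(-1, rep).sum(axis=1) > rep // 2).astype(np.uint8)
```

In practice, schemes of this kind replace the repetition code with a stronger error-correcting code and verify the recovered key (e.g., via a hash) before the devices are paired.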
An ideal audio retrieval method should not only be highly efficient at identifying an audio track in a massive audio dataset, but also robust to any distortion. Unfortunately, no existing audio retrieval method is robust to all types of distortion. The performance of an audio retrieval method depends on both the audio fingerprint and the retrieval strategy, and especially on how the two are combined. We argue that the Sampling and Counting Method (SC), a state-of-the-art audio retrieval method, would be a promising step toward an ideal method if it could be made robust to time-stretching and pitch-shifting. Toward this objective, this paper proposes a turning point alignment method that enhances SC with resistance to time-stretching, making Philips and Philips-like fingerprints robust to time-stretching. Experimental results show that our approach withstands time-stretching from 70% to 130%, which is on a par with state-of-the-art methods, and it also marginally improves retrieval performance under various noise distortions.
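The paper's turning point alignment method itself is not reproduced here; as a point of reference, the sketch below shows a naive brute-force baseline that compensates for time-stretch by resampling the query's binary sub-fingerprint sequence at several candidate stretch factors before computing a bit error rate. The array shapes and the candidate factor grid (spanning the 70% to 130% range mentioned above) are assumptions.

```python
import numpy as np

def rescale_frames(sub_fps, factor):
    """Resample a (frames x bits) binary fingerprint matrix to a candidate
    time-stretch factor by nearest-neighbour frame selection."""
    n = sub_fps.shape[0]
    idx = np.clip(np.round(np.arange(int(n * factor)) / factor).astype(int), 0, n - 1)
    return sub_fps[idx]

def best_stretch(query, reference, factors=np.arange(0.70, 1.31, 0.05)):
    """Brute-force baseline (not the paper's method): try each candidate
    factor and keep the one with the lowest bit error rate."""
    best_factor, best_ber = None, 1.0
    for f in factors:
        q = rescale_frames(query, f)
        m = min(len(q), len(reference))
        ber = float(np.mean(q[:m] != reference[:m]))
        if ber < best_ber:
            best_factor, best_ber = f, ber
    return best_factor, best_ber
```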
The Philips audio fingerprint [1] has been used for years, but its robustness against external noise has not been studied thoroughly. This paper shows that the Philips fingerprint is noise resistant and is capable of recognizing music corrupted by noise at signal-to-noise ratios of -4 to -7 dB. In addition, the drawbacks of the Philips fingerprint are addressed by using a "Power Mask" in conjunction with the fingerprint during matching. The Power Mask is a weight matrix over the fingerprint bits that penalizes mismatched bits according to their relevance in the fingerprint. The effectiveness of the proposed fingerprint was evaluated on a database of 1030 songs and 1184 query files heavily corrupted by two types of noise at varying levels. Our experiments show that the proposed method significantly improves the noise resistance of the standard Philips fingerprint.
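Weighted matching of this kind can be expressed as a weighted bit error rate; the sketch below shows only that general form, while the paper's actual derivation of the weights from signal power is not reproduced. The function name and normalisation are assumptions.

```python
import numpy as np

def weighted_ber(query_bits, ref_bits, power_mask):
    """Weighted bit error rate: a mismatched bit costs its weight in the
    mask, so high-relevance bits are penalised more than low-relevance ones.
    Normalising by the total weight keeps the score in [0, 1]."""
    mismatches = (query_bits != ref_bits).astype(np.float64)
    return float(np.sum(mismatches * power_mask) / np.sum(power_mask))
```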