Enhancing Sampling and Counting Method for Audio Retrieval with Time-Stretch Resistance

Title: Enhancing Sampling and Counting Method for Audio Retrieval with Time-Stretch Resistance
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Yao, S., Niu, B., Liu, J.
Conference Name: 2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM)
Date Published: Sept. 2018
Publisher: IEEE
ISBN Number: 978-1-5386-5321-0
Keywords: Acoustic Fingerprints, Acoustic signal processing, audio fingerprint, audio retrieval, audio signal processing, audio track, composability, distortion, Filtering, fingerprint matching, Fingerprint recognition, Human Behavior, ideal audio retrieval method, Indexes, information retrieval, LSH, massive audio dataset, noise distortions, Philips-like fingerprints, pubcrawl, Resiliency, Resistance, Resists, retrieval performance, Sampling and Counting Method, state-of-the-art audio retrieval method, state-of-the-art methods, time-stretch, time-stretch resistance, Turning, turning point alignment method
Abstract

An ideal audio retrieval method should be not only highly efficient in identifying an audio track from a massive audio dataset, but also robust to any distortion. Unfortunately, no existing audio retrieval method is robust to all types of distortion. An audio retrieval method depends on both the audio fingerprint and the strategy, and especially on how they are combined. We argue that the Sampling and Counting Method (SC), a state-of-the-art audio retrieval method, would be a promising step towards an ideal audio retrieval method if it could be made robust to time-stretch and pitch-stretch. Towards this objective, this paper proposes a turning point alignment method that enhances SC with resistance to time-stretch, making Philips and Philips-like fingerprints resistant to time-stretch. Experimental results show that our approach can resist time-stretch from 70% to 130%, which is on a par with the state-of-the-art methods. It also marginally improves retrieval performance under various noise distortions.

URL: https://ieeexplore.ieee.org/document/8499068
DOI: 10.1109/BigMM.2018.8499068
Citation Key: yao_enhancing_2018