An audio fingerprinting system for live version identification using image processing techniques

Title: An audio fingerprinting system for live version identification using image processing techniques
Publication Type: Conference Paper
Year of Publication: 2014
Authors: Rafii, Z., Coover, B., Jinyu Han
Conference Name: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Date Published: May
Keywords: adaptive thresholding, adaptive thresholding method, Audio fingerprinting, audio fingerprinting system, audio signal processing, compact fingerprints, Constant Q Transform, cover identification, Degradation, fingerprint identification, Hamming similarity, Hough Transform, Hough transforms, image processing techniques, image segmentation, live version identification, log-frequency spectrogram, music festival, Robustness, smartphone, Spectrogram, Speech, speech processing, template matching, Time-frequency Analysis, Transforms
Abstract

Suppose that you are at a music festival checking on an artist, and you would like to quickly know about the song that is being played (e.g., title, lyrics, album). If you have a smartphone, you could record a sample of the live performance and compare it against a database of existing recordings from the artist. Services such as Shazam or SoundHound will not work here, because this is not the typical framework for audio fingerprinting or query-by-humming systems: a live performance is neither identical to its studio version (e.g., variations in instrumentation, key, tempo) nor a hummed or sung melody. We propose an audio fingerprinting system that can deal with live version identification by using image processing techniques. Compact fingerprints are derived using a log-frequency spectrogram and an adaptive thresholding method, and template matching is performed using the Hamming similarity and the Hough Transform.
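
The fingerprinting and matching steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It makes several assumptions: a synthetic random matrix stands in for the log-frequency spectrogram, the adaptive threshold is taken to be the local median over a 5x5 window, and a plain sliding-window comparison replaces the Hough Transform step, so tempo variations are not handled. The function names `adaptive_threshold`, `hamming_similarity`, and `best_offset` are hypothetical.

```python
import numpy as np

def adaptive_threshold(spec, size=5):
    """Binarize a 2-D array: 1 where a cell exceeds the median of its local window.

    Assumption: a size x size median window; the paper's exact rule may differ.
    """
    pad = size // 2
    padded = np.pad(spec, pad, mode="reflect")
    # View of shape (freq, time, size, size): one window per cell of `spec`.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    local_median = np.median(windows, axis=(-2, -1))
    return (spec > local_median).astype(np.uint8)

def hamming_similarity(a, b):
    """Fraction of matching bits between two equal-shape binary arrays."""
    return float(np.mean(a == b))

def best_offset(query_fp, ref_fp):
    """Slide the query along the reference time axis; return (offset, similarity)."""
    n = query_fp.shape[1]
    scores = [
        hamming_similarity(query_fp, ref_fp[:, t:t + n])
        for t in range(ref_fp.shape[1] - n + 1)
    ]
    best = int(np.argmax(scores))
    return best, scores[best]

# Synthetic stand-in for a log-frequency spectrogram (24 bins x 200 frames).
rng = np.random.default_rng(0)
reference = rng.random((24, 200))

# The "live" query is a noisy excerpt of the reference starting at frame 60.
query = reference[:, 60:100] + 0.05 * rng.standard_normal((24, 40))

ref_fp = adaptive_threshold(reference)
query_fp = adaptive_threshold(query)
offset, score = best_offset(query_fp, ref_fp)
print(offset, round(score, 3))
```

Because each cell is compared with its own neighborhood rather than a global level, the binary fingerprint depends on the local shape of the spectrogram, which is what makes a bitwise Hamming comparison meaningful under moderate degradation.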

DOI: 10.1109/ICASSP.2014.6853675
Citation Key: 6853675