An audio fingerprinting system for live version identification using image processing techniques
Title | An audio fingerprinting system for live version identification using image processing techniques |
Publication Type | Conference Paper |
Year of Publication | 2014 |
Authors | Rafii, Z., Coover, B., Jinyu Han |
Conference Name | Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on |
Date Published | May |
Keywords | adaptive thresholding, adaptive thresholding method, Audio fingerprinting, audio fingerprinting system, audio signal processing, compact fingerprints, Constant Q Transform, cover identification, Degradation, fingerprint identification, Hamming similarity, Hough Transform, Hough transforms, image processing techniques, image segmentation, live version identification, log-frequency spectrogram, music festival, Robustness, smartphone, Spectrogram, Speech, speech processing, template matching, Time-frequency Analysis, Transforms |
Abstract | Suppose that you are at a music festival checking on an artist, and you would like to quickly know about the song that is being played (e.g., title, lyrics, album). If you have a smartphone, you could record a sample of the live performance and compare it against a database of existing recordings from the artist. Services such as Shazam or SoundHound will not work here, because this is not the typical framework for audio fingerprinting or query-by-humming systems: a live performance is neither identical to its studio version (e.g., variations in instrumentation, key, tempo) nor is it a hummed or sung melody. We propose an audio fingerprinting system that can deal with live version identification by using image processing techniques. Compact fingerprints are derived using a log-frequency spectrogram and an adaptive thresholding method, and template matching is performed using the Hamming similarity and the Hough Transform. |
DOI | 10.1109/ICASSP.2014.6853675 |
Citation Key | 6853675 |
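
The abstract names two steps that lend themselves to a small illustration: binarizing a spectrogram by adaptive thresholding, and comparing the resulting binary fingerprints with the Hamming similarity. The following is a minimal sketch, not the authors' implementation. The function names, the local-median rule, and the 3x3 window are assumptions made for illustration, since the abstract does not give the paper's parameters. The input is assumed to be a log-frequency spectrogram (for example, a Constant Q Transform magnitude) already computed elsewhere, and the Hough Transform alignment step is not covered.

```python
from statistics import median


def adaptive_threshold(spec, half_height=1, half_width=1):
    """Binarize a 2D list of magnitudes (hypothetical helper).

    A cell becomes 1 if it exceeds the median of its local window,
    else 0. Windows are clipped at the borders of the input.
    """
    rows, cols = len(spec), len(spec[0])
    bits = []
    for r in range(rows):
        row_bits = []
        for c in range(cols):
            window = [
                spec[i][j]
                for i in range(max(0, r - half_height), min(rows, r + half_height + 1))
                for j in range(max(0, c - half_width), min(cols, c + half_width + 1))
            ]
            row_bits.append(1 if spec[r][c] > median(window) else 0)
        bits.append(row_bits)
    return bits


def hamming_similarity(a, b):
    """Fraction of positions at which two same-shaped binary grids agree."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("fingerprints must have the same shape")
    total = sum(len(row) for row in a)
    agree = sum(x == y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return agree / total


# Toy data (made up): a "studio" patch and a louder "live" copy of it.
studio = [
    [0.1, 0.9, 0.2, 0.1],
    [0.8, 0.2, 0.7, 0.1],
    [0.1, 0.1, 0.9, 0.3],
]
live = [[2.0 * value for value in row] for row in studio]

studio_fp = adaptive_threshold(studio)
live_fp = adaptive_threshold(live)
similarity = hamming_similarity(studio_fp, live_fp)
```

Because each cell is compared against its own neighborhood rather than a global level, a uniform gain change leaves the fingerprint unchanged in this toy case, which hints at why a locally adaptive threshold suits recordings made at very different volumes.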