Biblio
The electric network frequency (ENF) signal can be captured in multimedia recordings due to electromagnetic influences from the power grid at the time of recording. Recent work has exploited ENF signals for forensic applications, such as authenticating ENF-containing multimedia signals, detecting forgery in them, and inferring their time and location of creation. In this paper, we explore a new use of ENF signals: automatic synchronization of audio and video. As a time-varying random process, the ENF signal can serve as a timing fingerprint of multimedia signals, so audio and video recordings can be synchronized by aligning their embedded ENF signals. We demonstrate the proposed scheme with two applications: multi-view video synchronization and synchronization of historical audio recordings. The experimental results show that the ENF-based synchronization approach is effective and has the potential to solve problems that are intractable for existing methods.
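The alignment idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the ENF sequences have already been extracted at a common frame rate, and it picks the lag that maximizes the correlation coefficient over the overlapping region. The function name `estimate_lag`, the parameter `max_lag`, and the synthetic random-walk ENF are all hypothetical and chosen for illustration only.

```python
import numpy as np


def estimate_lag(enf_a, enf_b, max_lag):
    """Return (lag, corr): the lag in frames at which enf_b best matches enf_a.

    A positive lag means enf_b[k] lines up with enf_a[k + lag].
    Assumption: both sequences are sampled at the same frame rate.
    """
    enf_a = np.asarray(enf_a, dtype=float)
    enf_b = np.asarray(enf_b, dtype=float)
    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Shift one sequence so that the candidate alignment starts at index 0.
        if lag >= 0:
            a, b = enf_a[lag:], enf_b
        else:
            a, b = enf_a, enf_b[-lag:]
        n = min(len(a), len(b))
        if n < 2:
            continue  # not enough overlap to compute a correlation
        a, b = a[:n], b[:n]
        if a.std() == 0 or b.std() == 0:
            continue  # correlation is undefined for a constant segment
        corr = np.corrcoef(a, b)[0, 1]
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr


if __name__ == "__main__":
    # Synthetic stand-in for grid frequency: a slow random walk around 60 Hz.
    rng = np.random.default_rng(0)
    grid = 60.0 + np.cumsum(rng.normal(0.0, 0.002, 600))
    rec_a = grid[50:450]                                   # first recording
    rec_b = grid[80:480] + rng.normal(0.0, 0.0005, 400)    # starts 30 frames later
    lag, corr = estimate_lag(rec_a, rec_b, max_lag=60)
    print(f"estimated lag: {lag} frames, correlation: {corr:.4f}")
```

In the synthetic example the second recording is constructed to start 30 frames after the first, so the estimated lag can be compared against that known offset. Multiplying the lag by the ENF frame period converts it to a time shift between the two recordings.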
In this paper, an edit detection method for forensic audio analysis is proposed. It develops and improves a previous method through changes in the signal processing chain and a novel detection criterion. As with the original method, electrical network frequency (ENF) analysis is central to the novel edit detector, for it allows monitoring anomalous variations of the ENF related to audio edit events. Working in an unsupervised manner, the edit detector compares the extent of ENF variations around the nominal frequency with a variable threshold that defines the upper limit for normal variations observed in unedited signals. The ENF variations caused by edits in the signal are likely to exceed the threshold, which provides a mechanism for their detection. The proposed method is evaluated in both qualitative and quantitative terms on two distinct annotated databases. Results are reported for originally noisy database signals as well as for versions of them further degraded under controlled conditions. A comparative performance evaluation, in terms of equal error rate (EER), reveals that, for one of the tested databases, the new edit detection method reduces the EER from 7% to 4% relative to the original method. When the signals are amplitude clipped or corrupted by broadband background noise, the performance figures of the novel method follow the same profile as those of the original method.
The electric network frequency (ENF) criterion is a recently developed technique for audio timestamp identification, which involves matching an extracted ENF signal against reference data. For nearly a decade, the conventional matching criterion has been based on the minimum mean squared error (MMSE) or the maximum correlation coefficient. However, the corresponding performance is severely limited by low signal-to-noise ratio, short recording durations, frequency resolution problems, and so on. This paper presents a threshold-based dynamic matching algorithm (DMA), which is capable of autocorrecting the noise-affected frequency estimates. The threshold is chosen according to the frequency resolution determined by the short-time Fourier transform (STFT) window size. A penalty coefficient is introduced to monitor the autocorrection process and finally determine the estimated timestamp. It is then shown that the DMA generalizes the conventional MMSE method. By considering the mainlobe width in the STFT caused by limited frequency resolution, the DMA achieves improved identification accuracy and robustness against higher levels of noise and the offset problem. Synthetic performance analysis and practical experimental results are provided to illustrate the advantages of the DMA.
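For orientation, the conventional MMSE criterion mentioned in the abstract above can be sketched as a sliding comparison of the extracted ENF sequence against a longer reference sequence. This is only the baseline, not the DMA, whose threshold and penalty details are not given here. The function name `mmse_match`, the 50 Hz nominal value, and the synthetic reference are assumptions made for illustration.

```python
import numpy as np


def mmse_match(query, reference):
    """Return (index, mse) of the reference window closest to the query.

    Baseline MMSE matching only; assumes query and reference share the same
    frame rate and that the query is no longer than the reference.
    """
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    m, n = len(query), len(reference)
    if m == 0 or m > n:
        raise ValueError("query must be non-empty and no longer than reference")
    # Every length-m window of the reference, shape (n - m + 1, m).
    windows = np.lib.stride_tricks.sliding_window_view(reference, m)
    # Mean squared error between the query and each candidate window.
    mse = np.mean((windows - query) ** 2, axis=1)
    idx = int(np.argmin(mse))
    return idx, float(mse[idx])


if __name__ == "__main__":
    # Synthetic reference: a slow random walk around an assumed 50 Hz nominal.
    rng = np.random.default_rng(0)
    reference = 50.0 + np.cumsum(rng.normal(0.0, 0.002, 1000))
    # Query: a noisy copy of the reference starting at frame 120.
    query = reference[120:180] + rng.normal(0.0, 0.0005, 60)
    idx, err = mmse_match(query, reference)
    print(f"best matching start frame: {idx}, MSE: {err:.3e}")
```

The returned index is a frame offset into the reference; multiplying it by the frame period and adding the reference start time would give the estimated timestamp. Short or noisy queries make the minimum of the error curve less distinct, which is the limitation the abstract attributes to this criterion.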