LG · Jun 25, 2024

Camera Model Identification Using Audio and Visual Content from Videos

arXiv:2406.17916v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This paper addresses camera model identification, a multimedia forensics problem, but it is incremental: it applies existing CNN methods to a fusion task without major breakthroughs.

The paper tackles device identification from videos using audio content, visual content, or their fusion under the product and sum rules. Each modality on its own achieves promising classification performance, though the fused result does not consistently outperform both individual modalities.

The identification of device brands and models plays a pivotal role in multimedia forensic applications. This paper presents a framework capable of identifying devices using audio content, visual content, or a fusion of the two. Visual and audio content are combined by late fusion, applying two fundamental fusion rules: the product and the sum. Device identification is treated as a classification problem and solved with Convolutional Neural Networks. Experimental evaluation shows that the proposed framework achieves promising classification performance when audio or visual content is used independently. Although the fusion results do not consistently surpass both individual modalities, they show potential for enhancing classification performance. Future research could refine the fusion process so that it consistently improves on both modalities. Finally, a statistical significance test is performed for a more in-depth study of the classification results.
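The product and sum rules mentioned in the abstract can be illustrated with a minimal sketch. It assumes each modality's CNN outputs a per-class probability vector (for example, softmax scores) and that fusion operates on those vectors. The function name `late_fusion`, the renormalization step, and the example scores are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def late_fusion(p_audio, p_visual, rule="product"):
    """Fuse per-class probabilities from two classifiers (illustrative only).

    p_audio, p_visual: arrays of shape (n_classes,) or (n_samples, n_classes).
    rule: "product" multiplies the scores, "sum" adds them.
    """
    p_audio = np.asarray(p_audio, dtype=float)
    p_visual = np.asarray(p_visual, dtype=float)
    if p_audio.shape != p_visual.shape:
        raise ValueError("both modalities must score the same set of classes")

    if rule == "product":
        fused = p_audio * p_visual
    elif rule == "sum":
        fused = p_audio + p_visual
    else:
        raise ValueError(f"unknown fusion rule: {rule!r}")

    # Renormalize so the fused scores sum to 1 per sample (an assumption made
    # here for readability; it does not change the argmax).
    return fused / fused.sum(axis=-1, keepdims=True)


# Hypothetical scores for three camera models from each modality.
p_audio = [0.90, 0.08, 0.02]
p_visual = [0.05, 0.60, 0.35]

pred_product = int(np.argmax(late_fusion(p_audio, p_visual, "product")))
pred_sum = int(np.argmax(late_fusion(p_audio, p_visual, "sum")))
print("product rule predicts class", pred_product)
print("sum rule predicts class", pred_sum)
```

With these made-up scores the two rules disagree: the product rule penalizes the class that one modality scores very low, while the sum rule is dominated by the single most confident modality. This is one reason fused predictions need not beat both individual modalities.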

Code Implementations: 1 repo