CV | Oct 16, 2025

PIA: Deepfake Detection Using Phoneme-Temporal and Identity-Dynamic Analysis

arXiv:2510.14241v1 | 5 citations | h-index: 4 | Has Code | 2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Originality: Incremental advance
AI Analysis

The paper addresses the threat that manipulated media poses to security and verification systems, but the advance is incremental because it builds on existing multimodal detection approaches.

The paper tackles the detection of modern deepfakes generated by advanced models such as GANs and diffusion models, which introduce subtle temporal discrepancies that traditional methods miss. It presents a multimodal audio-visual framework, PIA, that significantly improves detection by analyzing phoneme sequences, lip geometry, and facial identity embeddings.

The rise of manipulated media has made deepfakes a particularly insidious threat, involving various generative manipulations such as lip-sync modifications, face-swaps, and avatar-driven facial synthesis. Conventional detection methods, which predominantly depend on manually designed phoneme-viseme alignment thresholds, fundamental frame-level consistency checks, or a unimodal detection strategy, inadequately identify modern-day deepfakes generated by advanced generative models such as GANs, diffusion models, and neural rendering techniques. These advanced techniques generate nearly perfect individual frames yet inadvertently create minor temporal discrepancies that are frequently overlooked by traditional detectors. We present a novel multimodal audio-visual framework, Phoneme-Temporal and Identity-Dynamic Analysis (PIA), incorporating language, dynamic face motion, and facial identification cues to address these limitations. We utilize phoneme sequences, lip geometry data, and advanced facial identity embeddings. This integrated method significantly improves the detection of subtle deepfake alterations by identifying inconsistencies across multiple complementary modalities. Code is available at https://github.com/skrantidatta/PIA
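
To make the kinds of cues named in the abstract concrete, the following is a minimal sketch of two hand-crafted signals: frame-to-frame drift in identity embeddings, and a threshold-based phoneme-viseme check of the sort the abstract attributes to conventional detectors. It is not the PIA model and is not taken from the authors' repository. The function names, the ARPAbet-style phoneme labels, the normalized lip-aperture values, and the 0.1 threshold are all assumptions chosen for illustration.

```python
import math

# Assumption: ARPAbet-style labels; bilabial sounds require closed lips.
BILABIALS = {"P", "B", "M"}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identity_drift(embeddings):
    """Return 1 - cosine similarity for each pair of consecutive frames.

    Large values suggest the face identity embedding changed abruptly
    between neighbouring frames (a temporal inconsistency).
    """
    return [
        1.0 - cosine_similarity(a, b)
        for a, b in zip(embeddings, embeddings[1:])
    ]


def lip_phoneme_mismatch_rate(phonemes, lip_apertures, closed_threshold=0.1):
    """Fraction of bilabial frames whose lips are open wider than the threshold.

    `lip_apertures` is a hypothetical per-frame mouth opening, normalized
    so that 0 means fully closed.
    """
    bilabial = [ap for ph, ap in zip(phonemes, lip_apertures) if ph in BILABIALS]
    if not bilabial:
        return 0.0
    return sum(ap > closed_threshold for ap in bilabial) / len(bilabial)


# Toy data (made up for illustration, not from any dataset).
frame_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
frame_phonemes = ["M", "AA", "B", "P"]
frame_apertures = [0.05, 0.60, 0.40, 0.02]

drift = identity_drift(frame_embeddings)
mismatch = lip_phoneme_mismatch_rate(frame_phonemes, frame_apertures)
```

In this toy input the third frame's embedding is orthogonal to the second, so the second drift value is maximal, and one of the three bilabial frames has open lips. Fixed thresholds like these are the kind of manually designed rule the abstract describes as inadequate for modern deepfakes; the paper instead combines phoneme, lip geometry, and identity cues in a single framework.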

