CV | AI | Jul 3, 2025

LATTE: Latent Trajectory Embedding for Diffusion-Generated Image Detection

arXiv:2507.03054v2 | 3 citations | h-index: 22 | Has Code
Originality: Highly original
AI Analysis

LATTE addresses the need for generated-image detectors that stay reliable enough to maintain trust in digital media. It is a novel method for a known bottleneck in detection.

The paper tackles the detection of diffusion-generated images, whose spread erodes trust in digital media. It proposes LATTE, which models latent trajectory embeddings across denoising steps and reports superior performance on benchmarks such as GenImage, Chameleon, and Diffusion Forensics, especially in cross-generator and cross-dataset scenarios.

The rapid advancement of diffusion-based image generators has made it increasingly difficult to distinguish generated from real images. This erodes trust in digital media, making it critical to develop generated image detectors that remain reliable across different generators. While recent approaches leverage diffusion denoising cues, they typically rely on single-step reconstruction errors and overlook the sequential nature of the denoising process. In this work, we propose LATTE (LATent Trajectory Embedding), a novel approach that models the evolution of latent embeddings across multiple denoising steps. Instead of treating each denoising step in isolation, LATTE captures the trajectory of these representations, revealing subtle and discriminative patterns that distinguish real from generated images. Experiments on several benchmarks, such as GenImage, Chameleon, and Diffusion Forensics, show that LATTE achieves superior performance, especially in challenging cross-generator and cross-dataset scenarios, highlighting the potential of latent trajectory modeling. The code is available at https://github.com/AnaMVasilcoiu/LATTE-Diffusion-Detector.
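The contrast the abstract draws between a single-step reconstruction error and a multi-step trajectory can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). The shrinkage "denoiser", the noise levels, and the choice of features (per-step errors plus step-to-step changes) are all assumptions made for illustration, and every function name below is hypothetical.

```python
import math
import random


def add_noise(latent, alpha_bar, rng):
    """Forward-diffuse a latent vector to noise level alpha_bar (DDPM-style)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * z + b * rng.gauss(0.0, 1.0) for z in latent]


def toy_denoiser(noisy, alpha_bar):
    """Hypothetical stand-in for a pretrained denoiser.

    Assumption: a simple shrinkage estimate of the clean latent, not a
    learned network.
    """
    return [math.sqrt(alpha_bar) * z for z in noisy]


def mse(u, v):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def latent_trajectory(latent, alpha_bars, seed=0):
    """Reconstruct the latent at several noise levels, one entry per step."""
    rng = random.Random(seed)
    return [toy_denoiser(add_noise(latent, ab, rng), ab) for ab in alpha_bars]


def single_step_error(latent, trajectory, step):
    """Single-step cue: the reconstruction error at one chosen step."""
    return mse(latent, trajectory[step])


def trajectory_features(latent, trajectory):
    """Trajectory cue (assumed form): per-step errors plus the change
    between consecutive steps, so the ordering of steps is represented."""
    errors = [mse(latent, rec) for rec in trajectory]
    deltas = [mse(trajectory[i], trajectory[i + 1])
              for i in range(len(trajectory) - 1)]
    return errors + deltas


if __name__ == "__main__":
    latent = [0.5, -1.0, 0.25, 2.0]    # made-up latent vector
    alpha_bars = [0.9, 0.7, 0.5, 0.3]  # made-up noise levels
    traj = latent_trajectory(latent, alpha_bars)
    print(single_step_error(latent, traj, step=1))
    print(trajectory_features(latent, traj))
```

A single-step detector would see only one number per image, while the trajectory features form a vector over all steps that a downstream classifier could consume. How LATTE actually embeds and classifies the trajectory is not specified in the abstract.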
