CL CVJul 21, 2025

Smart Eyes for Silent Threats: VLMs and In-Context Learning for THz Imaging

Nicolas Poggi, Shashank Agnihotri, Margret Keuper

arXiv:2507.15576v14.91 citationsh-index: 17Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses classification problems in resource-constrained scientific domains like security screening and material classification using THz imaging, but it is incremental as it applies existing ICL and VLM methods to a new domain.

The paper tackles the challenge of effective image classification in terahertz (THz) imaging, which suffers from limited annotations and visual ambiguity, by introducing In-Context Learning (ICL) with Vision-Language Models (VLMs) as a flexible, interpretable alternative that requires no fine-tuning, showing improved classification and interpretability in low-data regimes.

Terahertz (THz) imaging enables non-invasive analysis for applications such as security screening and material classification, but effective image classification remains challenging due to limited annotations, low resolution, and visual ambiguity. We introduce In-Context Learning (ICL) with Vision-Language Models (VLMs) as a flexible, interpretable alternative that requires no fine-tuning. Using a modality-aligned prompting framework, we adapt two open-weight VLMs to the THz domain and evaluate them under zero-shot and one-shot settings. Our results show that ICL improves classification and interpretability in low-data regimes. This is the first application of ICL-enhanced VLMs to THz imaging, offering a promising direction for resource-constrained scientific domains. Code: \href{https://github.com/Nicolas-Poggi/Project_THz_Classification/tree/main}{GitHub repository}.

View on arXiv PDF Code

Similar