Online Feedback Efficient Active Target Discovery in Partially Observable Environments
This work addresses the challenge of costly data acquisition in scientific and engineering fields by providing an interpretable method for active target discovery that requires no prior supervised training, though it appears incremental because it builds on existing active learning and diffusion model concepts.
The paper tackles the problem of efficiently discovering targets in partially observable environments under a limited sampling budget, as in medical imaging or remote sensing. It introduces DiffATD, which uses diffusion dynamics to balance exploration and exploitation without prior supervised training and, in experiments across diverse domains, performs competitively with supervised methods.
In various scientific and engineering domains where data acquisition is costly, such as medical imaging, environmental monitoring, and remote sensing, strategic sampling from unobserved regions, guided by prior observations, is essential to maximize target discovery within a limited sampling budget. In this work, we introduce Diffusion-guided Active Target Discovery (DiffATD), a novel method that leverages diffusion dynamics for active target discovery. DiffATD maintains a belief distribution over each unobserved state in the environment and uses this distribution to dynamically balance exploration and exploitation. Exploration reduces uncertainty by sampling regions with the highest expected entropy, while exploitation targets areas with the highest likelihood of containing the target, as indicated by the belief distribution and an incrementally trained reward model designed to learn the characteristics of the target. DiffATD enables efficient target discovery in a partially observable environment within a fixed sampling budget, all without relying on any prior supervised training. Furthermore, DiffATD offers interpretability, unlike existing black-box policies that require extensive supervised training. Through extensive experiments and ablation studies across diverse domains, including medical imaging, species discovery, and remote sensing, we show that DiffATD performs significantly better than baselines and competitively with supervised methods that operate under full environmental observability.
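
The exploration and exploitation trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simplified setting in which the belief over each unobserved cell is a single Bernoulli probability, the reward model is a per-cell score supplied by the caller, and the balance between the two terms follows a linear schedule in the remaining budget. The names `binary_entropy`, `select_next`, and `kappa`, and the schedule itself, are hypothetical choices made for illustration only.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli belief with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_next(belief, reward, observed, budget_left, budget_total):
    """Return the index of the unobserved cell with the highest combined score.

    belief:   per-cell probability that the cell contains the target
    reward:   per-cell score from a reward model (assumed to be given)
    observed: set of cell indices that were already sampled
    Returns None when every cell has been observed.
    """
    # Assumed schedule: weight exploration early, exploitation late.
    kappa = budget_left / budget_total
    best_idx, best_score = None, float("-inf")
    for i, p in enumerate(belief):
        if i in observed:
            continue
        explore = binary_entropy(p)   # uncertainty of the belief
        exploit = p * reward[i]       # belief-weighted reward estimate
        score = kappa * explore + (1.0 - kappa) * exploit
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx
```

With the full budget remaining, this sketch picks the most uncertain cell; with the budget exhausted, it picks the cell with the highest belief-weighted reward. In the method described above, the belief would come from the diffusion model and the reward model would be updated after each observation, neither of which is modeled here.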