CVJun 24, 2025

Automated Image Recognition Framework

Quang-Binh Nguyen, Trong-Vu Hoang, Ngoc-Do Tran, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

arXiv:2506.19261v13.6h-index: 29ICCCI

Originality Incremental advance

AI Analysis

This addresses data collection challenges for end-users in AI, particularly for novel or sensitive tasks, though it appears incremental as it builds on existing generative AI methods.

The paper tackles the problem of data scarcity and annotation costs in deep learning by proposing an Automated Image Recognition (AIR) framework that uses generative AI to synthesize high-quality, pre-annotated datasets and automatically train models, achieving a user study score of 4.4 out of 5.0.

While the efficacy of deep learning models heavily relies on data, gathering and annotating data for specific tasks, particularly when addressing novel or sensitive subjects lacking relevant datasets, poses significant time and resource challenges. In response to this, we propose a novel Automated Image Recognition (AIR) framework that harnesses the power of generative AI. AIR empowers end-users to synthesize high-quality, pre-annotated datasets, eliminating the necessity for manual labeling. It also automatically trains deep learning models on the generated datasets with robust image recognition performance. Our framework includes two main data synthesis processes, AIR-Gen and AIR-Aug. The AIR-Gen enables end-users to seamlessly generate datasets tailored to their specifications. To improve image quality, we introduce a novel automated prompt engineering module that leverages the capabilities of large language models. We also introduce a distribution adjustment algorithm to eliminate duplicates and outliers, enhancing the robustness and reliability of generated datasets. On the other hand, the AIR-Aug enhances a given dataset, thereby improving the performance of deep classifier models. AIR-Aug is particularly beneficial when users have limited data for specific tasks. Through comprehensive experiments, we demonstrated the efficacy of our generated data in training deep learning models and showcased the system's potential to provide image recognition models for a wide range of objects. We also conducted a user study that achieved an impressive score of 4.4 out of 5.0, underscoring the AI community's positive perception of AIR.

View on arXiv PDF

Similar