CV · Nov 6, 2025

DORAEMON: A Unified Library for Visual Object Modeling and Representation Learning at Scale

Ke Du, Yimin Peng, Chao Gao, Fan Zhou, Siqiao Xue

arXiv:2511.04394v1 · 8.42 citations · h-index: 14 · Has Code

Originality: Synthesis-oriented

AI Analysis

DORAEMON provides a scalable foundation for rapid experimentation in visual recognition and representation learning, and it eases the transfer of research advances to real-world applications. The contribution is incremental, since it consolidates existing methods into a unified platform.

The authors introduce DORAEMON, an open-source PyTorch library that unifies visual object modeling and representation learning across scales. Its reproducible recipes match or exceed reference results on datasets such as ImageNet-1K, MS-Celeb-1M, and Stanford Online Products.

DORAEMON is an open-source PyTorch library that unifies visual object modeling and representation learning across diverse scales. A single YAML-driven workflow covers classification, retrieval and metric learning; more than 1000 pretrained backbones are exposed through a timm-compatible interface, together with modular losses, augmentations and distributed-training utilities. Reproducible recipes match or exceed reference results on ImageNet-1K, MS-Celeb-1M and Stanford Online Products, while one-command export to ONNX or HuggingFace bridges research and deployment. By consolidating datasets, models, and training techniques into one platform, DORAEMON offers a scalable foundation for rapid experimentation in visual recognition and representation learning, enabling efficient transfer of research advances to real-world applications. The repository is available at https://github.com/wuji3/DORAEMON.
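To make the idea of a single config-driven workflow concrete, here is a minimal sketch of how one configuration could select among several tasks through a registry. It is not DORAEMON's actual schema or API: the keys `task`, `backbone` and `loss`, the `TASKS` registry, and the `run` function are all hypothetical names chosen for illustration, and the config is given as an already-parsed dictionary rather than a YAML file.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Config:
    # Hypothetical fields; a real library's config schema may differ.
    task: str
    backbone: str
    loss: str


# Hypothetical registry mapping a task name to the function that handles it.
TASKS: Dict[str, Callable[[Config], str]] = {}


def register(name: str):
    """Decorator that records a task builder under the given name."""
    def wrap(fn: Callable[[Config], str]) -> Callable[[Config], str]:
        TASKS[name] = fn
        return fn
    return wrap


@register("classification")
def build_classification(cfg: Config) -> str:
    return f"classification: {cfg.backbone} + {cfg.loss}"


@register("retrieval")
def build_retrieval(cfg: Config) -> str:
    return f"retrieval: {cfg.backbone} + {cfg.loss}"


def run(raw: dict) -> str:
    """Validate a parsed config and dispatch to the matching task builder."""
    cfg = Config(**raw)
    if cfg.task not in TASKS:
        raise ValueError(f"unknown task: {cfg.task!r}")
    return TASKS[cfg.task](cfg)


if __name__ == "__main__":
    # The dict stands in for the result of parsing a YAML file.
    print(run({"task": "retrieval", "backbone": "resnet50", "loss": "triplet"}))
```

The registry keeps the entry point unchanged as tasks are added, which is one common way to let a single configuration format cover several workflows.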
