Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection
This work is significant for researchers and practitioners working on 3D object detection, as it provides a method to train a single model robustly across diverse datasets and generalize to unseen domains, overcoming the limitations of simply merging datasets.
This paper introduces Uni$^2$Det, a framework for unified and universal multi-dataset training in 3D object detection. It addresses challenges from data distribution and taxonomy disparities by using multi-stage prompting modules, achieving superior performance in multi-dataset training across KITTI, Waymo, and nuScenes, and demonstrating generalization in zero-shot cross-dataset transfer.
We present Uni$^2$Det, a novel framework for unified and universal multi-dataset training in 3D detection, enabling robust performance across diverse domains and generalization to unseen domains. Because data distributions and taxonomies differ substantially across domains, training such a detector by simply merging datasets poses a significant challenge. Motivated by this observation, we introduce multi-stage prompting modules for multi-dataset 3D detection, which leverage prompts based on the characteristics of the corresponding datasets to mitigate these differences. This design allows plug-and-play integration into various advanced 3D detection frameworks in a unified manner, and it adapts straightforwardly for universal applicability across datasets. Experiments are conducted across multiple dataset consolidation scenarios involving KITTI, Waymo, and nuScenes, demonstrating that Uni$^2$Det outperforms existing methods by a large margin in multi-dataset training. Notably, results on zero-shot cross-dataset transfer validate the generalization capability of the proposed method.