CVJul 22, 2024

DriveDiTFit: Fine-tuning Diffusion Transformers for Autonomous Driving

arXiv:2407.15661v113 citationsh-index: 9Has Code
Originality Incremental advance
AI Analysis

This work addresses the expensive and challenging data collection problem for autonomous driving systems, offering an incremental improvement in data generation methods.

The paper tackles the challenge of generating high-quality, diverse training data for autonomous driving by fine-tuning pre-trained Diffusion Transformers, resulting in efficient production of realistic driving data with improved small object generation.

In autonomous driving, deep models have shown remarkable performance across various visual perception tasks with the demand of high-quality and huge-diversity training datasets. Such datasets are expected to cover various driving scenarios with adverse weather, lighting conditions and diverse moving objects. However, manually collecting these data presents huge challenges and expensive cost. With the rapid development of large generative models, we propose DriveDiTFit, a novel method for efficiently generating autonomous Driving data by Fine-tuning pre-trained Diffusion Transformers (DiTs). Specifically, DriveDiTFit utilizes a gap-driven modulation technique to carefully select and efficiently fine-tune a few parameters in DiTs according to the discrepancy between the pre-trained source data and the target driving data. Additionally, DriveDiTFit develops an effective weather and lighting condition embedding module to ensure diversity in the generated data, which is initialized by a nearest-semantic-similarity initialization approach. Through progressive tuning scheme to refined the process of detail generation in early diffusion process and enlarging the weights corresponding to small objects in training loss, DriveDiTFit ensures high-quality generation of small moving objects in the generated data. Extensive experiments conducted on driving datasets confirm that our method could efficiently produce diverse real driving data. The source codes will be available at https://github.com/TtuHamg/DriveDiTFit.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes