AerialGo: Walking-through City View Generation from Aerial Perspectives
AerialGo provides a privacy-conscious, scalable solution for urban planning, navigation, and AR/VR applications, though the contribution is incremental because it builds on existing multi-view diffusion models.
The paper tackles the problem of generating realistic walking-through city views from aerial images, which avoids the privacy concerns and labor-intensive collection associated with ground-level data. It reports significant improvements in ground-level realism and structural coherence.
High-quality 3D urban reconstruction is essential for applications in urban planning, navigation, and AR/VR. However, capturing detailed ground-level data across cities is labor-intensive and raises significant privacy concerns related to sensitive information, such as vehicle plates, faces, and other personal identifiers. To address these challenges, we propose AerialGo, a novel framework that generates realistic walking-through city views from aerial images, leveraging multi-view diffusion models to achieve scalable, photorealistic urban reconstructions without direct ground-level data collection. By conditioning ground-view synthesis on accessible aerial data, AerialGo bypasses the privacy risks inherent in ground-level imagery. To support model training, we introduce the AerialGo dataset, a large-scale dataset containing diverse aerial and ground-view images, paired with camera and depth information, designed for generative urban reconstruction. Experiments show that AerialGo significantly enhances ground-level realism and structural coherence, providing a privacy-conscious, scalable solution for city-scale 3D modeling.