LG Apr 18, 2022
DeepCore: A Comprehensive Library for Coreset Selection in Deep Learning
Chengcheng Guo, Bo Zhao, Yanbing Bai
Coreset selection, which aims to select a subset of the most informative training samples, is a long-standing learning problem that can benefit many downstream tasks, such as data-efficient learning, continual learning, neural architecture search, and active learning. However, many existing coreset selection methods were not designed for deep learning and may therefore suffer from high complexity and poor generalization performance. In addition, the recently proposed methods are evaluated on models, datasets, and settings of different complexities, which makes them hard to compare directly. To advance research on coreset selection in deep learning, we contribute a comprehensive code library, DeepCore, and provide an empirical study of popular coreset selection methods on the CIFAR10 and ImageNet datasets. Extensive experiments verify that, although various methods have advantages in certain experimental settings, random selection is still a strong baseline.
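To make the random-selection baseline mentioned in the abstract concrete, the following is a minimal sketch of class-balanced random subset selection. It is not DeepCore's API; the function name `random_coreset`, its parameters, and the class-balancing choice are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def random_coreset(labels, fraction, seed=0):
    """Return sorted indices of a class-balanced random subset.

    Hypothetical helper for illustration; not part of DeepCore.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Draw the same fraction from every class (at least one sample each).
    selected = []
    for label in sorted(by_class):
        indices = by_class[label]
        k = max(1, round(fraction * len(indices)))
        selected.extend(rng.sample(indices, k))
    return sorted(selected)


# Toy dataset: 50 samples of class 0 followed by 50 samples of class 1.
labels = [0] * 50 + [1] * 50
subset = random_coreset(labels, fraction=0.1)
print(len(subset))  # 10 indices, 5 per class
```

Any informed selection method would replace the `rng.sample` call with a per-sample score; the random variant is the reference point against which such scores are compared.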
CV Aug 31, 2024
Streamlining Forest Wildfire Surveillance: AI-Enhanced UAVs Utilizing the FLAME Aerial Video Dataset for Lightweight and Efficient Monitoring
Lemeng Zhao, Junjie Hu, Jianchao Bi et al.
In recent years, unmanned aerial vehicles (UAVs) have played an increasingly crucial role in supporting disaster emergency response efforts by analyzing aerial images. While current deep-learning models focus on improving accuracy, they often overlook the limited computing resources of UAVs. This study recognizes the imperative for real-time data processing in disaster response scenarios and introduces a lightweight and efficient approach for aerial video understanding. Our methodology identifies redundant portions within the video through policy networks and eliminates this excess information using frame compression techniques. Additionally, we introduce the concept of a "station point," which leverages future information in the sequential policy network, thereby enhancing accuracy. To validate our method, we employ the wildfire FLAME dataset. Compared to the baseline, our approach reduces computation costs by more than 13 times while boosting accuracy by 3%. Moreover, our method can intelligently select salient frames from the video, refining the dataset. This feature enables sophisticated models to be trained effectively on a smaller dataset, significantly reducing training time.
CV May 31, 2022
Self-Supervised Learning for Building Damage Assessment from Large-scale xBD Satellite Imagery Benchmark Datasets
Zaishuo Xia, Zelin Li, Yanbing Bai et al.
In post-disaster assessment, timely and accurate rescue and localization require knowing where the damaged buildings are. Deep learning methods have been proposed that assess building damage automatically and with high accuracy from remote sensing images, and they have proved more efficient than assessment by domain experts. However, because the performance of deep learning models relies heavily on labeled data, the lack of large labeled datasets makes accurate assessment difficult. Although existing semi-supervised and unsupervised studies have made breakthroughs in this area, none of them has completely solved this problem. Therefore, we propose a self-supervised comparative learning approach that addresses the task without requiring labeled data. We construct a novel asymmetric twin network architecture and test its performance on the xBD dataset. Experimental results show that our model improves on the baseline and on commonly used methods. We also demonstrate the potential of self-supervised methods for building damage recognition.
CV Apr 28, 2024
Flood Data Analysis on SpaceNet 8 Using Apache Sedona
Yanbing Bai, Zihao Yang, Jinze Yu et al.
With the escalating frequency of floods posing persistent threats to human life and property, satellite remote sensing has emerged as an indispensable tool for monitoring flood hazards. SpaceNet8 offers a unique opportunity to leverage cutting-edge artificial intelligence technologies to assess these hazards. A significant contribution of this research is its application of Apache Sedona, an advanced platform specifically designed for the efficient and distributed processing of large-scale geospatial data. This platform aims to enhance the efficiency of error analysis, a critical aspect of improving flood damage detection accuracy. Based on Apache Sedona, we introduce a novel approach that addresses the challenges associated with inaccuracies in flood damage detection. This approach involves the retrieval of cases from historical flood events, the adaptation of these cases to current scenarios, and the revision of the model based on clustering algorithms to refine its performance. Through the replication of both the SpaceNet8 baseline and its top-performing models, we embark on a comprehensive error analysis. This analysis reveals several main sources of inaccuracies. To address these issues, we employ data visual interpretation and histogram equalization techniques, resulting in significant improvements in model metrics. After these enhancements, our indicators show a notable improvement, with precision up by 5%, F1 score by 2.6%, and IoU by 4.5%. This work highlights the importance of advanced geospatial data processing tools, such as Apache Sedona. By improving the accuracy and efficiency of flood detection, this research contributes to safeguarding public safety and strengthening infrastructure resilience in flood-prone areas, making it a valuable addition to the field of remote sensing and disaster management.
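The abstract names histogram equalization as one of the techniques applied, without giving details. As a rough illustration only, here is a minimal sketch of standard global histogram equalization on a single-channel image; the function name `equalize_histogram` and the nested-list image representation are assumptions, and the paper may use a different variant (for example, per-band or adaptive equalization).

```python
def equalize_histogram(image, levels=256):
    """Globally equalize a grayscale image given as rows of ints in [0, levels).

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        return []

    # Intensity histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Lookup table mapping the occupied intensity range onto [0, levels - 1].
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A dark, low-contrast 2x2 patch is stretched over the full 8-bit range.
dark = [[50, 51], [52, 53]]
equalized = equalize_histogram(dark)
print(equalized)  # [[0, 85], [170, 255]]
```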
CV Aug 22, 2025
Two-Stage Framework for Efficient UAV-Based Wildfire Video Analysis with Adaptive Compression and Fire Source Detection
Yanbing Bai, Rui-Yang Ju, Lemeng Zhao et al.
Unmanned Aerial Vehicles (UAVs) have become increasingly important in disaster emergency response by enabling real-time aerial video analysis. Due to the limited computational resources available on UAVs, large models cannot be run independently for real-time analysis. To overcome this challenge, we propose a lightweight and efficient two-stage framework for real-time wildfire monitoring and fire source detection on UAV platforms. Specifically, in Stage 1, we utilize a policy network to identify and discard redundant video clips using frame compression techniques, thereby reducing computational costs. In addition, we introduce a station point mechanism that leverages future frame information within the sequential policy network to improve prediction accuracy. In Stage 2, once the frame is classified as "fire", we employ the improved YOLOv8 model to localize the fire source. We evaluate the Stage 1 method using the FLAME and HMDB51 datasets, and the Stage 2 method using the Fire & Smoke dataset. Experimental results show that our method significantly reduces computational costs while maintaining classification accuracy in Stage 1, and achieves higher detection accuracy with similar inference time in Stage 2 compared to baseline methods.
CV Dec 13, 2024
CrossVIT-augmented Geospatial-Intelligence Visualization System for Tracking Economic Development Dynamics
Yanbing Bai, Jinhua Su, Bin Qiao et al.
Timely and accurate economic data is crucial for effective policymaking. Current challenges in data timeliness and spatial resolution can be addressed with advancements in multimodal sensing and distributed computing. We introduce Senseconomic, a scalable system for tracking economic dynamics via multimodal imagery and deep learning. Built on the Transformer framework, it integrates remote sensing and street view images using cross-attention, with nighttime light data as weak supervision. The system achieved an R-squared value of 0.8363 in county-level economic predictions and halved processing time to 23 minutes using distributed computing. Its user-friendly design includes a Vue3-based front end with Baidu maps for visualization and a Python-based back end automating tasks like image downloads and preprocessing. Senseconomic empowers policymakers and researchers with efficient tools for resource allocation and economic planning.
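For readers unfamiliar with the R-squared metric reported above, the following is a minimal sketch of the standard coefficient of determination, 1 - SS_res / SS_tot. The function name `r_squared` and the toy values are assumptions for illustration and are unrelated to the Senseconomic data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variance of the targets around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant targets")
    return 1.0 - ss_res / ss_tot


# Toy values only; not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
print(round(score, 2))  # 0.98
```

A value of 1 means the predictions explain all of the variance in the targets, and 0 means they do no better than predicting the mean.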
CV Apr 28, 2024
FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method
Yanbing Bai, Siao Li, Rui-Yang Ju et al.
Illegal, unreported, and unregulated (IUU) fishing activities seriously affect various aspects of human life. However, traditional methods for detecting and monitoring IUU fishing activities at sea have limitations. Although synthetic aperture radar (SAR) can complement existing vessel detection systems, extracting useful information from SAR images using traditional methods remains a challenge, especially in IUU fishing. This paper proposes a deep learning based fishing activity detection system, which is implemented on the xView3 dataset using six classical object detection models: SSD, RetinaNet, FSAF, FCOS, Faster R-CNN, and Cascade R-CNN. In addition, this work employs different enhancement techniques to improve the performance of the Faster R-CNN model. The experimental results demonstrate that training the Faster R-CNN model using the Online Hard Example Mining (OHEM) strategy increases the Avg-F1 value from 0.212 to 0.216.
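The core idea behind Online Hard Example Mining can be sketched briefly: within each batch, only the candidates with the highest current loss contribute to the gradient update. The sketch below shows that selection step alone, under simplifying assumptions; the function name `select_hard_examples` is hypothetical, and detector-specific details such as non-maximum suppression over candidate regions are omitted.

```python
def select_hard_examples(losses, k):
    """Return the sorted indices of the k highest-loss examples.

    Hypothetical helper illustrating the OHEM selection step only.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    k = min(k, len(losses))
    # Rank example indices from highest to lowest loss.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:k])


# Toy per-example losses for one batch of candidate regions.
batch_losses = [0.05, 1.20, 0.30, 2.40, 0.01, 0.90]
hard = select_hard_examples(batch_losses, k=3)
print(hard)  # [1, 3, 5]

# Only the selected examples would contribute to the training loss.
hard_loss = sum(batch_losses[i] for i in hard) / len(hard)
```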