SE | Jun 1
When Large Language Models Meet UAV Projects: An Empirical Study from Developers' Perspective
Yihua Chen, Xingle Que, Jiashuo Zhang et al.
In recent years, unmanned aerial vehicles (UAVs) have become increasingly popular in our daily lives and have attracted significant research interest in software engineering. At the same time, large language models (LLMs) have made notable advancements in language understanding, reasoning, and generation, making LLM applications in UAVs a promising research direction. However, existing studies have largely remained in preliminary exploration with a limited understanding of real-world practice, which causes an academia-industry gap and hinders the application of LLMs in UAVs. To address this, we conducted the first empirical study to investigate how LLMs support UAVs. To characterize common tasks and application scenarios of real-world UAV-LLM practices, we conducted a large-scale empirical study involving 997 research papers and 1,509 GitHub projects. The results classified nine common tasks (e.g., Natural Language Command Parsing) in four UAV workflows (e.g., Information Input) undertaken by LLMs in real-world UAV projects and revealed a large difference in the task distribution of research efforts and industry practices. To gain deeper insight into these differences and understand developers' perspectives on the application of LLMs in UAVs, we conducted a survey of practitioners, receiving 52 valid responses from 15 countries. The results revealed that while 40.4% of developers have attempted to apply LLMs to UAV tasks, 59.6% still face challenges integrating their UAV projects with advanced LLM capabilities. Their feedback attributes these challenges to five factors, including technological maturity, performance, safety, cost, and others, and provides practical implications for researchers and developers in conducting UAV-LLM practices.
CV | Apr 12, 2023
Fast vehicle detection algorithm based on lightweight YOLO7-tiny
Bo Li, YiHua Chen, Hao Xu et al.
The swift and precise detection of vehicles plays a significant role in intelligent transportation systems. Current vehicle detection algorithms suffer from high computational complexity, low detection rates, and limited feasibility on mobile devices. To address these issues, this paper proposes Ghost-YOLOv7, a lightweight vehicle detection algorithm based on YOLOv7-tiny (You Only Look Once version seven). First, the width of the model is scaled to 0.5 and the standard convolution of the backbone network is replaced with Ghost convolution to obtain a lighter network and improve detection speed. Second, a self-designed Ghost bi-directional feature pyramid network (Ghost-BiFPN) is embedded into the neck network to enhance the feature extraction capability of the algorithm and enrich semantic information. Third, a Ghost Decoupled Head (GDH) is employed for accurate prediction of vehicle location and type. Finally, a coordinate attention mechanism is introduced into the output layer to suppress environmental interference. The WIoU loss function is employed to further enhance detection accuracy. Ablation experiment results on the PASCAL VOC dataset demonstrate that Ghost-YOLOv7 outperforms the original YOLOv7-tiny model: it achieves a 29.8% reduction in computation, a 37.3% reduction in the number of parameters, a 35.1% reduction in model weights, a 1.1% higher mean average precision (mAP), and a detection speed 27 FPS higher than the original algorithm. Ghost-YOLOv7 was also evaluated on the KITTI and BIT-vehicle datasets, and the results show that this algorithm has the best overall performance.
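To give a rough sense of why replacing a standard convolution with a Ghost convolution shrinks the network, the sketch below compares parameter counts for a single layer. It assumes the Ghost module formulation from the GhostNet paper (a primary convolution produces `1/ratio` of the output channels, and cheap depthwise operations with `d x d` kernels generate the rest). The abstract does not give the layer shapes or the ratio used in Ghost-YOLOv7, so the channel counts, kernel sizes, and the function names here are illustrative assumptions, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, d=3):
    """Weights in a Ghost module, assuming the GhostNet formulation.

    A primary k x k convolution produces c_out / ratio "intrinsic" maps;
    each intrinsic map then yields (ratio - 1) extra maps through cheap
    depthwise d x d operations.
    """
    assert c_out % ratio == 0, "c_out must be divisible by ratio"
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * d * d
    return primary + cheap


# Hypothetical layer shape, chosen only for illustration.
std = standard_conv_params(c_in=64, c_out=128, k=3)
ghost = ghost_conv_params(c_in=64, c_out=128, k=3, ratio=2, d=3)
print(std, ghost, round(std / ghost, 2))
```

With `ratio=2` the layer needs roughly half the weights of the standard convolution, which is in line with the direction (though not the exact magnitude) of the parameter reduction reported above for the whole model.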
LG | Dec 3, 2024 | Code
Noisy Ostracods: A Fine-Grained, Imbalanced Real-World Dataset for Benchmarking Robust Machine Learning and Label Correction Methods
Jiamian Hu, Yuanyuan Hong, Yihua Chen et al.
We present Noisy Ostracods, a noisy dataset for genus and species classification of crustacean ostracods with specialists' annotations. Of the 71,466 specimens collected, 5.58% are estimated to be noisy (possibly problematic) at the genus level. The dataset was created to address a real-world challenge: creating a clean fine-grained taxonomy dataset. The Noisy Ostracods dataset contains diverse noise from multiple sources. First, the noise is open-set, including new classes discovered during curation that were not part of the original annotation. Second, the dataset has pseudo-classes, where annotators misclassified samples that should belong to an existing class into a new pseudo-class. Third, the dataset is highly imbalanced, with an imbalance factor $\rho = 22429$. This presents a unique challenge for robust machine learning methods, as existing approaches have not been extensively evaluated on fine-grained classification tasks with such diverse real-world noise. Initial experiments using current robust learning techniques have not yielded significant performance improvements on the Noisy Ostracods dataset compared to cross-entropy training on the raw, noisy data. Likewise, noise detection methods have underperformed in error hit rate compared to naive cross-validation ensembling for identifying problematic labels. These findings suggest that the fine-grained, imbalanced nature and complex noise characteristics of the dataset present considerable challenges for existing noise-robust algorithms. By openly releasing the Noisy Ostracods dataset, we aim to encourage further research into noise-resilient machine learning methods capable of effectively handling diverse, real-world noise in fine-grained classification tasks. The dataset, along with its evaluation protocols, can be accessed at https://github.com/H-Jamieu/Noisy_ostracods.
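The abstract does not define the imbalance factor. A common convention in the long-tailed learning literature is the ratio between the sizes of the largest and smallest classes, and the sketch below assumes that definition. The function name and the toy labels are hypothetical and are not drawn from the dataset.

```python
from collections import Counter


def imbalance_factor(labels):
    """Ratio of the largest class size to the smallest class size.

    Assumed definition; the dataset authors may use a different one.
    """
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Toy label list with three classes of very different sizes.
toy_labels = ["A"] * 900 + ["B"] * 90 + ["C"] * 10
rho = imbalance_factor(toy_labels)
print(rho)
```

Under this reading, a factor in the tens of thousands means the most frequent class has tens of thousands of times more specimens than the rarest one, so per-class metrics are likely more informative than overall accuracy.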
LG | Jun 7, 2021 | Code
DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning
Hussein Hazimeh, Zhe Zhao, Aakanksha Chowdhery et al.
The Mixture-of-Experts (MoE) architecture is showing promising results in improving parameter sharing in multi-task learning (MTL) and in scaling high-capacity neural networks. State-of-the-art MoE models use a trainable sparse gate to select a subset of the experts for each input example. While conceptually appealing, existing sparse gates, such as Top-k, are not smooth. The lack of smoothness can lead to convergence and statistical performance issues when training with gradient-based methods. In this paper, we develop DSelect-k: a continuously differentiable and sparse gate for MoE, based on a novel binary encoding formulation. The gate can be trained using first-order methods, such as stochastic gradient descent, and offers explicit control over the number of experts to select. We demonstrate the effectiveness of DSelect-k on both synthetic and real MTL datasets with up to $128$ tasks. Our experiments indicate that DSelect-k can achieve statistically significant improvements in prediction and expert selection over popular MoE gates. Notably, on a real-world, large-scale recommender system, DSelect-k achieves over $22\%$ improvement in predictive performance compared to Top-k. We provide an open-source implementation of DSelect-k.
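The non-smoothness of Top-k gating mentioned above can be seen in a few lines. The sketch below is a minimal, generic Top-k softmax gate written for illustration. It is not the DSelect-k gate and not the authors' implementation; the function name and the example logits are assumptions.

```python
import math


def top_k_gate(logits, k):
    """Softmax over the k largest logits; all other experts get weight 0."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]


# Two logit vectors that differ by only 1e-6 in the second and third entries.
a = top_k_gate([1.0, 0.5, 0.5 - 1e-6], k=2)
b = top_k_gate([1.0, 0.5 - 1e-6, 0.5], k=2)
print(a)
print(b)
```

A perturbation of about one millionth in the logits moves the entire second-place weight from expert 1 to expert 2. The gate output therefore jumps as a function of its inputs, which is the kind of discontinuity that a continuously differentiable gate is designed to avoid.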