68.5 | CV | Mar 13 | Code
Prompt-Driven Lightweight Foundation Model for Instance Segmentation-Based Fault Detection in Freight Trains
Guodong Sun, Qihang Liang, Xingyu Pan et al.
Accurate visual fault detection in freight trains remains a critical challenge for intelligent transportation system maintenance, due to complex operational environments, structurally repetitive components, and frequent occlusions or contaminations in safety-critical regions. Conventional instance segmentation methods based on convolutional neural networks and Transformers often suffer from poor generalization and limited boundary accuracy under such conditions. To address these challenges, we propose a lightweight self-prompted instance segmentation framework tailored for freight train fault detection. Our method leverages the Segment Anything Model by introducing a self-prompt generation module that automatically produces task-specific prompts, enabling effective knowledge transfer from foundation models to domain-specific inspection tasks. In addition, we adopt a Tiny Vision Transformer backbone to reduce computational cost, making the framework suitable for real-time deployment on edge devices in railway monitoring systems. We construct a domain-specific dataset collected from real-world freight inspection stations and conduct extensive evaluations. Experimental results show that our method achieves 74.6 $AP^{\text{box}}$ and 74.2 $AP^{\text{mask}}$ on the dataset, outperforming existing state-of-the-art methods in both accuracy and robustness while maintaining low computational overhead. This work offers a deployable and efficient vision solution for automated freight train inspection, demonstrating the potential of foundation model adaptation in industrial-scale fault diagnosis scenarios. Project page: https://github.com/MVME-HBUT/SAM_FTI-FDet.git
IR | Nov 3, 2020 | Code
RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms
Wayne Xin Zhao, Shanlei Mu, Yupeng Hou et al.
In recent years, a large number of recommendation algorithms have been proposed in the literature, from traditional collaborative filtering to deep learning algorithms. However, concerns about how to standardize open-source implementations of recommendation algorithms have continued to grow in the research community. In light of this challenge, we propose a unified, comprehensive and efficient recommender system library called RecBole, which provides a unified framework to develop and reproduce recommendation algorithms for research purposes. In this library, we implement 73 recommendation models on 28 benchmark datasets, covering the categories of general recommendation, sequential recommendation, context-aware recommendation and knowledge-based recommendation. We implement the RecBole library based on PyTorch, which is one of the most popular deep learning frameworks. Our library has several notable features, including general and extensible data structures, comprehensive benchmark models and datasets, efficient GPU-accelerated execution, and extensive and standard evaluation protocols. We provide a series of auxiliary functions, tools, and scripts to facilitate the use of this library, such as automatic parameter tuning and break-point resume. Such a framework is useful for standardizing the implementation and evaluation of recommender systems. The project and documentation are released at https://recbole.io/.
HC | May 18, 2021
3D Displays: Their Evolution, Inherent Challenges & Future Perspectives
Xingyu Pan, Xuanhui Xu, Soumyabrata Dev et al.
The popularity of 3D displays has risen drastically over the past few decades, but these displays are still merely a novelty compared to their true potential. Development has mostly focused on Head-Mounted Displays (HMDs) for Virtual Reality and has in general ignored non-HMD 3D displays. This is due to the inherent difficulty of creating these displays and their impracticality in general use due to cost, performance, and lack of meaningful use cases. In fairness to the hardware manufacturers who have made striking innovations in this field, there has been a dereliction of duty by software developers and researchers in terms of developing software to best utilize these displays. This paper will seek to identify which areas of future software development could mitigate this dereliction. To achieve this goal, the paper will first examine the current state of the art and perform a comparative analysis of different types of 3D displays. This analysis reveals a clear research gap in software development for light field displays, which are the current state of the art of non-HMD-based 3D displays. The paper will then outline six distinct areas where the context-awareness concept will allow non-HMD-based 3D displays, in particular light field displays, to not only compete with but surpass their HMD-based brethren for many specific use cases.