CV | Oct 12, 2025

Self-Supervised Multi-Scale Transformer with Attention-Guided Fusion for Efficient Crack Detection

Blessing Agyei Kyem, Joshua Kofi Asamoah, Eugene Denteh, Andrews Danyo, Armstrong Aboah

arXiv:2510.10378v1 | 8 citations | h-index: 13 | Autom Constr

Originality: Incremental advance

AI Analysis

Annotation-free crack segmentation enables scalable and cost-effective infrastructure monitoring for transportation agencies, which the authors present as an advance in self-supervised learning for this domain.

The paper tackles pavement crack detection without manual annotations by developing a self-supervised framework, Crack-Segmenter, which the authors report outperforms 13 supervised methods on ten public datasets across metrics such as mIoU and Dice score.

Pavement crack detection has long depended on costly and time-intensive pixel-level annotations, which limit its scalability for large-scale infrastructure monitoring. To overcome this barrier, this paper examines the feasibility of achieving effective pixel-level crack segmentation entirely without manual annotations. Building on this objective, a fully self-supervised framework, Crack-Segmenter, is developed, integrating three complementary modules: the Scale-Adaptive Embedder (SAE) for robust multi-scale feature extraction, the Directional Attention Transformer (DAT) for maintaining linear crack continuity, and the Attention-Guided Fusion (AGF) module for adaptive feature integration. Through evaluations on ten public datasets, Crack-Segmenter consistently outperforms 13 state-of-the-art supervised methods across all major metrics, including mean Intersection over Union (mIoU), Dice score, XOR, and Hausdorff Distance (HD). These findings demonstrate that annotation-free crack detection is not only feasible but also superior, enabling transportation agencies and infrastructure managers to conduct scalable and cost-effective monitoring. This work advances self-supervised learning and motivates further research on pavement crack detection.
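For readers unfamiliar with two of the metrics named above, the following is a minimal sketch of how Dice score and mIoU are commonly computed for a binary crack mask. It is not taken from the paper: the function names, the toy masks, and the choice to average IoU over both the crack and background classes are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice_score(pred, target):
    """Dice = 2*TP / (2*TP + FP + FN) for the crack (positive) class."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    return 2 * tp / denom if denom else 1.0


def _iou(hit, false_alarm, miss):
    denom = hit + false_alarm + miss
    return hit / denom if denom else 1.0


def mean_iou(pred, target):
    """Average of crack IoU and background IoU (assumed two-class mIoU)."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    crack_iou = _iou(tp, fp, fn)
    # For the background class the roles of FP and FN swap.
    background_iou = _iou(tn, fn, fp)
    return (crack_iou + background_iou) / 2


# Toy 3x4 masks: a vertical crack in the target, a partly wrong prediction.
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 0, 0],
          [0, 1, 0, 0],
          [0, 1, 0, 0]]

print(round(dice_score(pred, target), 4))  # 0.6667
print(round(mean_iou(pred, target), 4))    # 0.65
```

In the toy example there are 2 true positives, 1 false positive, 1 false negative, and 8 true negatives, so Dice is 4/6 and mIoU is the mean of 0.5 (crack) and 0.8 (background). XOR and Hausdorff Distance are omitted here because their definitions vary across the crack-detection literature.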
