Line Segment Detection Using Transformers without Edges
This work addresses efficient and accurate line segment detection in images. Its contribution is incremental: it simplifies the detection pipeline by removing the edge, junction, and region detection stages and the post-processing that earlier methods rely on.
This paper introduces LinE segment TRansformers (LETR), an end-to-end line segment detection algorithm that uses Transformers without relying on edge detection or other intermediate processing. LETR achieves state-of-the-art results on the Wireframe and YorkUrban benchmarks by directly predicting line segments using a multi-scale encoder/decoder and an endpoint distance loss.
In this paper, we present a joint end-to-end line segment detection algorithm using Transformers that requires neither post-processing nor heuristics-guided intermediate processing (edge, junction, or region detection). Our method, named LinE segment TRansformers (LETR), takes advantage of the tokenized queries, self-attention mechanism, and encoding-decoding strategy integrated within Transformers, skipping the standard heuristic designs for edge element detection and perceptual grouping. We equip Transformers with a multi-scale encoder/decoder strategy to perform fine-grained line segment detection under a direct endpoint distance loss. This loss term is particularly suitable for geometric structures such as line segments, which are not conveniently represented by standard bounding boxes. The Transformers learn to gradually refine line segments through layers of self-attention. In our experiments, we show state-of-the-art results on the Wireframe and YorkUrban benchmarks.
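To make the endpoint distance loss concrete, the following is a minimal sketch rather than the LETR implementation. It assumes the distance is an L1 norm over the four endpoint coordinates, that endpoint order is fixed, and that predictions have already been paired with ground-truth segments; the abstract states none of these details. The names `endpoint_distance`, `endpoint_loss`, and `bounding_box` are hypothetical helpers introduced only for illustration.

```python
from typing import Sequence, Tuple

# A segment is (x1, y1, x2, y2). Endpoint order is treated as fixed here,
# which is an assumption of this sketch.
Segment = Sequence[float]


def endpoint_distance(pred: Segment, target: Segment) -> float:
    """Sum of absolute coordinate differences between two segments (assumed L1)."""
    if len(pred) != 4 or len(target) != 4:
        raise ValueError("each segment needs four coordinates: x1, y1, x2, y2")
    return sum(abs(p - t) for p, t in zip(pred, target))


def endpoint_loss(preds: Sequence[Segment], targets: Sequence[Segment]) -> float:
    """Mean endpoint distance over pairs that are assumed to be matched already."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    total = sum(endpoint_distance(p, t) for p, t in zip(preds, targets))
    return total / len(preds)


def bounding_box(seg: Segment) -> Tuple[float, float, float, float]:
    """Axis-aligned box (xmin, ymin, xmax, ymax) that encloses a segment."""
    x1, y1, x2, y2 = seg
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


# The two diagonals of the unit square are different segments
# that share one bounding box.
rising = (0.0, 0.0, 1.0, 1.0)
falling = (0.0, 1.0, 1.0, 0.0)

same_box = bounding_box(rising) == bounding_box(falling)
gap = endpoint_distance(rising, falling)
```

In this example `same_box` is true while `gap` is nonzero, which illustrates why a box-based representation cannot separate the two diagonals while a loss on endpoints can.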