CV, AI | Feb 11, 2023

TPE-Net: Track Point Extraction and Association Network for Rail Path Proposal Generation

arXiv:2302.05803v1 | 8 citations | h-index: 27
Originality: Incremental advance
AI Analysis

This paper addresses the need for accurate rail route identification in autonomous train systems to minimize collision risks. It is an incremental advance: it builds on existing image-based methods and does not surpass state-of-the-art performance.

The paper tackles the problem of extracting rail path instances from images for autonomous trains by proposing TPE-Net, a fully convolutional network that regresses center points and left-right rail pixels, achieving true-positive-pixel average precision of 0.9207 and recall of 0.8721 at about 12 frames per second.
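To make the reported figures concrete, here is a minimal sketch of pixel-level precision and recall. It assumes a predicted pixel counts as a true positive only when it coincides exactly with a ground-truth pixel; the paper's actual matching rule and averaging scheme are not given in this summary, and the function name `pixel_precision_recall` is hypothetical.

```python
def pixel_precision_recall(predicted, ground_truth):
    """Return (precision, recall) for two collections of (row, col) pixels.

    Assumption: a predicted pixel is a true positive only if the same
    (row, col) coordinate appears in the ground truth.
    """
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    true_positives = len(predicted & ground_truth)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Precision is the fraction of predicted rail pixels that are correct, and recall is the fraction of ground-truth rail pixels that were found.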

One essential feature of an autonomous train is minimizing collision risks with third-party objects. To estimate the risk, the control system must identify topological information of all the rail routes ahead on which the train can possibly move, especially within merging or diverging rails. This way, the train can determine the status of potential obstacles with respect to its route and make a timely decision. Numerous studies have successfully extracted all rail tracks as a whole within forward-looking images without considering element instances. Some image-based methods have instead employed hard-coded prior knowledge of railway geometry on 3D data to associate left-right rails and generate rail route instances. In contrast, we propose a rail path extraction pipeline in which left-right rail pixels of each rail route instance are extracted and associated through a fully convolutional encoder-decoder architecture called TPE-Net. TPE-Net uses two regression branches to regress the locations of the center points of each rail route, along with their corresponding left-right pixels. Extracted rail pixels are then spatially clustered to generate topological information of all the possible train routes (ego-paths), discarding non-ego-path ones. Experimental results on a challenging, publicly released benchmark show true-positive-pixel-level average precision and recall of 0.9207 and 0.8721, respectively, at about 12 frames per second. Even though our evaluation results are not higher than the state of the art (SOTA), the proposed regression pipeline performs remarkably well in extracting the correspondences by looking once at the image. It generates strong rail route hypotheses without reliance on camera parameters, 3D data, or geometrical constraints.
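The post-processing idea in the abstract (center points with associated left-right pixels, grouped spatially into route candidates) can be illustrated with a small sketch. This is not the authors' clustering procedure: the greedy bottom-up chaining, the `TrackPoint` record, the function `group_track_points`, and the thresholds `max_col_gap` and `max_row_gap` are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class TrackPoint(NamedTuple):
    """Hypothetical record for one regressed track point."""
    row: int       # image row of the center point
    center: float  # column of the center point
    left: float    # column of the associated left-rail pixel
    right: float   # column of the associated right-rail pixel


def group_track_points(points, max_col_gap=10.0, max_row_gap=2):
    """Greedily chain track points into path candidates.

    Points are visited from the image bottom (largest row) upward. Each
    point joins the path whose last point lies in a lower row, at most
    max_row_gap rows away, and is closest in column within max_col_gap.
    Otherwise the point starts a new path.
    """
    paths = []
    for point in sorted(points, key=lambda q: (-q.row, q.center)):
        best_path = None
        best_dist = max_col_gap
        for path in paths:
            tail = path[-1]
            row_gap = tail.row - point.row
            col_gap = abs(tail.center - point.center)
            # row_gap > 0 keeps at most one point per row in each path.
            if 0 < row_gap <= max_row_gap and col_gap <= best_dist:
                best_path, best_dist = path, col_gap
        if best_path is None:
            paths.append([point])
        else:
            best_path.append(point)
    return paths
```

Because each point already carries its own left and right rail columns, grouping the center points also groups the rail pixels, so no separate left-right association step is needed in this sketch. Selecting ego-paths from the resulting candidates is left out, since the summary does not describe how that filtering is done.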

