RailSafeNet: Visual Scene Understanding for Tram Safety
The work addresses safety for pedestrians, drivers, cyclists, pets, and tram passengers in densely populated areas. Its contribution is incremental, however, because it combines existing methods (semantic segmentation and object detection) with a rule-based distance assessment.
The paper tackles tram safety by developing RailSafeNet, a real-time framework that uses monocular video to detect track intrusions and classify object risks, achieving 65% IoU for rail segmentation and 75.6% mAP for object detection on the RailSem19 dataset.
Tram-human interaction safety is an important challenge: trams frequently operate in densely populated areas, where collisions can cause anything from minor injuries to fatalities. This paper addresses the issue by designing a solution that uses digital image processing, deep learning, and artificial intelligence to improve the safety of pedestrians, drivers, cyclists, pets, and tram passengers. We present RailSafeNet, a real-time framework that fuses semantic segmentation, object detection, and a rule-based Distance Assessor to highlight track intrusions. Using only monocular video, the system identifies rails, localises nearby objects, and classifies their risk by comparing projected distances with the standard 1435 mm rail gauge. Experiments on the diverse RailSem19 dataset show that a class-filtered SegFormer B3 model achieves 65% intersection-over-union (IoU), while a fine-tuned YOLOv8 attains 75.6% mean average precision (mAP) at an IoU threshold of 0.50. RailSafeNet therefore delivers accurate, annotation-light scene understanding that can warn drivers before dangerous situations escalate. Code is available at https://github.com/oValach/RailSafeNet.
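The idea of using the 1435 mm rail gauge as a scale reference can be illustrated with a minimal sketch. This is not the authors' Distance Assessor. It assumes that the pixel x-coordinates of both rails are known on the image row where an object touches the ground, and that the scale on that row is roughly uniform. The names RailRow, lateral_distance_mm and classify_risk, and the 500 mm and 1500 mm thresholds, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

RAIL_GAUGE_MM = 1435.0  # standard rail gauge, as stated in the abstract


@dataclass
class RailRow:
    """Pixel x-coordinates of the two rails on one image row (hypothetical type)."""
    left_x: float
    right_x: float


def mm_per_pixel(row: RailRow) -> float:
    """Local scale on this row: the rails are 1435 mm apart in the real world."""
    gauge_px = row.right_x - row.left_x
    if gauge_px <= 0:
        raise ValueError("right rail must lie to the right of the left rail")
    return RAIL_GAUGE_MM / gauge_px


def lateral_distance_mm(object_x: float, row: RailRow) -> float:
    """Approximate distance from an object's ground point to the nearest rail.

    Returns 0.0 when the point lies between the rails.
    """
    if row.left_x <= object_x <= row.right_x:
        return 0.0
    if object_x < row.left_x:
        distance_px = row.left_x - object_x
    else:
        distance_px = object_x - row.right_x
    return distance_px * mm_per_pixel(row)


def classify_risk(distance_mm: float,
                  critical_mm: float = 500.0,   # assumed threshold
                  warning_mm: float = 1500.0    # assumed threshold
                  ) -> str:
    """Map a distance to a coarse risk label using illustrative thresholds."""
    if distance_mm <= critical_mm:
        return "critical"
    if distance_mm <= warning_mm:
        return "warning"
    return "safe"


if __name__ == "__main__":
    row = RailRow(left_x=400.0, right_x=687.0)  # 287 px between the rails
    for x in (500.0, 887.0, 1087.0):
        d = lateral_distance_mm(x, row)
        print(f"x={x:.0f}px  distance={d:.0f}mm  risk={classify_risk(d)}")
```

Because the scale is recomputed per row, objects further from the camera (where the rails appear closer together) get a larger millimetre-per-pixel factor, which is what lets a single monocular frame yield approximate metric distances without camera calibration.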