Samuel Scheidegger

CV
h-index: 5
Papers: 6
Citations: 186
Novelty: 42%
AI Score: 38

6 Papers

CV · Apr 17, 2022
A Pre-study on Data Processing Pipelines for Roadside Object Detection Systems Towards Safer Road Infrastructure

Yinan Yu, Samuel Scheidegger, John-Fredrik Grönvall et al.

Single-vehicle accidents, in which a car drives off the road and runs into hazardous roadside objects, are the most common type of fatal accident in Sweden. Proper installation and maintenance of protective objects, such as crash cushions and guard rails, may reduce the chance and severity of such accidents. Moreover, efficient detection and management of hazardous roadside objects also play an important role in improving road safety. To better understand the state of the art and system requirements, in this pre-study, we investigate the feasibility, implementation, limitations and scaling up of data processing pipelines for roadside object detection. In particular, we divide our investigation into three parts: the target of interest, the sensors of choice and the algorithm design. The data sources we consider in this study cover two common setups: 1) road surveying fleet - annual scans conducted by Trafikverket, the Swedish Transport Administration, and 2) consumer vehicle - data collected using a research vehicle from the laboratory of Resource for vehicle research at Chalmers (REVERE). The goal of this report is to investigate how to implement a scalable roadside object detection system towards safe road infrastructure and Sweden's Vision Zero.

LG · Nov 19, 2025
PCARNN-DCBF: Minimal-Intervention Geofence Enforcement for Ground Vehicles

Yinan Yu, Samuel Scheidegger

Runtime geofencing for ground vehicles is rapidly emerging as a critical technology for enforcing Operational Design Domains (ODDs). However, existing solutions struggle to reconcile high-fidelity learning with the structural requirements of verifiable control. We address this by introducing PCARNN-DCBF, a novel pipeline integrating a Physics-encoded Control-Affine Residual Neural Network with a preview-based Discrete Control Barrier Function. Unlike generic learned models, PCARNN explicitly preserves the control-affine structure of vehicle dynamics, ensuring the linearity required for reliable optimization. This enables the DCBF to enforce polygonal keep-in constraints via a real-time Quadratic Program (QP) that handles high relative degree and mitigates actuator saturation. Experiments in CARLA across electric and combustion platforms demonstrate that this structure-preserving approach significantly outperforms analytical and unstructured neural baselines.
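The minimal-intervention idea behind a discrete control barrier function can be illustrated on a much simpler system than the one in the paper. The sketch below is a toy, assuming a 1D single integrator x_next = x + dt * u and an interval keep-in set [lo, hi] instead of a polygon. The names `dcbf_filter` and `simulate`, the decay rate `gamma`, and all numbers are illustrative assumptions, not part of the PCARNN-DCBF pipeline. With a scalar input, the QP "minimize (u - u_nom)^2 subject to both barrier conditions" reduces to a clamp.

```python
def dcbf_filter(x, u_nom, lo, hi, dt=0.1, gamma=0.5):
    """Return the input closest to u_nom that satisfies both discrete CBF conditions.

    Toy model (assumption): x_next = x + dt * u, keep-in set lo <= x <= hi.
    Barriers: h1(x) = x - lo and h2(x) = hi - x.
    Condition: h(x_next) >= (1 - gamma) * h(x), with 0 < gamma <= 1.
    """
    # h1 condition: x + dt*u - lo >= (1 - gamma) * (x - lo)
    u_min = -gamma * (x - lo) / dt
    # h2 condition: hi - x - dt*u >= (1 - gamma) * (hi - x)
    u_max = gamma * (hi - x) / dt
    # For a scalar input, the QP solution is the projection onto [u_min, u_max].
    return min(max(u_nom, u_min), u_max)


def simulate(x0, u_nom, steps, lo=-1.0, hi=1.0, dt=0.1, gamma=0.5):
    """Roll the toy model forward under a constant nominal command."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        u = dcbf_filter(x, u_nom, lo, hi, dt, gamma)
        x = x + dt * u
        xs.append(x)
    return xs
```

Far from the boundary the filter returns the nominal command unchanged; close to it, the command is scaled back so that the distance to the boundary shrinks by at most a factor (1 - gamma) per step. A polygonal keep-in set with vehicle dynamics of higher relative degree, as described in the abstract, would need a real QP solver rather than a clamp.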

CV · Aug 6, 2025
Deep Learning-based Scalable Image-to-3D Facade Parser for Generating Thermal 3D Building Models

Yinan Yu, Alex Gonzalez-Caceres, Samuel Scheidegger et al.

Renovating existing buildings is essential for reducing climate impact. Early-phase renovation planning requires simulations based on thermal 3D models at Level of Detail (LoD) 3, which include features like windows. However, scalable and accurate identification of such features remains a challenge. This paper presents the Scalable Image-to-3D Facade Parser (SI3FP), a pipeline that generates LoD3 thermal models by extracting geometries from images using both computer vision and deep learning. Unlike existing methods relying on segmentation and projection, SI3FP directly models geometric primitives in the orthographic image plane, providing a unified interface while reducing perspective distortions. SI3FP supports both sparse (e.g., Google Street View) and dense (e.g., hand-held camera) data sources. Tested on typical Swedish residential buildings, SI3FP achieved approximately 5% error in window-to-wall ratio estimates, demonstrating sufficient accuracy for early-stage renovation analysis. The pipeline facilitates large-scale energy renovation planning and has broader applications in urban development and planning.
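The window-to-wall ratio mentioned above can be computed directly once a facade is described by rectangles in an orthographic plane. The sketch below is illustrative only: it assumes axis-aligned, non-overlapping windows lying inside the wall, and it defines the ratio as total window area over gross wall area (definitions vary). The helper names `rect_area` and `window_to_wall_ratio` are hypothetical and not taken from SI3FP.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle given as (x0, y0, x1, y1) in metres."""
    x0, y0, x1, y1 = rect
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def window_to_wall_ratio(wall, windows):
    """Total window area divided by gross wall area.

    Assumption: windows do not overlap and lie inside the wall rectangle.
    """
    wall_area = rect_area(wall)
    if wall_area == 0.0:
        raise ValueError("wall has zero area")
    return sum(rect_area(w) for w in windows) / wall_area


# Example facade: a 10 m x 3 m wall with two windows (2 m^2 and 3 m^2).
wall = (0.0, 0.0, 10.0, 3.0)
windows = [(1.0, 1.0, 3.0, 2.0), (5.0, 1.0, 7.0, 2.5)]
wwr = window_to_wall_ratio(wall, windows)
```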

LG · Oct 21, 2019
Building Efficient CNNs Using Depthwise Convolutional Eigen-Filters (DeCEF)

Yinan Yu, Samuel Scheidegger, Tomas McKelvey

Deep Convolutional Neural Networks (CNNs) have been widely used in various domains due to their impressive capabilities. These models are typically composed of a large number of 2D convolutional (Conv2D) layers with numerous trainable parameters. To reduce the complexity of a network, compression techniques can be applied. These methods typically rely on the analysis of trained deep learning models. However, in some applications, due to reasons such as particular data or system specifications and licensing restrictions, a pre-trained network may not be available. This would require the user to train a CNN from scratch. In this paper, we aim to find an alternative parameterization to Conv2D filters without relying on a pre-trained convolutional network. During the analysis, we observe that the effective rank of the vectorized Conv2D filters decreases with increasing depth in the network, which leads to the implementation of the Depthwise Convolutional Eigen-Filter (DeCEF) layer. Essentially, a DeCEF layer is a low-rank version of the Conv2D layer with significantly fewer trainable parameters and floating point operations (FLOPs). The way we define the effective rank differs from previous work and is easy to implement in any deep learning framework. To evaluate the effectiveness of DeCEF, experiments are conducted on the benchmark datasets CIFAR-10 and ImageNet using various network architectures. The results show similar or higher accuracy and robustness using about 2/3 of the original parameters and reducing the number of FLOPs to 2/3 of the base network. These results are then compared with state-of-the-art techniques.
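To see how a low-rank layer can cut trainable parameters, it helps to count them. The factorization below is an assumed, generic low-rank form chosen for illustration: each input channel gets `rank` basis filters of size k x k, followed by a 1 x 1 mixing step. The abstract does not specify the DeCEF layer at this level of detail, so this should not be read as the paper's exact layer. Biases are ignored and the function names are hypothetical.

```python
def conv2d_params(c_in, c_out, k):
    """Weights in a standard Conv2D layer (no bias): one k x k filter per channel pair."""
    return c_in * c_out * k * k


def low_rank_params(c_in, c_out, k, rank):
    """Weights in an assumed low-rank form (illustrative, not the paper's definition).

    Each input channel has `rank` basis filters of size k x k, and a 1 x 1
    mixing step maps the c_in * rank responses to c_out output channels.
    """
    basis = c_in * rank * k * k
    mixing = c_in * rank * c_out
    return basis + mixing


full = conv2d_params(64, 128, 3)                # 64 * 128 * 9
reduced = low_rank_params(64, 128, 3, rank=4)   # 64 * 4 * 9 + 64 * 4 * 128
ratio = reduced / full
```

Under this assumed form the saving is governed mainly by `rank` relative to k * k, which is why a drop in effective rank with depth translates into fewer parameters in the deeper layers.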

CV · Feb 27, 2018
Mono-Camera 3D Multi-Object Tracking Using Deep Learning Detections and PMBM Filtering

Samuel Scheidegger, Joachim Benjaminsson, Emil Rosenberg et al.

Monocular cameras are one of the most commonly used sensors in the automotive industry for autonomous vehicles. One major drawback of using a monocular camera is that it only makes observations in the two-dimensional image plane and cannot directly measure the distance to objects. In this paper, we aim at filling this gap by developing a multi-object tracking algorithm that takes an image as input and produces trajectories of detected objects in a world coordinate system. We solve this by using a deep neural network trained to detect and estimate the distance to objects from a single input image. The detections from a sequence of images are fed into a state-of-the-art Poisson multi-Bernoulli mixture (PMBM) tracking filter. The combination of the learned detector and the PMBM filter results in an algorithm that achieves 3D tracking using only mono-camera images as input. The performance of the algorithm is evaluated both in 3D world coordinates and in 2D image coordinates, using the publicly available KITTI object tracking dataset. The algorithm shows the ability to accurately track objects and correctly handle data associations, even when there is a large overlap of the objects in the image, and is one of the top-performing algorithms on the KITTI object tracking benchmark. Furthermore, the algorithm is efficient, running on average at close to 20 frames per second.
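Turning a 2D detection plus an estimated distance into a 3D position is, at its simplest, a pinhole back-projection. The sketch below is a minimal illustration under stated assumptions: an ideal pinhole camera without lens distortion, intrinsics `fx`, `fy`, `cx`, `cy` in pixels, and a distance measured along the viewing ray (not depth along the optical axis). The function name and the example values are hypothetical, and the result is in camera coordinates; a world-frame position would additionally need the camera pose.

```python
import math


def backproject(u, v, distance, fx, fy, cx, cy):
    """Map pixel (u, v) and a range estimate to a 3D point in camera coordinates.

    Convention (assumption): x right, y down, z forward along the optical axis.
    """
    # Direction of the viewing ray through the pixel, before normalisation.
    rx = (u - cx) / fx
    ry = (v - cy) / fy
    rz = 1.0
    norm = math.sqrt(rx * rx + ry * ry + rz * rz)
    # Scale the ray so that its length equals the estimated distance.
    s = distance / norm
    return (rx * s, ry * s, rz * s)


# A detection at the principal point lies straight ahead of the camera.
centre = backproject(600.0, 180.0, 10.0, fx=700.0, fy=700.0, cx=600.0, cy=180.0)
# An off-centre detection keeps the same range but gains lateral offset.
offset = backproject(950.0, 180.0, 10.0, fx=700.0, fy=700.0, cx=600.0, cy=180.0)
```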

CV · Mar 10, 2017
Fast LIDAR-based Road Detection Using Fully Convolutional Neural Networks

Luca Caltagirone, Samuel Scheidegger, Lennart Svensson et al.

In this work, a deep learning approach has been developed to carry out road detection using only LIDAR data. Starting from an unstructured point cloud, top-view images encoding several basic statistics such as mean elevation and density are generated. By considering a top-view representation, road detection is reduced to a single-scale problem that can be addressed with a simple and fast fully convolutional neural network (FCN). The FCN is specifically designed for the task of pixel-wise semantic segmentation by combining a large receptive field with high-resolution feature maps. The proposed system achieved excellent performance and it is among the top-performing algorithms on the KITTI road benchmark. Its fast inference makes it particularly suitable for real-time applications.
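The top-view encoding described above can be sketched as a simple binning step. The example below is a rough illustration, assuming points are (x, y, z) tuples in metres, a square grid of side `cell_size`, and "density" taken as the raw point count per cell; the paper's actual grid extent, resolution, and full set of statistics are not given in the abstract. The function name `top_view_stats` is hypothetical.

```python
import math
from collections import defaultdict


def top_view_stats(points, cell_size=0.1):
    """Bin an unstructured point cloud into a top-view grid.

    Returns a dict mapping grid index (i, j) to the mean elevation and the
    number of points (used here as a stand-in for density) in that cell.
    """
    z_sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        z_sums[key] += z
        counts[key] += 1
    return {
        key: {"mean_elevation": z_sums[key] / counts[key], "density": counts[key]}
        for key in counts
    }


# Three points: two fall in cell (0, 0), one in cell (1, 0).
cloud = [(0.2, 0.3, 1.0), (0.8, 0.9, 3.0), (1.5, 0.1, 5.0)]
grid = top_view_stats(cloud, cell_size=1.0)
```

Each statistic can then be written into its own image channel, which gives the fixed-size, single-scale input that a fully convolutional network expects.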