Josef Pauli

CV
5papers
878citations
Novelty39%
AI Score28

5 Papers

CVJun 22, 2022
Efficient fine-grained road segmentation using superpixel-based CNN and CRF models

Farnoush Zohourian, Jan Siegemund, Mirko Meuter et al.

Towards a safe and comfortable driving, road scene segmentation is a rudimentary problem in camera-based advance driver assistance systems (ADAS). Despite of the great achievement of Convolutional Neural Networks (CNN) for semantic segmentation task, the high computational efforts of CNN based methods is still a challenging area. In recent work, we proposed a novel approach to utilise the advantages of CNNs for the task of road segmentation at reasonable computational effort. The runtime benefits from using irregular super pixels as basis for the input for the CNN rather than the image grid, which tremendously reduces the input size. Although, this method achieved remarkable low computational time in both training and testing phases, the lower resolution of the super pixel domain yields naturally lower accuracy compared to high cost state of the art methods. In this work, we focus on a refinement of the road segmentation utilising a Conditional Random Field (CRF).The refinement procedure is limited to the super pixels touching the predicted road boundary to keep the additional computational effort low. Reducing the input to the super pixel domain allows the CNNs structure to stay small and efficient to compute while keeping the advantage of convolutional layers and makes them eligible for ADAS. Applying CRF compensate the trade off between accuracy and computational efficiency. The proposed system obtained comparable performance among the top performing algorithms on the KITTI road benchmark and its fast inference makes it particularly suitable for realtime applications.

ROAug 7, 2024
Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning

Martin Moder, Stephen Adhisaputra, Josef Pauli

This paper addresses navigation in crowded environments by integrating goal-conditioned generative models with Sampling-based Model Predictive Control (SMPC). We introduce goal-conditioned autoregressive models to generate crowd behaviors, capturing intricate interactions among individuals. The model processes potential robot trajectory samples and predicts the reactions of surrounding individuals, enabling proactive robotic navigation in complex scenarios. Extensive experiments show that this algorithm enables real-time navigation, significantly reducing collision rates and path lengths, and outperforming selected baseline methods. The practical effectiveness of this algorithm is validated on an actual robotic platform, demonstrating its capability in dynamic settings.

IVJan 17, 2020
CHAOS Challenge -- Combined (CT-MR) Healthy Abdominal Organ Segmentation

A. Emre Kavur, N. Sinem Gezer, Mustafa Barış et al.

Segmentation of abdominal organs has been a comprehensive, yet unresolved, research field for many years. In the last decade, intensive developments in deep learning (DL) have introduced new state-of-the-art segmentation systems. In order to expand the knowledge on these topics, the CHAOS - Combined (CT-MR) Healthy Abdominal Organ Segmentation challenge has been organized in conjunction with IEEE International Symposium on Biomedical Imaging (ISBI), 2019, in Venice, Italy. CHAOS provides both abdominal CT and MR data from healthy subjects for single and multiple abdominal organ segmentation. Five different but complementary tasks have been designed to analyze the capabilities of current approaches from multiple perspectives. The results are investigated thoroughly, compared with manual annotations and interactive methods. The analysis shows that the performance of DL models for single modality (CT / MR) can show reliable volumetric analysis performance (DICE: 0.98 $\pm$ 0.00 / 0.95 $\pm$ 0.01) but the best MSSD performance remain limited (21.89 $\pm$ 13.94 / 20.85 $\pm$ 10.63 mm). The performances of participating models decrease significantly for cross-modality tasks for the liver (DICE: 0.88 $\pm$ 0.15 MSSD: 36.33 $\pm$ 21.97 mm) and all organs (DICE: 0.85 $\pm$ 0.21 MSSD: 33.17 $\pm$ 38.93 mm). Despite contrary examples on different applications, multi-tasking DL models designed to segment all organs seem to perform worse compared to organ-specific ones (performance drop around 5\%). Besides, such directions of further research for cross-modality segmentation would significantly support real-world clinical applications. Moreover, having more than 1500 participants, another important contribution of the paper is the analysis on shortcomings of challenge organizations such as the effects of multiple submissions and peeking phenomena.

CVMar 21, 2019
Deep Learning with Anatomical Priors: Imitating Enhanced Autoencoders in Latent Space for Improved Pelvic Bone Segmentation in MRI

Duc Duy Pham, Gurbandurdy Dovletov, Sebastian Warwas et al.

We propose a 2D Encoder-Decoder based deep learning architecture for semantic segmentation, that incorporates anatomical priors by imitating the encoder component of an autoencoder in latent space. The autoencoder is additionally enhanced by means of hierarchical features, extracted by an U-Net module. Our suggested architecture is trained in an end-to-end manner and is evaluated on the example of pelvic bone segmentation in MRI. A comparison to the standard U-Net architecture shows promising improvements.

AIOct 2, 2015
Online Vision- and Action-Based Object Classification Using Both Symbolic and Subsymbolic Knowledge Representations

Laura Steinert, Jens Hoefinghoff, Josef Pauli

If a robot is supposed to roam an environment and interact with objects, it is often necessary to know all possible objects in advance, so that a database with models of all objects can be generated for visual identification. However, this constraint cannot always be fulfilled. Due to that reason, a model based object recognition cannot be used to guide the robot's interactions. Therefore, this paper proposes a system that analyzes features of encountered objects and then uses these features to compare unknown objects to already known ones. From the resulting similarity appropriate actions can be derived. Moreover, the system enables the robot to learn object categories by grouping similar objects or by splitting existing categories. To represent the knowledge a hybrid form is used, consisting of both symbolic and subsymbolic representations.