Susana Castillo

CV
h-index: 20
Papers: 8
Citations: 67
Novelty: 43%
AI Score: 40

8 Papers

CV | Dec 2, 2022
Fast Non-Rigid Radiance Fields from Monocularized Data

Moritz Kappel, Vladislav Golyanik, Susana Castillo et al.

The reconstruction and novel view synthesis of dynamic scenes have recently gained increased attention. As reconstruction from large-scale multi-view data involves immense memory and computational requirements, recent benchmark datasets provide collections of single monocular views per timestamp sampled from multiple (virtual) cameras. We refer to this form of input as "monocularized" data. Existing work shows impressive results for synthetic setups and forward-facing real-world data, but is often limited in training speed and in the angular range for generating novel views. This paper addresses these limitations and proposes a new method for full 360° inward-facing novel view synthesis of non-rigidly deforming scenes. At the core of our method are: 1) An efficient deformation module that decouples the processing of spatial and temporal information for accelerated training and inference; and 2) A static module representing the canonical scene as a fast hash-encoded neural radiance field. In addition to existing synthetic monocularized data, we systematically analyze the performance on real-world inward-facing scenes using a newly recorded challenging dataset sampled from a synchronized large-scale multi-view rig. In both cases, our method is significantly faster than previous methods, converging in less than 7 minutes and achieving real-time framerates at 1K resolution, while obtaining a higher visual accuracy for generated novel views. Our source code and data are available on our project page: https://graphics.tu-bs.de/publications/kappel2022fast.
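As a rough illustration of what "hash-encoded" can mean for a radiance field, the sketch below computes the feature-table indices of the eight grid corners surrounding a 3D point at one resolution level. It assumes the widely used multi-resolution hash-grid scheme (XOR of coordinates multiplied by fixed primes); the abstract does not confirm that this exact scheme is used, and the function names, table size, and resolution parameters are hypothetical.

```python
# Illustrative sketch only: names and default parameters are assumptions,
# not taken from the paper's implementation.
PRIMES = (1, 2654435761, 805459861)  # primes commonly used for spatial hashing


def hash_vertex(ix, iy, iz, table_size):
    """Map non-negative integer grid coordinates to a slot in a feature table."""
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return h % table_size


def level_resolution(level, base=16, growth=2.0):
    """Grid resolution at a given level; each level is finer by a fixed factor."""
    return int(base * growth ** level)


def corner_indices(point, level, table_size=2 ** 14):
    """Return table indices of the 8 cell corners around a point in [0, 1)^3."""
    res = level_resolution(level)
    cell = [int(c * res) for c in point]  # integer cell containing the point
    return [
        hash_vertex(cell[0] + dx, cell[1] + dy, cell[2] + dz, table_size)
        for dx in (0, 1)
        for dy in (0, 1)
        for dz in (0, 1)
    ]


indices = corner_indices((0.3, 0.6, 0.9), level=2)
print(len(indices), min(indices) >= 0, max(indices) < 2 ** 14)
```

In a full encoder, the features stored at these eight slots would be trilinearly interpolated and concatenated across levels before being passed to a small network; that part is omitted here.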

LG | Dec 19, 2025 | Code
Estimating Spatially Resolved Radiation Fields Using Neural Networks

Felix Lehner, Pasquale Lombardo, Susana Castillo et al.

We present an in-depth analysis of how to build and train neural networks to estimate the spatial distribution of scattered radiation fields for radiation protection dosimetry in medical settings, such as interventional radiology and cardiology. We present three different synthetically generated datasets of increasing complexity for training, using a Monte Carlo simulation application based on Geant4. On those datasets, we evaluate convolutional and fully connected neural network architectures to demonstrate which design decisions work well for reconstructing the fluence and spectra distributions over the spatial domain of such radiation fields. All our datasets, as well as our training pipeline, are published as open source in separate repositories.

LG | Dec 18, 2024 | Code
RadField3D: A Data Generator and Data Format for Deep Learning in Radiation-Protection Dosimetry for Medical Applications

Felix Lehner, Pasquale Lombardo, Susana Castillo et al.

In this research work, we present our open-source Geant4-based Monte Carlo simulation application, called RadField3D, for generating three-dimensional radiation field datasets for dosimetry. Alongside it, we introduce a fast, machine-interpretable data format with a Python API for easy integration into neural network research, which we call RadFiled3D. Both developments are intended to support research into alternative radiation simulation methods using deep learning.

CV | Mar 25, 2024
INPC: Implicit Neural Point Clouds for Radiance Field Rendering

Florian Hahlbohm, Linus Franke, Moritz Kappel et al.

We introduce a new approach for reconstruction and novel view synthesis of unbounded real-world scenes. In contrast to previous methods using either volumetric fields, grid-based models, or discrete point cloud proxies, we propose a hybrid scene representation, which implicitly encodes the geometry in a continuous octree-based probability field and view-dependent appearance in a multi-resolution hash grid. This allows for extraction of arbitrary explicit point clouds, which can be rendered using rasterization. In doing so, we combine the benefits of both worlds and retain favorable behavior during optimization: Our novel implicit point cloud representation and differentiable bilinear rasterizer enable fast rendering while preserving the fine geometric detail captured by volumetric neural fields. Furthermore, this representation does not depend on priors like structure-from-motion point clouds. Our method achieves state-of-the-art image quality on common benchmarks. Furthermore, we achieve fast inference at interactive frame rates, and can convert our trained model into a large, explicit point cloud to further enhance performance.
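To make the idea of bilinear point rasterization more concrete, here is a minimal sketch in which each projected point distributes its value over the four nearest pixels using bilinear weights. Because the weights vary continuously with the point position, this kind of splatting can in principle be differentiated with respect to that position. This is a simplified additive splat written for illustration, not the rasterizer from the paper; the function name and the pixel-center convention are assumptions.

```python
import math


def splat_bilinear(points, width, height):
    """Accumulate point values into a height x width image.

    points: iterable of (x, y, value) in continuous pixel coordinates, where
    integer coordinates denote pixel centers (a convention assumed here).
    """
    image = [[0.0] * width for _ in range(height)]
    for x, y, value in points:
        x0, y0 = math.floor(x), math.floor(y)
        fx, fy = x - x0, y - y0
        # Four neighbouring pixels and their bilinear weights (weights sum to 1).
        neighbours = (
            (0, 0, (1 - fx) * (1 - fy)),
            (1, 0, fx * (1 - fy)),
            (0, 1, (1 - fx) * fy),
            (1, 1, fx * fy),
        )
        for dx, dy, weight in neighbours:
            px, py = x0 + dx, y0 + dy
            if 0 <= px < width and 0 <= py < height:  # skip off-screen pixels
                image[py][px] += weight * value
    return image


image = splat_bilinear([(1.25, 0.5, 1.0)], width=3, height=2)
for row in image:
    print(row)
```

A practical renderer would also handle depth ordering and blending of overlapping points, which this sketch leaves out.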

GR | Aug 26, 2025
A Bag of Tricks for Efficient Implicit Neural Point Clouds

Florian Hahlbohm, Linus Franke, Leon Overkämping et al.

Implicit Neural Point Cloud (INPC) is a recent hybrid representation that combines the expressiveness of neural fields with the efficiency of point-based rendering, achieving state-of-the-art image quality in novel view synthesis. However, as with other high-quality approaches that query neural networks during rendering, the practical usability of INPC is limited by comparatively slow rendering. In this work, we present a collection of optimizations that significantly improve both the training and inference performance of INPC without sacrificing visual fidelity. The most significant modifications are an improved rasterizer implementation, more effective sampling techniques, and the incorporation of pre-training for the convolutional neural network used for hole-filling. Furthermore, we demonstrate that points can be modeled as small Gaussians during inference to further improve quality in extrapolated, e.g., close-up views of the scene. We design our implementations to be broadly applicable beyond INPC and systematically evaluate each modification in a series of experiments. Our optimized INPC pipeline achieves up to 25% faster training, 2x faster rendering, and 20% reduced VRAM usage paired with slight image quality improvements.

CV | Jun 14, 2024
D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video

Moritz Kappel, Florian Hahlbohm, Timon Scholz et al.

Dynamic reconstruction and spatiotemporal novel-view synthesis of non-rigidly deforming scenes recently gained increased attention. While existing work achieves impressive quality and performance on multi-view or teleporting camera setups, most methods fail to efficiently and faithfully recover motion and appearance from casual monocular captures. This paper contributes to the field by introducing a new method for dynamic novel view synthesis from monocular video, such as casual smartphone captures. Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point distribution that encodes local geometry and appearance in separate hash-encoded neural feature grids for static and dynamic regions. By sampling a discrete point cloud from our model, we can efficiently render high-quality novel views using a fast differentiable rasterizer and neural rendering network. Similar to recent work, we leverage advances in neural scene analysis by incorporating data-driven priors like monocular depth estimation and object segmentation to resolve motion and depth ambiguities originating from the monocular captures. In addition to guiding the optimization process, we show that these priors can be exploited to explicitly initialize our scene representation to drastically improve optimization speed and final image quality. As evidenced by our experimental evaluation, our dynamic point cloud model not only enables fast optimization and real-time frame rates for interactive applications, but also achieves competitive image quality on monocular benchmark sequences. Our code and data are available online: https://moritzkappel.github.io/projects/dnpc/.

CV | Dec 13, 2021
N-SfC: Robust and Fast Shape Estimation from Caustic Images

Marc Kassubeck, Moritz Kappel, Susana Castillo et al.

This paper deals with the highly challenging problem of reconstructing the shape of a refracting object from a single image of its resulting caustic. Due to the ubiquity of transparent refracting objects in everyday life, reconstruction of their shape entails a multitude of practical applications. The recent Shape from Caustics (SfC) method casts the problem as the inverse of a light propagation simulation for synthesis of the caustic image, which can be solved by a differentiable renderer. However, the inherent complexity of light transport through refracting surfaces currently limits the practicability with respect to reconstruction speed and robustness. To address these issues, we introduce Neural-Shape from Caustics (N-SfC), a learning-based extension that incorporates two components into the reconstruction pipeline: a denoising module, which alleviates the computational cost of the light transport simulation, and an optimization process based on learned gradient descent, which enables better convergence using fewer iterations. Extensive experiments demonstrate the effectiveness of our neural extensions in the scenario of quality control in 3D glass printing, where we significantly outperform the current state-of-the-art in terms of computational speed and final surface error.

CV | Dec 20, 2020
High-Fidelity Neural Human Motion Transfer from Monocular Video

Moritz Kappel, Vladislav Golyanik, Mohamed Elgharib et al.

Video-based human motion transfer creates video animations of humans following a source motion. Current methods show remarkable results for tightly-clad subjects. However, the lack of temporally consistent handling of plausible clothing dynamics, including fine and high-frequency details, significantly limits the attainable visual quality. We address these limitations for the first time in the literature and present a new framework which performs high-fidelity and temporally-consistent human motion transfer with natural pose-dependent non-rigid deformations, for several types of loose garments. In contrast to the previous techniques, we perform image generation in three subsequent stages, synthesizing human shape, structure, and appearance. Given a monocular RGB video of an actor, we train a stack of recurrent deep neural networks that generate these intermediate representations from 2D poses and their temporal derivatives. Splitting the difficult motion transfer problem into subtasks that are aware of the temporal motion context helps us to synthesize results with plausible dynamics and pose-dependent detail. It also allows artistic control of results by manipulation of individual framework stages. In the experimental results, we significantly outperform the state-of-the-art in terms of video realism. Our code and data will be made publicly available.
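The abstract mentions feeding 2D poses together with their temporal derivatives to the networks. One simple way such derivatives could be obtained is by backward finite differences between consecutive frames, as sketched below. The abstract does not state how the derivatives are actually computed, so the differencing scheme, the function name, and the zero velocity for the first frame are all assumptions made for illustration.

```python
def pose_velocities(poses, dt=1.0):
    """Approximate per-joint velocities with backward finite differences.

    poses: list of frames, each a list of (x, y) joint positions.
    dt: time between frames (hypothetical parameter, defaults to one frame).
    The first frame has no predecessor, so its velocity is set to zero.
    """
    velocities = []
    for t, frame in enumerate(poses):
        previous = poses[t - 1] if t > 0 else frame
        velocities.append(
            [((x - px) / dt, (y - py) / dt)
             for (x, y), (px, py) in zip(frame, previous)]
        )
    return velocities


poses = [
    [(0.0, 0.0), (1.0, 1.0)],  # frame 0: two joints
    [(1.0, 2.0), (1.0, 1.5)],  # frame 1
]
print(pose_velocities(poses))
```

Each frame's joint positions and velocities could then be concatenated into a single input vector, giving the model access to the temporal motion context as well as the static pose.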