Timo Teufel

h-index2

4papers

16citations

Novelty45%

AI Score39

Ranked #77,309 of 194,257 authors (top 40%)#26,174 in CV (top 44%)

4 Papers

14.0GRApr 30

D-Rex : Diffusion Rendering for Relightable Expressive Avatars

Timo Teufel, Xilong Zhou, Umar Iqbal et al.

We present D-Rex, a person-specific framework for photorealistic, relightable, expressive, and animatable full-body human avatars with free-viewpoint rendering. Existing methods for relightable full-body avatars rely on explicit 3D intrinsic decomposition with analytic reflectance models, which require accurate geometry registration and careful optimization to capture realistic light transport effects. This tight coupling of relighting with avatar modeling has hindered expressiveness: to our knowledge, no existing method demonstrates strong facial animation alongside relighting, limiting applicability in telepresence, gaming, and virtual production. We propose to decouple relighting entirely from avatar modeling by treating it as an image-space post-process: a learned translation from flat-lit, albedo-like renderings to a target HDR illumination. To this end, we leverage the strong generative prior of a pre-trained video diffusion relighting model, fine-tuned via LoRA on paired flat-lit and relit frames captured in a light stage. The flat-lit driving frames are produced by an independent expressive full-body avatar framework trained under white-light conditions, requiring no modification to support relighting, making D-Rex directly applicable to any white-light avatar system. We demonstrate that D-Rex enables view- and temporally consistent relighting while faithfully preserving expressive motion and fine-grained facial detail, outperforming physically-based relightable avatar baselines. Project page is https://vcai.mpi-inf.mpg.de/projects/DRex/

2.8CVOct 22, 2023

A Quantitative Evaluation of Dense 3D Reconstruction of Sinus Anatomy from Monocular Endoscopic Video

Jan Emily Mangulabnan, Roger D. Soberanis-Mukul, Timo Teufel et al.

Generating accurate 3D reconstructions from endoscopic video is a promising avenue for longitudinal radiation-free analysis of sinus anatomy and surgical outcomes. Several methods for monocular reconstruction have been proposed, yielding visually pleasant 3D anatomical structures by retrieving relative camera poses with structure-from-motion-type algorithms and fusion of monocular depth estimates. However, due to the complex properties of the underlying algorithms and endoscopic scenes, the reconstruction pipeline may perform poorly or fail unexpectedly. Further, acquiring medical data conveys additional challenges, presenting difficulties in quantitatively benchmarking these models, understanding failure cases, and identifying critical components that contribute to their precision. In this work, we perform a quantitative analysis of a self-supervised approach for sinus reconstruction using endoscopic sequences paired with optical tracking and high-resolution computed tomography acquired from nine ex-vivo specimens. Our results show that the generated reconstructions are in high agreement with the anatomy, yielding an average point-to-mesh error of 0.91 mm between reconstructions and CT segmentations. However, in a point-to-point matching scenario, relevant for endoscope tracking and navigation, we found average target registration errors of 6.58 mm. We identified that pose and depth estimation inaccuracies contribute equally to this error and that locally consistent sequences with shorter trajectories generate more accurate reconstructions. These results suggest that achieving global consistency between relative camera poses and estimated depths with the anatomy is essential. In doing so, we can ensure proper synergy between all components of the pipeline for improved reconstructions that will facilitate clinical application of this innovative technology.

8.4CVNov 4, 2025

OLATverse: A Large-scale Real-world Object Dataset with Precise Lighting Control

Xilong Zhou, Jianchun Chen, Pramod Rao et al.

We introduce OLATverse, a large-scale dataset comprising around 9M images of 765 real-world objects, captured from multiple viewpoints under a diverse set of precisely controlled lighting conditions. While recent advances in object-centric inverse rendering, novel view synthesis and relighting have shown promising results, most techniques still heavily rely on the synthetic datasets for training and small-scale real-world datasets for benchmarking, which limits their realism and generalization. To address this gap, OLATverse offers two key advantages over existing datasets: large-scale coverage of real objects and high-fidelity appearance under precisely controlled illuminations. Specifically, OLATverse contains 765 common and uncommon real-world objects, spanning a wide range of material categories. Each object is captured using 35 DSLR cameras and 331 individually controlled light sources, enabling the simulation of diverse illumination conditions. In addition, for each object, we provide well-calibrated camera parameters, accurate object masks, photometric surface normals, and diffuse albedo as auxiliary resources. We also construct an extensive evaluation set, establishing the first comprehensive real-world object-centric benchmark for inverse rendering and normal estimation. We believe that OLATverse represents a pivotal step toward integrating the next generation of inverse rendering and relighting methods with real-world data. The full dataset, along with all post-processing workflows, will be publicly released at https://vcai.mpi-inf.mpg.de/projects/OLATverse/.

3.7CVFeb 19, 2024

An Endoscopic Chisel: Intraoperative Imaging Carves 3D Anatomical Models

Jan Emily Mangulabnan, Roger D. Soberanis-Mukul, Timo Teufel et al.

Purpose: Preoperative imaging plays a pivotal role in sinus surgery where CTs offer patient-specific insights of complex anatomy, enabling real-time intraoperative navigation to complement endoscopy imaging. However, surgery elicits anatomical changes not represented in the preoperative model, generating an inaccurate basis for navigation during surgery progression. Methods: We propose a first vision-based approach to update the preoperative 3D anatomical model leveraging intraoperative endoscopic video for navigated sinus surgery where relative camera poses are known. We rely on comparisons of intraoperative monocular depth estimates and preoperative depth renders to identify modified regions. The new depths are integrated in these regions through volumetric fusion in a truncated signed distance function representation to generate an intraoperative 3D model that reflects tissue manipulation. Results: We quantitatively evaluate our approach by sequentially updating models for a five-step surgical progression in an ex vivo specimen. We compute the error between correspondences from the updated model and ground-truth intraoperative CT in the region of anatomical modification. The resulting models show a decrease in error during surgical progression as opposed to increasing when no update is employed. Conclusion: Our findings suggest that preoperative 3D anatomical models can be updated using intraoperative endoscopy video in navigated sinus surgery. Future work will investigate improvements to monocular depth estimation as well as removing the need for external navigation systems. The resulting ability to continuously update the patient model may provide surgeons with a more precise understanding of the current anatomical state and paves the way toward a digital twin paradigm for sinus surgery.