Raphael Frank

CV
h-index5
5papers
10citations
Novelty46%
AI Score41

5 Papers

CVSep 16, 2024Code
DENSER: 3D Gaussians Splatting for Scene Reconstruction of Dynamic Urban Environments

Mahmud A. Mohamad, Gamal Elghazaly, Arthur Hubert et al.

This paper presents DENSER, an efficient and effective approach leveraging 3D Gaussian splatting (3DGS) for the reconstruction of dynamic urban environments. While several methods for photorealistic scene representations, both implicitly using neural radiance fields (NeRF) and explicitly using 3DGS have shown promising results in scene reconstruction of relatively complex dynamic scenes, modeling the dynamic appearance of foreground objects tend to be challenging, limiting the applicability of these methods to capture subtleties and details of the scenes, especially far dynamic objects. To this end, we propose DENSER, a framework that significantly enhances the representation of dynamic objects and accurately models the appearance of dynamic objects in the driving scene. Instead of directly using Spherical Harmonics (SH) to model the appearance of dynamic objects, we introduce and integrate a new method aiming at dynamically estimating SH bases using wavelets, resulting in better representation of dynamic objects appearance in both space and time. Besides object appearance, DENSER enhances object shape representation through densification of its point cloud across multiple scene frames, resulting in faster convergence of model training. Extensive evaluations on KITTI dataset show that the proposed approach significantly outperforms state-of-the-art methods by a wide margin. Source codes and models will be uploaded to this repository https://github.com/sntubix/denser

AIMay 2
MILD: Mediator Agent System with Bidirectional Perception and Multi-Layered Alignment for Human-Vehicle Collaboration

Jiyao Wang, Yunbiao Wang, Yubo Jiao et al.

Prior studies report that partial driving automation can increase the cognitive demands on human drivers. This effect largely arises from human drivers' lack of transparent insight into the vehicle's intentions and decision logic, as well as from automated systems' limited awareness of the driver's dynamic state and preferences. This bidirectional misalignment undermines shared situational awareness and exacerbates coordination failures in human-vehicle interaction. To address these limitations, we argue for a paradigm shift that elevates the human role from passive supervisor to active manager. We introduce the Mediator-in-the-Loop-Driving (MILD) system, based on an agentic system architecture to facilitate synergistic human-vehicle collaboration. MILD integrates a perception agent for joint in-cabin and out-of-cabin understanding with a lightweight strategy agent that generates compliant and explainable action suggestions. To ensure these strategies are strictly aligned with safety regulations and human values, we develop Evidence- and Constraint-weighted Policy Optimization (ECPO). ECPO leverages automatic validators to steer the agent toward behaviors that are not only accurate but also structurally complete, substantiated by evidence, and free from constraint violations. Furthermore, a retrieval-augmented generation module dynamically incorporates constraints from traffic regulations, speed recommendations, and driver preferences into the decision loop. Field experiments across three open datasets demonstrate that MILD consistently outperforms baselines in both perception accuracy and strategy quality under auditable offline metrics, and yields higher human-rated policy adequacy, comfort, and explanation than baselines. This work offers a practical pathway for building auditable and aligned agents for human-vehicle collaborative driving.

CYMar 19
Agentic Vehicles for Human-Centered Mobility: Definition, Prospects, and System Implications

Jiangbo Yu, Raphael Frank, Luis Miranda-Moreno et al.

Autonomy, from the Greek autos (self) and nomos (law), refers to the capacity to operate according to internal rules without external control. Autonomous vehicles (AuVs) are therefore understood as systems that perceive their environment and execute pre-programmed tasks independently of external input, consistent with the SAE levels of automated driving. Yet recent research and real-world deployments have begun to showcase vehicles that exhibit behaviors outside the scope of this definition. These include natural language interaction with humans, goal adaptation, contextual reasoning, external tool use, and the handling of unforeseen ethical dilemmas, enabled in part by multimodal large language models (LLMs). These developments highlight not only a gap between technical autonomy and the broader cognitive and social capacities required for human-centered mobility, but also the emergence of a form of vehicle intelligence that currently lacks a clear designation. To address this gap, the paper introduces the concept of agentic vehicles (AgVs): vehicles that exhibit agency, the capacity for goal-driven reasoning, strategic adaptation, self-reflection, and purposeful engagement with complex environments. We conclude by outlining key challenges in the development and governance of AgVs and their potential role in shaping future agentic transportation systems that align with user and societal needs.

CVFeb 14, 2025
Leveraging V2X for Collaborative HD Maps Construction Using Scene Graph Generation

Gamal Elghazaly, Raphael Frank

High-Definition (HD) maps play a crucial role in autonomous vehicle navigation, complementing onboard perception sensors for improved accuracy and safety. Traditional HD map generation relies on dedicated mapping vehicles, which are costly and fail to capture real-time infrastructure changes. This paper presents HDMapLaneNet, a novel framework leveraging V2X communication and Scene Graph Generation to collaboratively construct a localized geometric layer of HD maps. The approach extracts lane centerlines from front-facing camera images, represents them as graphs, and transmits the data for global aggregation to the cloud via V2X. Preliminary results on the nuScenes dataset demonstrate superior association prediction performance compared to a state-of-the-art method.

CVDec 23, 2024
Editing Implicit and Explicit Representations of Radiance Fields: A Survey

Arthur Hubert, Gamal Elghazaly, Raphael Frank

Neural Radiance Fields (NeRF) revolutionized novel view synthesis in recent years by offering a new volumetric representation, which is compact and provides high-quality image rendering. However, the methods to edit those radiance fields developed slower than the many improvements to other aspects of NeRF. With the recent development of alternative radiance field-based representations inspired by NeRF as well as the worldwide rise in popularity of text-to-image models, many new opportunities and strategies have emerged to provide radiance field editing. In this paper, we deliver a comprehensive survey of the different editing methods present in the literature for NeRF and other similar radiance field representations. We propose a new taxonomy for classifying existing works based on their editing methodologies, review pioneering models, reflect on current and potential new applications of radiance field editing, and compare state-of-the-art approaches in terms of editing options and performance.