QUANT-PH Dec 14, 2025
Quantum Implicit Neural Representations for 3D Scene Reconstruction and Novel View Synthesis
Yeray Cordero, Paula García-Molina, Fernando Vilariño
Implicit neural representations (INRs) have become a powerful paradigm for continuous signal modeling and 3D scene reconstruction, yet classical networks suffer from a well-known spectral bias that limits their ability to capture high-frequency details. Quantum Implicit Representation Networks (QIREN) mitigate this limitation by employing parameterized quantum circuits with inherent Fourier structures, enabling compact and expressive frequency modeling beyond classical MLPs. In this paper, we present Quantum Neural Radiance Fields (Q-NeRF), the first hybrid quantum-classical framework for neural radiance field rendering. Q-NeRF integrates QIREN modules into the Nerfacto backbone, preserving its efficient sampling, pose refinement, and volumetric rendering strategies while replacing selected density and radiance prediction components with quantum-enhanced counterparts. We systematically evaluate three hybrid configurations on standard multi-view indoor datasets, comparing them to classical baselines using PSNR, SSIM, and LPIPS metrics. Results show that hybrid quantum-classical models achieve competitive reconstruction quality under limited computational resources, with quantum modules particularly effective in representing fine-scale, view-dependent appearance. Although current implementations rely on quantum circuit simulators constrained to few-qubit regimes, the results highlight the potential of quantum encodings to alleviate spectral bias in implicit representations. Q-NeRF provides a foundational step toward scalable quantum-enabled 3D scene reconstruction and a baseline for future quantum neural rendering research.
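The abstract above reports reconstruction quality in terms of PSNR, SSIM, and LPIPS. As a reminder of what the first of these measures, here is a minimal sketch of PSNR, assuming images are given as flat lists of pixel intensities in [0, 1]. The function name `psnr` and its parameters are illustrative and are not taken from the paper or from Nerfacto.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities (an assumption made
    for this sketch); max_value is the largest possible intensity.
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the rendered view is closer to the reference in a per-pixel sense. SSIM and LPIPS complement it by measuring structural and perceptual similarity, which this sketch does not cover.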
CV Aug 28, 2025
CraftGraffiti: Exploring Human Identity with Custom Graffiti Art via Facial-Preserving Diffusion Models
Ayan Banerjee, Fernando Vilariño, Josep Lladós
Preserving facial identity under extreme stylistic transformation remains a major challenge in generative art. In graffiti, a high-contrast, abstract medium, subtle distortions to the eyes, nose, or mouth can erase the subject's recognizability, undermining both personal and cultural authenticity. We present CraftGraffiti, an end-to-end text-guided graffiti generation framework designed with facial feature preservation as a primary objective. Given an input image and a prompt describing style and pose, CraftGraffiti first applies graffiti style transfer via a LoRA-fine-tuned pretrained diffusion transformer, then enforces identity fidelity through a face-consistent self-attention mechanism that augments attention layers with explicit identity embeddings. Pose customization is achieved without keypoints, using CLIP-guided prompt extension to enable dynamic re-posing while retaining facial coherence. We formally justify and empirically validate the "style-first, identity-after" paradigm, showing that it reduces attribute drift compared to the reverse order. Quantitative results demonstrate competitive facial feature consistency and state-of-the-art aesthetic and human preference scores, while qualitative analyses and a live deployment at the Cruilla Festival highlight the system's real-world creative impact. CraftGraffiti advances the goal of identity-respectful AI-assisted artistry, offering a principled approach for blending stylistic freedom with recognizability in creative AI applications.
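The abstract above mentions self-attention layers augmented with explicit identity embeddings but does not spell out the mechanism. One common way to realize this idea, shown here purely as an assumption and not as the authors' design, is to append identity tokens to the keys and values so that every image token can also attend to them. The sketch omits learned query/key/value projections, and all function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """Scaled dot-product attention on lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def identity_augmented_attention(tokens, identity_tokens):
    """Self-attention in which image tokens also attend to identity tokens.

    Assumption for illustration: identity embeddings are appended to the
    keys and values only, so the output keeps one vector per image token.
    """
    extended = tokens + identity_tokens
    return attention(tokens, extended, extended)
```

For example, `identity_augmented_attention([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]])` returns two output vectors, each pulled toward the identity token `[1.0, 1.0]` relative to plain self-attention over the two image tokens.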