GR · CV · LG · May 28, 2025

RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination

Stanford
arXiv:2505.21925v1 · 18 citations · h-index: 30 · Has Code · SIGGRAPH
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficient and accurate rendering for computer graphics applications. Its approach, applying transformers to this domain, is rated an incremental advance.

The authors tackle neural rendering of triangle meshes with global illumination by introducing RenderFormer, a transformer-based pipeline that generates images directly, without per-scene training, achieving competitive results on scenes of varying complexity.

We present RenderFormer, a neural rendering pipeline that directly renders an image from a triangle-based representation of a scene with full global illumination effects and that does not require per-scene training or fine-tuning. Instead of taking a physics-centric approach to rendering, we formulate rendering as a sequence-to-sequence transformation where a sequence of tokens representing triangles with reflectance properties is converted to a sequence of output tokens representing small patches of pixels. RenderFormer follows a two-stage pipeline: a view-independent stage that models triangle-to-triangle light transport, and a view-dependent stage that transforms a token representing a bundle of rays to the corresponding pixel values guided by the triangle sequence from the view-independent stage. Both stages are based on the transformer architecture and are learned with minimal prior constraints. We demonstrate and evaluate RenderFormer on scenes with varying complexity in shape and light transport.
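
The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the token layouts, the names `triangle_token`, `ray_bundle_token`, `attend`, and `render`, and the 12-dimensional token size are all assumptions made for illustration, and the attention here has no learned weights. It only shows how triangle tokens might pass through a view-independent stage and then be queried by ray-bundle tokens to produce one small pixel patch per bundle.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys_values):
    """Single-head dot-product attention WITHOUT learned projections (toy)."""
    dim = len(keys_values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, kv)) / math.sqrt(dim)
                  for kv in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[d] for w, kv in zip(weights, keys_values))
                    for d in range(dim)])
    return out

def triangle_token(vertices, reflectance):
    # Hypothetical layout: 9 vertex coordinates followed by 3 reflectance values.
    return [float(c) for v in vertices for c in v] + [float(r) for r in reflectance]

def ray_bundle_token(origin, directions, dim):
    # Hypothetical layout: camera origin + mean ray direction, zero-padded to dim.
    n = len(directions)
    mean_dir = [sum(d[i] for d in directions) / n for i in range(3)]
    token = [float(c) for c in origin] + mean_dir
    return token + [0.0] * (dim - len(token))

def render(triangles, origin, bundles, patch_size=2):
    tokens = [triangle_token(v, r) for v, r in triangles]
    # Stage 1 (view-independent): triangle tokens attend to each other,
    # standing in for triangle-to-triangle light transport.
    tokens = attend(tokens, tokens)
    # Stage 2 (view-dependent): each ray-bundle token queries the triangle sequence.
    queries = [ray_bundle_token(origin, dirs, len(tokens[0])) for dirs in bundles]
    mixed = attend(queries, tokens)
    # Toy decoder: reuse the mixed reflectance channels as the colour of
    # every pixel in the patch (a real model would learn this mapping).
    return [[m[-3:] for _ in range(patch_size * patch_size)] for m in mixed]

scene = [
    (((0, 0, 0), (1, 0, 0), (0, 1, 0)), (0.8, 0.2, 0.2)),
    (((0, 0, 1), (1, 0, 1), (0, 1, 1)), (0.2, 0.2, 0.8)),
]
bundles = [[(0, 0, 1), (0.1, 0, 1)], [(0, 0.1, 1), (0.1, 0.1, 1)]]
patches = render(scene, (0.0, 0.0, -1.0), bundles)
```

Here `patches` holds one entry per ray bundle, each a list of `patch_size * patch_size` RGB triples. Because the toy attention only forms convex combinations of its inputs, the output colours stay within the range of the input reflectances; a trained model would not be restricted this way.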

Code Implementations: 1 repo
