CV · LG · Apr 5

TORA: Topological Representation Alignment for 3D Shape Assembly

arXiv:2604.04050 · 53.91 citations
AI Analysis

This work addresses the challenge of efficient and accurate 3D shape assembly for applications in robotics and computer vision, representing an incremental improvement over existing flow-matching methods.

The paper tackles 3D shape assembly by introducing TORA, a framework that distills relational structure from a frozen pretrained 3D encoder into a flow-matching backbone during training. The reported benefits are faster convergence (up to 6.9×), improved in-distribution accuracy, greater robustness under domain shift, and state-of-the-art performance on five benchmarks.

Flow-matching methods for 3D shape assembly learn point-wise velocity fields that transport parts toward assembled configurations, yet they receive no explicit guidance about which cross-part interactions should drive the motion. We introduce TORA, a topology-first representation alignment framework that distills relational structure from a frozen pretrained 3D encoder into the flow-matching backbone during training. We first realize this via a simple instantiation, token-wise cosine matching, which injects the learned geometric descriptors from the teacher representation. We then extend it with a Centered Kernel Alignment (CKA) loss that matches the similarity structure between student and teacher representations for enhanced topological alignment. Through systematic probing of diverse 3D encoders, we show that geometry- and contact-centric teacher properties, not semantic classification ability, govern alignment effectiveness, and that alignment is most beneficial at later transformer layers, where spatial structure naturally emerges. TORA introduces zero inference overhead while yielding two consistent benefits: faster convergence (up to 6.9×) and improved accuracy in-distribution, along with greater robustness under domain shift. Experiments on five benchmarks spanning geometric, semantic, and inter-object assembly demonstrate state-of-the-art performance, with particularly pronounced gains in zero-shot transfer to unseen real-world and synthetic datasets. Project page: https://nahyuklee.github.io/tora.
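
To make the two alignment objectives named in the abstract more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that student and teacher features arrive as (num_tokens, dim) arrays, that the cosine variant compares features already projected to the teacher's dimension, and that the CKA variant is the standard linear CKA. The abstract does not say which kernel, projection, or loss weighting TORA uses. The function names `cosine_alignment_loss`, `linear_cka`, and `cka_alignment_loss` are hypothetical.

```python
import numpy as np


def cosine_alignment_loss(student, teacher, eps=1e-8):
    """Token-wise cosine matching: 1 minus the mean per-token cosine similarity.

    Assumption: both inputs have shape (num_tokens, dim), i.e. the student
    features were already projected to the teacher's feature dimension.
    """
    s = student / (np.linalg.norm(student, axis=1, keepdims=True) + eps)
    t = teacher / (np.linalg.norm(teacher, axis=1, keepdims=True) + eps)
    return float(1.0 - np.mean(np.sum(s * t, axis=1)))


def linear_cka(x, y, eps=1e-12):
    """Standard linear CKA between x (n, d1) and y (n, d2).

    It compares the token-to-token similarity structure of the two feature
    sets, so the feature dimensions d1 and d2 may differ.
    """
    x = x - x.mean(axis=0, keepdims=True)  # center each feature column
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, "fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, "fro")
    norm_y = np.linalg.norm(y.T @ y, "fro")
    return float(cross / (norm_x * norm_y + eps))


def cka_alignment_loss(student, teacher):
    """Hypothetical loss form: 1 minus linear CKA (0 when structures match)."""
    return 1.0 - linear_cka(student, teacher)


# Toy check: the student is a rotated and rescaled copy of the teacher.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(64, 32))
rotation, _ = np.linalg.qr(rng.normal(size=(32, 32)))  # random orthogonal matrix
student = 3.0 * teacher @ rotation

cos_loss = cosine_alignment_loss(student, teacher)  # penalizes the rotation
cka_loss = cka_alignment_loss(student, teacher)     # close to zero
```

In this toy example the cosine loss is large because the rotated features no longer point in the same directions token by token. The CKA loss stays near zero because linear CKA is invariant to orthogonal transforms and isotropic scaling, so it only measures whether the pairwise similarity structure among tokens is preserved. That distinction is one plausible reading of what the abstract calls matching "similarity structure" rather than individual descriptors.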

