CV · Jan 14

Affostruction: 3D Affordance Grounding with Generative Reconstruction

arXiv:2601.09211v1 · 1 citation · h-index: 6
Originality Highly original
AI Analysis

Existing methods predict affordances only on visible surfaces. Affostruction addresses this limitation, enabling more accurate affordance prediction on complete shapes for robotics and human-computer interaction applications.

The paper tackles affordance grounding from RGBD images: localizing the surface regions of an object that correspond to a described action. It proposes Affostruction, which reconstructs complete geometry and grounds affordances on the full shape, and reports a 40.4% improvement in affordance grounding and a 67.7% improvement in 3D reconstruction.

This paper addresses the problem of affordance grounding from RGBD images of an object, which aims to localize surface regions corresponding to a text query that describes an action on the object. While existing methods predict affordance regions only on visible surfaces, we propose Affostruction, a generative framework that reconstructs complete geometry from partial observations and grounds affordances on the full shape including unobserved regions. We make three core contributions: generative multi-view reconstruction via sparse voxel fusion that extrapolates unseen geometry while maintaining constant token complexity, flow-based affordance grounding that captures inherent ambiguity in affordance distributions, and affordance-driven active view selection that leverages predicted affordances for intelligent viewpoint sampling. Affostruction achieves 19.1 aIoU on affordance grounding (40.4% improvement) and 32.67 IoU for 3D reconstruction (67.7% improvement), enabling accurate affordance prediction on complete shapes.
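To make the reported numbers easier to read, here is a minimal sketch of how an IoU score over occupied voxels and a relative improvement could be computed. It is not the paper's evaluation code. The function names (`voxel_iou`, `relative_improvement`, `implied_baseline`) are hypothetical, the voxel sets are toy data, and treating the quoted percentages as relative gains over a baseline is an assumption, since the abstract does not say how they are defined.

```python
def voxel_iou(pred, target):
    """Intersection over union of two sets of occupied voxel coordinates."""
    pred, target = set(pred), set(target)
    union = pred | target
    if not union:
        # Both shapes are empty: treat them as a perfect match (a convention).
        return 1.0
    return len(pred & target) / len(union)


def relative_improvement(new, baseline):
    """Percentage gain of `new` over `baseline` (assumed meaning of 'improvement')."""
    return (new - baseline) / baseline * 100.0


def implied_baseline(new, improvement_pct):
    """Baseline score that a given relative improvement would imply."""
    return new / (1.0 + improvement_pct / 100.0)


# Toy example: 2 shared voxels out of 4 distinct voxels overall.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
target = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
iou = voxel_iou(pred, target)  # 2 / 4 = 0.5

# Under the relative-gain assumption, the baselines implied by the abstract's figures:
affordance_baseline = implied_baseline(19.1, 40.4)
reconstruction_baseline = implied_baseline(32.67, 67.7)
```

Under that assumption the implied baselines come out to roughly 13.6 aIoU and 19.5 IoU; these are back-calculated from the abstract, not values reported in it.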
