CV
Jul 23, 2022

Defining an action of SO(d)-rotations on images generated by projections of d-dimensional objects: Applications to pose inference with Geometric VAEs

arXiv:2207.11582v1
h-index: 13
Originality: Synthesis-oriented
AI Analysis

This work addresses pose inference in computer vision and machine learning, particularly for applications involving 3D object analysis. Its contribution is incremental: it focuses on clarifying theoretical limitations of existing methods.

The paper examines whether images generated by projecting a d-dimensional volume with unknown pose lie on a subspace homeomorphic to a Lie group such as SO(d). It shows that defining a group action on the data space generally fails unless the volume satisfies more specific geometric constraints. Experiments with geometric VAEs confirm that this constraint is crucial for accurate pose inference.

Recent advances in variational autoencoders (VAEs) have enabled learning latent manifolds as compact Lie groups, such as $SO(d)$. Since this approach assumes that data lies on a subspace that is homeomorphic to the Lie group itself, we here investigate how this assumption holds in the context of images that are generated by projecting a $d$-dimensional volume with unknown pose in $SO(d)$. Upon examining different theoretical candidates for the group and image space, we show that the attempt to define a group action on the data space generally fails, as it requires more specific geometric constraints on the volume. Using geometric VAEs, our experiments confirm that this constraint is key to proper pose inference, and we discuss the potential of these results for applications and future work.
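The failure described in the abstract can be illustrated with a toy sketch. It is not the paper's construction. It assumes d = 2, replaces SO(2) with its four-element subgroup of 90-degree rotations (so that `np.rot90` is exact), and uses a sum along one axis as the projection. The helper names `project`, `rotate`, and `action_is_well_defined` are hypothetical. The candidate action on images is g · P(hV) := P(ghV), which only makes sense if two poses that produce the same image still produce the same image after both are rotated by g.

```python
import itertools

import numpy as np


def project(volume):
    """Toy projection (an assumption): sum the 2-D 'volume' along axis 0 to get a 1-D image."""
    return volume.sum(axis=0)


def rotate(volume, k):
    """Apply the k-th element of the cyclic group C4, i.e. a rotation by k * 90 degrees."""
    return np.rot90(volume, k % 4)


def action_is_well_defined(volume):
    """Check whether g . P(hV) := P(ghV) is a well-defined map on the set of images.

    The definition is consistent only if, whenever poses h1 and h2 give the
    same image, the poses g*h1 and g*h2 also give the same image for every g.
    """
    for h1, h2, g in itertools.product(range(4), repeat=3):
        same_before = np.array_equal(
            project(rotate(volume, h1)), project(rotate(volume, h2))
        )
        same_after = np.array_equal(
            project(rotate(volume, g + h1)), project(rotate(volume, g + h2))
        )
        if same_before and not same_after:
            return False
    return True


generic = np.array([[1, 2], [3, 4]])      # all four poses give distinct images
degenerate = np.array([[1, 2], [2, 3]])   # poses 0 and 1 give the same image

print(action_is_well_defined(generic))     # True
print(action_is_well_defined(degenerate))  # False
```

For the second volume, poses 0 and 1 both project to `[3, 5]`, but after a further 90-degree rotation they project to `[3, 5]` and `[5, 3]` respectively. The image alone therefore does not determine where the candidate action should send it, so in this toy setting some condition on the volume is needed before the action can be defined.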

