FourierHandFlow: Neural 4D Hand Representation Using Fourier Query Flow
This work addresses continuous 4D hand modeling for applications such as motion interpolation and texture transfer. It is an incremental advance that integrates articulation priors and Fourier-based temporal regularization into neural implicit representations.
The paper models 4D hand shapes from RGB videos by introducing FourierHandFlow, a representation that combines a 3D occupancy field with query flows represented as Fourier series, capturing smooth temporal dynamics and implicit correspondences. It reports state-of-the-art results on video-based 4D reconstruction while being computationally more efficient than existing 3D/4D implicit shape representations.
Recent 4D shape representations model continuous temporal evolution of implicit shapes by (1) learning query flows without leveraging shape and articulation priors or (2) decoding shape occupancies separately for each time value. Thus, they do not effectively capture implicit correspondences between articulated shapes or regularize jittery temporal deformations. In this work, we present FourierHandFlow, which is a spatio-temporally continuous representation for human hands that combines a 3D occupancy field with articulation-aware query flows represented as Fourier series. Given an input RGB sequence, we aim to learn a fixed number of Fourier coefficients for each query flow to guarantee smooth and continuous temporal shape dynamics. To effectively model spatio-temporal deformations of articulated hands, we compose our 4D representation based on two types of Fourier query flow: (1) pose flow that models query dynamics influenced by hand articulation changes via implicit linear blend skinning and (2) shape flow that models query-wise displacement flow. In the experiments, our method achieves state-of-the-art results on video-based 4D reconstruction while being computationally more efficient than the existing 3D/4D implicit shape representations. We additionally show our results on motion inter- and extrapolation and texture transfer using the learned correspondences of implicit shapes. To the best of our knowledge, FourierHandFlow is the first neural 4D continuous hand representation learned from RGB videos. The code will be publicly accessible.
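To make the idea of a Fourier query flow concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each query point carries a constant term and K pairs of cosine and sine coefficients per spatial axis, and that its trajectory is the initial position plus the truncated series evaluated relative to t = 0. The function names, the coefficient layout, and the `period` parameter are hypothetical. In the method described above the coefficients would be predicted from the input RGB sequence; here they are passed in as plain arrays.

```python
import numpy as np


def eval_fourier_flow(a0, a, b, t, period=1.0):
    """Evaluate a truncated Fourier series at the given times.

    a0: (3,) constant term (hypothetical layout).
    a, b: (K, 3) cosine and sine coefficients for harmonics 1..K.
    t: scalar or (T,) array of time values.
    Returns a (T, 3) array of displacements.
    """
    t = np.atleast_1d(np.asarray(t, dtype=float))
    k = np.arange(1, a.shape[0] + 1)               # harmonic indices 1..K
    phase = 2.0 * np.pi * np.outer(t, k) / period  # shape (T, K)
    return a0 + np.cos(phase) @ a + np.sin(phase) @ b


def query_trajectory(x0, a0, a, b, t, period=1.0):
    """Move a query point x0 along its flow, anchored so that x(0) == x0."""
    d_t = eval_fourier_flow(a0, a, b, t, period)
    d_0 = eval_fourier_flow(a0, a, b, 0.0, period)
    return x0 + d_t - d_0


# Usage: one query point with K = 4 harmonics and random coefficients.
rng = np.random.default_rng(0)
x0 = np.array([0.1, 0.2, 0.3])
a0 = rng.normal(size=3)
a = rng.normal(size=(4, 3))
b = rng.normal(size=(4, 3))
trajectory = query_trajectory(x0, a0, a, b, np.linspace(0.0, 1.0, 5))
```

Because the series has a fixed number of harmonics, the trajectory is smooth in t and can be evaluated at any time value, which is the property the representation relies on for continuous temporal dynamics. A truncated series is periodic, so in this sketch a non-periodic motion would need `period` set longer than the sequence; how the paper handles this is not stated in the abstract.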