WIPES: Wavelet-based Visual Primitives
This work addresses the need for efficient, high-quality visual representations in 3D vision and graphics, offering an incremental improvement over existing methods.
The paper tackles the problem of achieving flexible frequency modulation and fast rendering in continuous visual representations. It proposes WIPES, a wavelet-based visual primitive that achieves higher rendering quality and faster inference than INR-based methods, and better rendering quality than Gaussian-based representations, on tasks such as 2D image representation and novel view synthesis.
Pursuing a continuous visual representation that offers flexible frequency modulation and fast rendering speed has recently garnered increasing attention in the fields of 3D vision and graphics. However, existing representations often rely on frequency guidance or complex neural network decoding, leading to spectrum loss or slow rendering. To address these limitations, we propose WIPES, a universal Wavelet-based vIsual PrimitivES for representing multi-dimensional visual signals. Building on the spatial-frequency localization advantages of wavelets, WIPES effectively captures both the low-frequency "forest" and the high-frequency "trees." Additionally, we develop a wavelet-based differentiable rasterizer to achieve fast visual rendering. Experimental results on various visual tasks, including 2D image representation, 5D static and 6D dynamic novel view synthesis, demonstrate that WIPES, as a visual primitive, offers higher rendering quality and faster inference than INR-based methods, and outperforms Gaussian-based representations in rendering quality.
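The spatial-frequency localization mentioned above can be illustrated with a small 1D sketch. The abstract does not give the exact form of the WIPES primitive, so the code below assumes a generic Morlet/Gabor-style wavelet (a Gaussian envelope multiplied by a cosine carrier) purely for illustration; the function names and the parameters `mu`, `sigma`, and `freq` are hypothetical and not taken from the paper. It compares this wavelet with a plain Gaussian of the same width by locating the peak of each spectrum.

```python
import numpy as np

def gaussian_primitive(x, mu, sigma):
    """Plain Gaussian bump: localized in space, energy concentrated at low frequency."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def wavelet_primitive(x, mu, sigma, freq):
    """Assumed Morlet/Gabor-style wavelet: Gaussian envelope times a cosine carrier.
    This is an illustrative stand-in, not the primitive defined in the paper."""
    return gaussian_primitive(x, mu, sigma) * np.cos(2 * np.pi * freq * (x - mu))

def peak_frequency(signal, dx):
    """Frequency (cycles per unit length) at which the magnitude spectrum is largest."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=dx)
    return freqs[np.argmax(spectrum)]

# Sample both primitives on the same grid.
x = np.linspace(-1.0, 1.0, 2048, endpoint=False)
dx = x[1] - x[0]

g = gaussian_primitive(x, mu=0.0, sigma=0.1)
w = wavelet_primitive(x, mu=0.0, sigma=0.1, freq=20.0)

g_peak = peak_frequency(g, dx)
w_peak = peak_frequency(w, dx)

print("Gaussian spectral peak:", g_peak)
print("Wavelet spectral peak: ", w_peak)
```

Under these assumptions, both primitives share the same spatial footprint, but the Gaussian's spectrum peaks near zero frequency while the wavelet's peaks near its carrier frequency. The carrier parameter therefore gives a single primitive a tunable frequency band, which is the kind of flexible frequency modulation the abstract refers to.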