LiteVoxel: Low-memory Intelligent Thresholding for Efficient Voxel Rasterization
This work addresses memory efficiency and training stability in scene reconstruction for computer vision and graphics applications. The contribution is incremental, since it builds on existing sparse-voxel rasterization methods.
The paper tackles two problems in sparse-voxel rasterization, underfitting of low-frequency content and inflated VRAM use, by introducing LiteVoxel, a self-tuning training pipeline that reduces peak VRAM by roughly 40%-60% while keeping PSNR/SSIM, training time, and FPS comparable.
Sparse-voxel (SV) rasterization is a fast, differentiable alternative for optimization-based scene reconstruction, but it tends to underfit low-frequency content, depends on brittle pruning heuristics, and can overgrow in ways that inflate VRAM. We introduce LiteVoxel, a self-tuning training pipeline that makes SV rasterization both steadier and lighter. Our loss is made low-frequency aware via an inverse-Sobel reweighting with a mid-training gamma-ramp, shifting gradient budget to flat regions only after geometry stabilizes. Adaptation replaces fixed thresholds with depth-quantile pruning logic on maximum blending weight, stabilized by EMA-hysteresis guards, and refines structure through ray-footprint-based, priority-driven subdivision under an explicit growth budget. Ablations and full-system results across the Mip-NeRF 360 (6 scenes) and Tanks & Temples (3 scenes) datasets show mitigation of errors in low-frequency regions and of boundary instability, while keeping PSNR/SSIM, training time, and FPS comparable to a strong SVRaster pipeline. Crucially, LiteVoxel reduces peak VRAM by ~40%-60% and preserves low-frequency detail that prior setups miss, enabling more predictable, memory-efficient training without sacrificing perceptual quality.
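To make the low-frequency-aware loss more concrete, the following is a minimal NumPy sketch of one way an inverse-Sobel reweighting with a gamma ramp could look. It is an illustration under stated assumptions, not the paper's implementation: the weight form `(|grad| + eps) ** (-gamma)`, the linear ramp schedule, the mean normalization, the L1 base loss, and all names and default values (`gamma_ramp`, `inverse_sobel_weights`, `weighted_l1`, `start`, `end`, `gamma_max`, `eps`) are hypothetical choices made here.

```python
import numpy as np


def sobel_magnitude(img):
    """Gradient magnitude of a 2D grayscale image using 3x3 Sobel kernels."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    # Horizontal derivative: right column minus left column, rows weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical derivative: bottom row minus top row, columns weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.sqrt(gx ** 2 + gy ** 2)


def gamma_ramp(step, start, end, gamma_max):
    """Assumed linear schedule: 0 before `start`, `gamma_max` after `end`."""
    t = (step - start) / max(end - start, 1)
    return gamma_max * float(np.clip(t, 0.0, 1.0))


def inverse_sobel_weights(img, gamma, eps=1e-3):
    """Per-pixel weights that grow as the local gradient shrinks.

    The weights are normalized to mean 1 so the overall loss scale stays
    roughly constant while the ramp changes gamma (an assumption here).
    """
    w = (sobel_magnitude(img) + eps) ** (-gamma)
    return w / w.mean()


def weighted_l1(pred, target, step, start=1000, end=2000, gamma_max=0.5):
    """L1 loss reweighted toward flat regions of `target` once the ramp starts."""
    gamma = gamma_ramp(step, start, end, gamma_max)
    w = inverse_sobel_weights(target, gamma)
    return float((w * np.abs(pred - target)).mean())
```

With `gamma = 0` every weight is 1 and the loss reduces to plain L1, so early training is unaffected. As the ramp raises `gamma`, pixels in flat regions of the target receive larger weights than pixels on edges, which matches the stated goal of shifting gradient budget to flat regions only after geometry has settled.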