Guanwen Zhang

CV
h-index8
3papers
2citations
Novelty50%
AI Score37

3 Papers

CVJun 27, 2025Code
Low-Rank Tensor Recovery via Variational Schatten-p Quasi-Norm and Jacobian Regularization

Zhengyun Cheng, Ruizhe Zhang, Guanwen Zhang et al.

Higher-order tensors are well-suited for representing multi-dimensional data, such as images and videos, which typically characterize low-rank structures. Low-rank tensor decomposition has become essential in machine learning and computer vision, but existing methods like Tucker decomposition offer flexibility at the expense of interpretability. The CANDECOMP/PARAFAC (CP) decomposition provides a natural and interpretable structure, while obtaining a sparse solutions remains challenging. Leveraging the rich properties of CP decomposition, we propose a CP-based low-rank tensor function parameterized by neural networks (NN) for implicit neural representation. This approach can model the tensor both on-grid and beyond grid, fully utilizing the non-linearity of NN with theoretical guarantees on excess risk bounds. To achieve sparser CP decomposition, we introduce a variational Schatten-p quasi-norm to prune redundant rank-1 components and prove that it serves as a common upper bound for the Schatten-p quasi-norms of arbitrary unfolding matrices. For smoothness, we propose a regularization term based on the spectral norm of the Jacobian and Hutchinson's trace estimator. The proposed smoothness regularization is SVD-free and avoids explicit chain rule derivations. It can serve as an alternative to Total Variation (TV) regularization in image denoising tasks and is naturally applicable to implicit neural representation. Extensive experiments on multi-dimensional data recovery tasks, including image inpainting, denoising, and point cloud upsampling, demonstrate the superiority and versatility of our method compared to state-of-the-art approaches. The code is available at https://github.com/CZY-Code/CP-Pruner.

LGJun 27, 2025
Score-Based Model for Low-Rank Tensor Recovery

Zhengyun Cheng, Changhao Wang, Guanwen Zhang et al.

Low-rank tensor decompositions (TDs) provide an effective framework for multiway data analysis. Traditional TD methods rely on predefined structural assumptions, such as CP or Tucker decompositions. From a probabilistic perspective, these can be viewed as using Dirac delta distributions to model the relationships between shared factors and the low-rank tensor. However, such prior knowledge is rarely available in practical scenarios, particularly regarding the optimal rank structure and contraction rules. The optimization procedures based on fixed contraction rules are complex, and approximations made during these processes often lead to accuracy loss. To address this issue, we propose a score-based model that eliminates the need for predefined structural or distributional assumptions, enabling the learning of compatibility between tensors and shared factors. Specifically, a neural network is designed to learn the energy function, which is optimized via score matching to capture the gradient of the joint log-probability of tensor entries and shared factors. Our method allows for modeling structures and distributions beyond the Dirac delta assumption. Moreover, integrating the block coordinate descent (BCD) algorithm with the proposed smooth regularization enables the model to perform both tensor completion and denoising. Experimental results demonstrate significant performance improvements across various tensor types, including sparse and continuous-time tensors, as well as visual data.

CVJan 23, 2025
PromptMono: Cross Prompting Attention for Self-Supervised Monocular Depth Estimation in Challenging Environments

Changhao Wang, Guanwen Zhang, Zhengyun Cheng et al.

Considerable efforts have been made to improve monocular depth estimation under ideal conditions. However, in challenging environments, monocular depth estimation still faces difficulties. In this paper, we introduce visual prompt learning for predicting depth across different environments within a unified model, and present a self-supervised learning framework called PromptMono. It employs a set of learnable parameters as visual prompts to capture domain-specific knowledge. To integrate prompting information into image representations, a novel gated cross prompting attention (GCPA) module is proposed, which enhances the depth estimation in diverse conditions. We evaluate the proposed PromptMono on the Oxford Robotcar dataset and the nuScenes dataset. Experimental results demonstrate the superior performance of the proposed method.