Guoquan Zheng

h-index27
2papers

2 Papers

58.8CVApr 8
Robust Mesh Saliency Ground Truth Acquisition in VR via View Cone Sampling and Manifold Diffusion

Guoquan Zheng, Jie Hao, Huiyu Duan et al.

As the complexity of 3D digital content grows exponentially, understanding human visual attention is critical for optimizing rendering and processing resources. Therefore, reliable 3D mesh saliency ground truth (GT) is essential for human-centric visual modeling in virtual reality (VR). However, existing VR eye-tracking frameworks are fundamentally bottlenecked by their underlying acquisition and generation mechanisms. The reliance on zero-area single ray sampling (SRS) fails to capture contextual features, leading to severe texture aliasing and discontinuous saliency signals. And the conventional application of Euclidean smoothing propagates saliency across disconnected physical gaps, resulting in semantic confusion on complex 3D manifolds. This paper proposes a robust framework to address these limitations. We first introduce a view cone sampling (VCS) strategy, which simulates the human foveal receptive field via Gaussian-distributed ray bundles to improve sampling robustness for complex topologies. Furthermore, a hybrid Manifold-Euclidean constrained diffusion (HCD) algorithm is developed, fusing manifold geodesic constraints with Euclidean scales to ensure topologically-consistent saliency propagation. We demonstrate the improvement in performance over baseline methods and the benefits for downstream tasks through subjective experiments and qualitative and quantitative methods. By mitigating "topological short-circuits" and aliasing, our framework provides a high-fidelity 3D attention acquisition paradigm that aligns with natural human perception, offering a more accurate and robust baseline for 3D mesh saliency research.

CVDec 25, 2024Code
Embodied Image Quality Assessment for Robotic Intelligence

Jianbo Zhang, Chunyi Li, Jie Hao et al.

Image Quality Assessment (IQA) of User-Generated Content (UGC) is a critical technique for human Quality of Experience (QoE). However, does the the image quality of Robot-Generated Content (RGC) demonstrate traits consistent with the Moravec paradox, potentially conflicting with human perceptual norms? Human subjective scoring is more based on the attractiveness of the image. Embodied agent are required to interact and perceive in the environment, and finally perform specific tasks. Visual images as inputs directly influence downstream tasks. In this paper, we explore the perception mechanism of embodied robots for image quality. We propose the first Embodied Preference Database (EPD), which contains 12,500 distorted image annotations. We establish assessment metrics based on the downstream tasks of robot. In addition, there is a gap between UGC and RGC. To address this, we propose a novel Multi-scale Attention Embodied Image Quality Assessment called MA-EIQA. For the proposed EPD dataset, this is the first no-reference IQA model designed for embodied robot. Finally, the performance of mainstream IQA algorithms on EPD dataset is verified. The experiments demonstrate that quality assessment of embodied images is different from that of humans. We sincerely hope that the EPD can contribute to the development of embodied AI by focusing on image quality assessment. The benchmark is available at https://github.com/Jianbo-maker/EPD_benchmark.