Jihe Li

CV
h-index 6
3 papers
13 citations
Novelty 62%
AI Score 28

3 Papers

CV · Apr 29, 2024
PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images

Jiquan Yuan, Fanyi Yang, Jihe Li et al.

In recent years, image generation technology has rapidly advanced, resulting in the creation of a vast array of AI-generated images (AIGIs). However, the quality of these AIGIs is highly inconsistent, with low-quality AIGIs severely impairing the visual experience of users. Due to the widespread application of AIGIs, AI-generated image quality assessment (AIGIQA), which aims to evaluate the quality of AIGIs from the perspective of human perception, has garnered increasing interest among scholars. Nonetheless, current research has not yet fully explored this field. We have observed that existing databases are limited to images generated in single-scenario settings. Databases such as AGIQA-1K, AGIQA-3K, and AIGCIQA2023, for example, only include images generated by text-to-image generative models. This oversight highlights a critical gap in the current research landscape, underscoring the need for dedicated databases catering to image-to-image scenarios, as well as more comprehensive databases that encompass a broader range of AI-generated image scenarios. To address these issues, we have established a large-scale perceptual quality assessment database for both text-to-image and image-to-image AIGIs, named PKU-AIGIQA-4K. We then conduct a well-organized subjective experiment to collect quality labels for AIGIs and perform a comprehensive analysis of the PKU-AIGIQA-4K database. Based on how image prompts are used during training, we propose three image quality assessment (IQA) methods built on pre-trained models: a no-reference method, NR-AIGCIQA; a full-reference method, FR-AIGCIQA; and a partial-reference method, PR-AIGCIQA. Finally, leveraging the PKU-AIGIQA-4K database, we conduct extensive benchmark experiments and compare the performance of the proposed methods with that of current IQA methods.

CV · Oct 22, 2024
Joint Point Cloud Upsampling and Cleaning with Octree-based CNNs

Jihe Li, Bo Pang, Peng-Shuai Wang

Recovering dense and uniformly distributed point clouds from sparse or noisy data remains a significant challenge. Recently, great progress has been made on these tasks, but usually at the cost of increasingly intricate modules or complicated network architectures, leading to long inference times and heavy resource consumption. Instead, we embrace simplicity and present a simple yet efficient method for jointly upsampling and cleaning point clouds. Our method leverages an off-the-shelf octree-based 3D U-Net (OUNet) with minor modifications, enabling the upsampling and cleaning tasks within a single network. Our network directly processes each input point cloud as a whole instead of processing each point cloud patch as in previous works, which significantly eases the implementation and makes inference at least 47 times faster. Extensive experiments demonstrate that our method achieves state-of-the-art performance with large efficiency advantages on a series of benchmarks. We expect our method to serve as a simple baseline and inspire researchers to rethink method design for point cloud upsampling and cleaning.

CV · Jun 5, 2024
Gaussian Primitives for Deformable Image Registration

Jihe Li, Xiang Liu, Fabian Zhang et al.

Deformable Image Registration (DIR) is essential for aligning medical images that exhibit anatomical variations, facilitating applications such as disease tracking and radiotherapy planning. While classical iterative methods and deep learning approaches have achieved success in DIR, they are often hindered by computational inefficiency or poor generalization. In this paper, we introduce GaussianDIR, a novel, case-specific optimization DIR method inspired by 3D Gaussian splatting. GaussianDIR represents image deformations using a sparse set of mobile and flexible Gaussian primitives, each defined by a center position, covariance, and local rigid transformation. This compact and explicit representation reduces noise and computational overhead while improving interpretability. Furthermore, the movement of each individual voxel is derived by blending the local rigid transformations of the neighboring Gaussian primitives. In this way, GaussianDIR captures both global smoothness and local rigidity and reduces the computational burden. To address varying levels of deformation complexity, GaussianDIR also integrates an adaptive density control mechanism that dynamically adjusts the density of Gaussian primitives. Additionally, we employ multi-scale Gaussian primitives to capture both coarse and fine deformations, reducing the risk of the optimization converging to local minima. Experimental results on brain MRI, lung CT, and cardiac MRI datasets demonstrate that GaussianDIR outperforms existing DIR methods in both accuracy and efficiency, highlighting its potential for clinical applications. Finally, as a training-free approach, it challenges the stereotype that iterative methods are inherently slow and transcends the limitations of poor generalization.
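
The blending step described in the abstract above can be illustrated with a rough sketch. This is not the authors' implementation: it assumes isotropic Gaussians (a single `sigma` in place of the full covariance the abstract mentions), a plain k-nearest neighborhood, and normalized Gaussian weights. The names `Primitive`, `rigid_apply`, and `blended_displacement` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


@dataclass
class Primitive:
    """One Gaussian primitive (hypothetical layout, not taken from the paper)."""
    center: Vec3
    sigma: float                        # assumption: isotropic spread instead of a full covariance
    rotation: tuple[Vec3, Vec3, Vec3]   # 3x3 rotation matrix, stored as rows
    translation: Vec3


def rigid_apply(p: Primitive, x: Vec3) -> Vec3:
    """Rotate x about the primitive's center, then translate."""
    rel = tuple(x[i] - p.center[i] for i in range(3))
    rot = tuple(sum(p.rotation[r][c] * rel[c] for c in range(3)) for r in range(3))
    return tuple(rot[i] + p.center[i] + p.translation[i] for i in range(3))


def blended_displacement(x: Vec3, primitives: list[Primitive], k: int = 4) -> Vec3:
    """Displacement of voxel x as a Gaussian-weighted average over its k nearest primitives."""
    nearest = sorted(primitives, key=lambda p: math.dist(x, p.center))[:k]
    weights = [math.exp(-0.5 * (math.dist(x, p.center) / p.sigma) ** 2) for p in nearest]
    total = sum(weights)
    if total == 0.0:
        # No primitive has measurable influence here; leave the voxel in place.
        return (0.0, 0.0, 0.0)
    disp = [0.0, 0.0, 0.0]
    for w, p in zip(weights, nearest):
        moved = rigid_apply(p, x)
        for i in range(3):
            disp[i] += w * (moved[i] - x[i])
    return tuple(d / total for d in disp)
```

Under these assumptions, a voxel midway between two equally sized primitives receives the average of their two rigid motions, while a voxel near a single primitive follows that primitive's rotation and translation almost exactly. An actual optimizer would additionally fit the primitive parameters to an image similarity loss, which this sketch leaves out.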