Peiguang Jing

CV
h-index: 21
Papers: 6
Citations: 138
Novelty: 53%
AI Score: 35

6 Papers

CV | Aug 6, 2023 | Code
StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning

Peiguang Jing, Xianyi Liu, Ji Wang et al.

Emotion distribution learning has gained increasing attention as more people express emotions through images. To handle the emotion ambiguity arising from human subjectivity, most previous methods focus on learning appropriate representations from the whole image or its salient parts. However, they rarely establish connections with stylistic information, although it can lead to a better understanding of images. In this paper, we propose a style-guided high-order attention network for image emotion distribution learning, termed StyleEDL, which interactively learns style-aware representations of images by exploring the hierarchical stylistic information of visual content. Specifically, we explore the intra- and inter-layer correlations among Gram-based stylistic representations and exploit an adversary-constrained high-order attention mechanism to capture potential interactions between subtle visual parts. In addition, we introduce a stylistic graph convolutional network to dynamically generate content-dependent emotion representations that benefit the final emotion distribution learning. Extensive experiments on several benchmark datasets demonstrate the effectiveness of StyleEDL compared with state-of-the-art methods. The implementation is released at: https://github.com/liuxianyi/StyleEDL.
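
The Gram-based stylistic representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it only shows the standard Gram matrix of a feature map, assuming the features are given as C channels of flattened activations. The function name `gram_matrix` and the toy values are hypothetical.

```python
def gram_matrix(features):
    """Return the C x C Gram matrix of a feature map.

    `features` is a list of C channels, each a flat list of H*W activations
    (an assumed layout chosen for illustration).
    """
    n_positions = len(features[0])
    if any(len(ch) != n_positions for ch in features):
        raise ValueError("all channels must have the same number of positions")
    # Entry (i, j) is the inner product of channels i and j,
    # averaged over spatial positions.
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / n_positions for ch_j in features]
        for ch_i in features
    ]


# Toy feature map: 2 channels, 4 spatial positions each (hypothetical values).
toy = [[1.0, 0.0, 1.0, 0.0],
       [0.0, 2.0, 0.0, 2.0]]
G = gram_matrix(toy)
```

Because the Gram matrix discards spatial layout and keeps only channel co-activation statistics, it is commonly used as a style descriptor; how StyleEDL combines such matrices across layers is described in the paper itself.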

IV | Mar 13, 2023
Deep Learning-based Eye-Tracking Analysis for Diagnosis of Alzheimer's Disease Using 3D Comprehensive Visual Stimuli

Fangyu Zuo, Peiguang Jing, Jinglin Sun et al.

Alzheimer's Disease (AD) causes a continuous decline in memory, thinking, and judgment. Traditional diagnoses are usually based on clinical experience, which is limited by practical factors. In this paper, we focus on exploiting deep learning techniques to diagnose AD based on eye-tracking behaviors. Visual attention, as a typical eye-tracking behavior, is of great clinical value for detecting cognitive abnormalities in AD patients. To better analyze the differences in visual attention between AD patients and normal controls, we first conduct a 3D comprehensive visual task on a non-invasive eye-tracking system to collect visual attention heatmaps. We then propose a multi-layered comparison convolutional neural network (MC-CNN) to distinguish the visual attention differences between AD patients and normal controls. In MC-CNN, the multi-layered representations of heatmaps are obtained by hierarchical convolution to better encode eye-movement behaviors, and are further integrated into a distance vector to benefit the comprehensive visual task. Extensive experimental results on the collected dataset demonstrate that MC-CNN achieves consistent validity in classifying AD patients and normal controls with eye-tracking data.
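
As a rough illustration of what a visual attention heatmap is, the sketch below accumulates a Gaussian blob at each gaze point and normalizes the result. The abstract does not specify how the authors build their heatmaps, so the Gaussian kernel, the `sigma` parameter, the grid size, and the function name `gaze_heatmap` are all assumptions.

```python
import math


def gaze_heatmap(points, width, height, sigma=1.0):
    """Accumulate a Gaussian blob at each gaze point and normalize to [0, 1].

    `points` is a list of (x, y) gaze positions in pixel coordinates
    (an assumed input format).
    """
    heat = [[0.0] * width for _ in range(height)]
    for gx, gy in points:
        for y in range(height):
            for x in range(width):
                d2 = (x - gx) ** 2 + (y - gy) ** 2
                heat[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    # Scale so the most-attended pixel has value 1.0 (skip if no points).
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat


# Hypothetical gaze samples: two fixations at (2, 2), one at (5, 1).
hm = gaze_heatmap([(2, 2), (2, 2), (5, 1)], width=8, height=4)
```

A map like `hm` is the kind of 2D input a convolutional network could consume; the multi-layered comparison step in MC-CNN is specific to the paper and is not reproduced here.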

CV | Dec 18, 2024 | Code
VIIS: Visible and Infrared Information Synthesis for Severe Low-light Image Enhancement

Chen Zhao, Mengyuan Yu, Fan Yang et al.

Images captured in severe low-light conditions often suffer from significant information loss. Existing single-modality image enhancement methods struggle to restore image regions lacking valid information. By leveraging light-impervious infrared images, visible and infrared image fusion methods have the potential to reveal information hidden in darkness. However, they primarily emphasize inter-modal complementation and neglect intra-modal enhancement, limiting the perceptual quality of output images. To address these limitations, we propose a novel task, dubbed visible and infrared information synthesis (VIIS), which aims to achieve both information enhancement and fusion of the two modalities. Given the difficulty of obtaining ground truth in the VIIS task, we design an information synthesis pretext task (ISPT) based on image augmentation. We employ a diffusion model as the framework and design a sparse attention-based dual-modalities residual (SADMR) conditioning mechanism to enhance information interaction between the two modalities. This mechanism enables features with prior knowledge from both modalities to adaptively and iteratively attend to each modality's information during the denoising process. Our extensive experiments demonstrate that our model qualitatively and quantitatively outperforms not only the state-of-the-art methods in relevant fields but also the newly designed baselines capable of both information enhancement and fusion. The code is available at https://github.com/Chenz418/VIIS.

CV | Mar 15, 2024
Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements

Yu Liu, Wenlin Zhang, Shaochu Wang et al.

Early diagnosis of Alzheimer's Disease (AD) is very important for subsequent medical treatment, and eye movements under special visual stimuli may serve as a potential non-invasive biomarker for detecting cognitive abnormalities in AD patients. In this paper, we propose a depth-induced saliency comparison network (DISCN) for eye movement analysis, which may be used for diagnosing Alzheimer's disease. In DISCN, a salient attention module fuses normal eye movements with RGB and depth maps of visual stimuli using hierarchical salient attention (SAA) to evaluate comprehensive saliency maps, which contain information from both visual stimuli and normal eye movement behaviors. In addition, we introduce a serial attention module (SEA) to emphasize the most abnormal eye movement behaviors and reduce personal bias for a more robust result. According to our experiments, DISCN achieves consistent validity in classifying the eye movements of AD patients and normal controls.

CV | Apr 25, 2021
A Novel Binocular Eye-Tracking System With Stereo Stimuli for 3D Gaze Estimation

Jinglin Sun, Zhipeng Wu, Han Wang et al.

Eye-tracking technologies have been widely used in applications such as psychological studies and human-computer interaction (HCI). However, most current eye trackers focus on 2D point of gaze (PoG) estimation and cannot provide accurate gaze depth. With future applications such as HCI with 3D displays in mind, we propose a novel binocular eye-tracking device with stereo stimuli to provide highly accurate 3D PoG estimation. In our device, the 3D stereo imaging system provides users with a friendly and immersive 3D visual experience without wearing any accessories. The eye capturing system directly records the users' eye movements under 3D stimuli without disturbance. A regression-based 3D eye-tracking model is built on eye movement data collected under stereo stimuli. Our model estimates users' 2D gaze with features defined by eye region landmarks and further estimates 3D PoG with a multi-source feature set constructed from comprehensive eye movement features and disparity features from stereo stimuli. Two test stereo scenes with different depths of field are designed to verify the model's effectiveness. Experimental results show that the average error for 2D gaze estimation is 0.66°, and the average errors for 3D PoG estimation are 1.85 cm and 0.15 m over workspace volumes of 50 cm × 30 cm × 75 cm and 2.4 m × 4.0 m × 7.9 m, respectively.
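
The two kinds of error reported above (an angle in degrees for gaze direction, a distance for 3D PoG) can be computed as in the sketch below. The abstract does not define the exact metrics used, so this assumes the common conventions: the angle between estimated and true gaze vectors, and the mean Euclidean distance between estimated and target points. The function names and sample values are hypothetical.

```python
import math


def angular_error_deg(est, true):
    """Angle in degrees between two 3D gaze direction vectors."""
    dot = sum(a * b for a, b in zip(est, true))
    norm = math.sqrt(sum(a * a for a in est)) * math.sqrt(sum(b * b for b in true))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def mean_pog_error(estimates, targets):
    """Mean Euclidean distance between estimated and target 3D points of gaze."""
    dists = [math.dist(e, t) for e, t in zip(estimates, targets)]
    return sum(dists) / len(dists)


# Hypothetical samples, in the same length unit as the workspace (e.g. cm).
angle = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
err = mean_pog_error([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
                     [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)])
```

The distance error inherits whatever unit the coordinates use, which is why the abstract reports centimeters for the small workspace and meters for the large one.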

IR | Dec 11, 2020
Market2Dish: Health-aware Food Recommendation

Wenjie Wang, Ling-yu Duan, Hao Jiang et al.

With the rising incidence of diseases such as obesity and diabetes, a healthy diet is attracting increasing attention. However, most existing food-related research efforts focus on recipe retrieval, user preference-based food recommendation, cooking assistance, or the nutrition and calorie estimation of dishes, ignoring personalized health-aware food recommendation. Therefore, in this work, we present a personalized health-aware food recommendation scheme, namely Market2Dish, mapping the ingredients displayed in the market to the healthy dishes eaten at home. The proposed scheme comprises three components, namely recipe retrieval, user health profiling, and health-aware food recommendation. In particular, recipe retrieval aims to acquire the ingredients available to the users and then retrieve recipe candidates from a large-scale recipe dataset. User health profiling characterizes the health conditions of users by capturing the textual health-related information crawled from social networks. Specifically, to address the extreme sparsity of health-related information, we incorporate a word-class interaction mechanism into the proposed deep model to learn the fine-grained correlations between the textual tweets and pre-defined health concepts. For the health-aware food recommendation, we present a novel category-aware hierarchical memory network-based recommender to learn the health-aware user-recipe interactions for better food recommendation. Extensive experiments demonstrate the effectiveness of the health-aware food recommendation scheme.