Yue Yu

h-index24

5papers

43citations

Novelty51%

AI Score39

Ranked #78,630 of 194,257 authors (top 40%)#26,606 in CV (top 45%)

5 Papers

2.6CVJun 15, 2022Code

PolyU-BPCoMa: A Dataset and Benchmark Towards Mobile Colorized Mapping Using a Backpack Multisensorial System

Wenzhong Shi, Pengxin Chen, Muyang Wang et al.

Constructing colorized point clouds from mobile laser scanning and images is a fundamental work in surveying and mapping. It is also an essential prerequisite for building digital twins for smart cities. However, existing public datasets are either in relatively small scales or lack accurate geometrical and color ground truth. This paper documents a multisensorial dataset named PolyU-BPCoMA which is distinctively positioned towards mobile colorized mapping. The dataset incorporates resources of 3D LiDAR, spherical imaging, GNSS and IMU on a backpack platform. Color checker boards are pasted in each surveyed area as targets and ground truth data are collected by an advanced terrestrial laser scanner (TLS). 3D geometrical and color information can be recovered in the colorized point clouds produced by the backpack system and the TLS, respectively. Accordingly, we provide an opportunity to benchmark the mapping and colorization accuracy simultaneously for a mobile multisensorial system. The dataset is approximately 800 GB in size covering both indoor and outdoor environments. The dataset and development kits are available at https://github.com/chenpengxin/PolyU-BPCoMa.git.

26.0LGApr 7, 2025

A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization

Wenyuan Xu, Xiaochen Zuo, Chao Xin et al.

Reinforcement Learning from Human Feedback (RLHF) has emerged as a important paradigm for aligning large language models (LLMs) with human preferences during post-training. This framework typically involves two stages: first, training a reward model on human preference data, followed by optimizing the language model using reinforcement learning algorithms. However, current RLHF approaches may constrained by two limitations. First, existing RLHF frameworks often rely on Bradley-Terry models to assign scalar rewards based on pairwise comparisons of individual responses. However, this approach imposes significant challenges on reward model (RM), as the inherent variability in prompt-response pairs across different contexts demands robust calibration capabilities from the RM. Second, reward models are typically initialized from generative foundation models, such as pre-trained or supervised fine-tuned models, despite the fact that reward models perform discriminative tasks, creating a mismatch. This paper introduces Pairwise-RL, a RLHF framework that addresses these challenges through a combination of generative reward modeling and a pairwise proximal policy optimization (PPO) algorithm. Pairwise-RL unifies reward model training and its application during reinforcement learning within a consistent pairwise paradigm, leveraging generative modeling techniques to enhance reward model performance and score calibration. Experimental evaluations demonstrate that Pairwise-RL outperforms traditional RLHF frameworks across both internal evaluation datasets and standard public benchmarks, underscoring its effectiveness in improving alignment and model behavior.

21.7CVApr 4, 2025

RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation

Hanbo Bi, Yingchao Feng, Boyuan Tong et al.

The rapid advancement of foundation models has revolutionized visual representation learning in a self-supervised manner. However, their application in remote sensing (RS) remains constrained by a fundamental gap: existing models predominantly handle single or limited modalities, overlooking the inherently multi-modal nature of RS observations. Optical, synthetic aperture radar (SAR), and multi-spectral data offer complementary insights that significantly reduce the inherent ambiguity and uncertainty in single-source analysis. To bridge this gap, we introduce RingMoE, a unified multi-modal RS foundation model with 14.7 billion parameters, pre-trained on 400 million multi-modal RS images from nine satellites. RingMoE incorporates three key innovations: (1) A hierarchical Mixture-of-Experts (MoE) architecture comprising modal-specialized, collaborative, and shared experts, effectively modeling intra-modal knowledge while capturing cross-modal dependencies to mitigate conflicts between modal representations; (2) Physics-informed self-supervised learning, explicitly embedding sensor-specific radiometric characteristics into the pre-training objectives; (3) Dynamic expert pruning, enabling adaptive model compression from 14.7B to 1B parameters while maintaining performance, facilitating efficient deployment in Earth observation applications. Evaluated across 23 benchmarks spanning six key RS tasks (i.e., classification, detection, segmentation, tracking, change detection, and depth estimation), RingMoE outperforms existing foundation models and sets new SOTAs, demonstrating remarkable adaptability from single-modal to multi-modal scenarios. Beyond theoretical progress, it has been deployed and trialed in multiple sectors, including emergency response, land management, marine sciences, and urban planning.

5.1IVAug 26, 2025

A Machine Learning Approach to Volumetric Computations of Solid Pulmonary Nodules

Yihan Zhou, Haocheng Huang, Yue Yu et al.

Early detection of lung cancer is crucial for effective treatment and relies on accurate volumetric assessment of pulmonary nodules in CT scans. Traditional methods, such as consolidation-to-tumor ratio (CTR) and spherical approximation, are limited by inconsistent estimates due to variability in nodule shape and density. We propose an advanced framework that combines a multi-scale 3D convolutional neural network (CNN) with subtype-specific bias correction for precise volume estimation. The model was trained and evaluated on a dataset of 364 cases from Shanghai Chest Hospital. Our approach achieved a mean absolute deviation of 8.0 percent compared to manual nonlinear regression, with inference times under 20 seconds per scan. This method outperforms existing deep learning and semi-automated pipelines, which typically have errors of 25 to 30 percent and require over 60 seconds for processing. Our results show a reduction in error by over 17 percentage points and a threefold acceleration in processing speed. These advancements offer a highly accurate, efficient, and scalable tool for clinical lung nodule screening and monitoring, with promising potential for improving early lung cancer detection.

2.7CRAug 6, 2019

Threshold Changeable Secret Sharing Scheme and Its Application to Group Authentication

Fuyou Miao, Yue Yu, Keju Meng et al.

Group oriented applications are getting more and more popular in mobile Internet and call for secure and efficient secret sharing (SS) scheme to meet their requirements. A $(t,n)$ threshold SS scheme divides a secret into $n$ shares such that any $t$ or more than $t$ shares can recover the secret while less than $t$ shares cannot. However, an adversary, even without a valid share, may obtain the secret by impersonating a shareholder to recover the secret with $t$ or more legal shareholders. Therefore, this paper uses linear code to propose a threshold changeable secret sharing (TCSS) scheme, in which threshold should increase from $t$ to the exact number of all participants during secret reconstruction. The scheme does not depend on any computational assumption and realizes asymptotically perfect security. Furthermore, based on the proposed TCSS scheme, a group authentication scheme is constructed, which allows a group user to authenticate whether all users are legal group members at once and thus provides efficient and flexible m-to-m authentication for group oriented applications.