Yibing Chen

CL
h-index 132
7 papers
5 citations
Novelty 40%
AI Score 46

7 Papers

96.7 NA Mar 25
A High-Order Finite Volume GENO Scheme with Implicit Time Integration for Three-Temperature Radiation Diffusion Equations

Fengxiang Zhao, Yaqing Yang, Yibing Chen et al.

This study presents a high-order finite volume scheme capable of large time-step integration for three-temperature radiation diffusion (3TRD) equations, where conservation is naturally achieved through energy update. To handle local large gradients and discontinuities in temperature, a central generalized ENO (GENO) reconstruction is developed for diffusion systems, which achieves essentially non-oscillatory reconstruction for discontinuous solutions. Compared to conventional nonlinear reconstruction methods, its most distinctive feature is the central-type symmetric sub-stencils, which ensure consistency between the numerics and the isotropic nature of thermal diffusion. Additionally, the central GENO method provides smooth states of temperature and temperature gradient at interfaces, facilitating the evaluation of numerical fluxes. Furthermore, interface flux evaluation for cases with discontinuous physical property parameters is modeled. To address the extremely small time-step issue caused by stiff diffusion and source terms, a dual-time-stepping method based on implicit time discretization is developed for the first time in 3TRD systems, with the advantage of decoupling temporal discretization from complex nonlinear spatial discretization. A series of numerical examples validates the high accuracy, physical property preservation, strong robustness, and large time-step integration capability of the present high-order central GENO scheme.
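The dual-time-stepping idea mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's 3TRD solver: it assumes a scalar stiff relaxation equation du/dt = R(u), a backward Euler discretization in physical time, and a plain explicit march in pseudo-time. The names `dual_time_step`, `residual`, and `dtau` are hypothetical.

```python
def dual_time_step(u_n, dt, residual, dtau, tol=1e-9, max_iter=10_000):
    """Advance one backward-Euler step of du/dt = residual(u) via pseudo-time marching.

    The new time level u satisfies (u - u_n)/dt = residual(u). Instead of solving
    this equation directly, we march du/dtau = -(u - u_n)/dt + residual(u) to a
    steady state in the pseudo-time tau.
    """
    u = u_n
    for _ in range(max_iter):
        # Unsteady residual: vanishes when u solves the backward Euler equation.
        r = -(u - u_n) / dt + residual(u)
        if abs(r) < tol:
            break
        u += dtau * r  # explicit pseudo-time update
    return u


# Toy stiff source term (an assumption for illustration): relaxation toward 1.
k = 1000.0
relax = lambda u: -k * (u - 1.0)

# Physical step dt = 0.1 is far above the explicit stability limit of about 2/k.
dt = 0.1
u_next = dual_time_step(0.0, dt, relax, dtau=0.5 / (1.0 / dt + k))
```

For this linear toy problem the converged value can be checked against the closed-form backward Euler update, (u_n + dt·k)/(1 + dt·k).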

51.9 CL Apr 10 Code
MAB-DQA: Addressing Query Aspect Importance in Document Question Answering with Multi-Armed Bandits

Yixin Xiang, Yunshan Ma, Xiaoyu Du et al.

Document Question Answering (DQA) involves generating answers from a document based on a user's query, representing a key task in document understanding. This task requires interpreting visual layouts, which has prompted recent studies to adopt multimodal Retrieval-Augmented Generation (RAG) that processes page images for answer generation. However, in multimodal RAG, visual DQA struggles to utilize a large number of images effectively, as the retrieval stage often retains only a few candidate pages (e.g., Top-4), causing informative but less visually salient content to be overlooked in favor of common yet low-information pages. To address this issue, we propose a Multi-Armed Bandit-based DQA framework (MAB-DQA) to explicitly model the varying importance of multiple implicit aspects in a query. Specifically, MAB-DQA decomposes a query into aspect-aware subqueries and retrieves an aspect-specific candidate set for each. It treats each subquery as an arm and uses preliminary reasoning results from a small number of representative pages as reward signals to estimate aspect utility. Guided by an exploration-exploitation policy, MAB-DQA dynamically reallocates retrieval budgets toward high-value aspects. With the most informative pages and their correlations, MAB-DQA generates the expected results. On four benchmarks, MAB-DQA shows an average improvement of 5%-18% over the state-of-the-art method, consistently enhancing document understanding. Code at https://github.com/ElephantOH/MAB-DQA.
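The budget reallocation described in the abstract can be sketched with a standard bandit policy. The abstract only says "exploration-exploitation policy", so the use of UCB1 here is an assumption, as are the names `allocate_budget` and `reward_fn` and the fixed utility values; this is not the MAB-DQA implementation.

```python
import math


def allocate_budget(reward_fn, n_arms, budget):
    """Distribute `budget` pulls over `n_arms` arms using the UCB1 rule.

    Each arm stands in for one aspect-aware subquery; `reward_fn(arm)` stands in
    for the utility signal obtained from reasoning over a few retrieved pages.
    Returns the number of pulls given to each arm.
    """
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for t in range(1, budget + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once before using the UCB index
        else:
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        totals[arm] += reward_fn(arm)
        counts[arm] += 1
    return counts


# Hypothetical, deterministic aspect utilities used only for illustration.
utilities = [0.2, 0.9, 0.5]
pulls = allocate_budget(lambda arm: utilities[arm], n_arms=3, budget=100)
```

Every arm is still explored at least once, but most of the budget ends up on the arm with the highest estimated utility.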

CV Jan 21, 2025 Code
Are Traditional Deep Learning Model Approaches as Effective as a Retinal-Specific Foundation Model for Ocular and Systemic Disease Detection?

Samantha Min Er Yew, Xiaofeng Lei, Jocelyn Hui Lin Goh et al.

Background: RETFound, a self-supervised, retina-specific foundation model (FM), showed potential in downstream applications. However, its comparative performance with traditional deep learning (DL) models remains incompletely understood. This study aimed to evaluate RETFound against three ImageNet-pretrained supervised DL models (ResNet50, ViT-base, SwinV2) in detecting ocular and systemic diseases. Methods: We fine-tuned/trained RETFound and three DL models on full datasets, 50%, 20%, and fixed sample sizes (400, 200, 100 images, with half comprising disease cases; for each DR severity class, 100 and 50 cases were used). Fine-tuned models were tested internally using the SEED (53,090 images) and APTOS-2019 (3,672 images) datasets and externally validated on population-based (BES, CIEMS, SP2, UKBB) and open-source datasets (ODIR-5k, PAPILA, GAMMA, IDRiD, MESSIDOR-2). Model performance was compared using area under the receiver operating characteristic curve (AUC) and Z-tests with Bonferroni correction (P<0.05/3). Interpretation: Traditional DL models are mostly comparable to RETFound for ocular disease detection with large datasets. However, RETFound is superior in systemic disease detection with smaller datasets. These findings offer valuable insights into the respective merits and limitations of traditional models and FMs.

GR Oct 7, 2025
SpotDiff: Spotting and Disentangling Interference in Feature Space for Subject-Preserving Image Generation

Yongzhi Li, Saining Zhang, Yibing Chen et al.

Personalized image generation aims to faithfully preserve a reference subject's identity while adapting to diverse text prompts. Existing optimization-based methods ensure high fidelity but are computationally expensive, while learning-based approaches offer efficiency at the cost of entangled representations influenced by nuisance factors. We introduce SpotDiff, a novel learning-based method that extracts subject-specific features by spotting and disentangling interference. Leveraging a pre-trained CLIP image encoder and specialized expert networks for pose and background, SpotDiff isolates subject identity through orthogonality constraints in the feature space. To enable principled training, we introduce SpotDiff10k, a curated dataset with consistent pose and background variations. Experiments demonstrate that SpotDiff achieves more robust subject preservation and controllable editing than prior methods, while attaining competitive performance with only 10k training samples.
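One simple way to express an orthogonality constraint between two feature vectors is a squared cosine similarity penalty. The abstract does not give the exact form used by SpotDiff, so the formula below and the name `orthogonality_penalty` are assumptions made purely for illustration.

```python
import math


def orthogonality_penalty(subject, nuisance):
    """Squared cosine similarity between two feature vectors.

    The value is 0 when the vectors are orthogonal and 1 when they are parallel,
    so minimizing it pushes the subject feature away from the nuisance feature
    (for example, a pose or background embedding).
    """
    dot = sum(s * n for s, n in zip(subject, nuisance))
    norm_s = math.sqrt(sum(s * s for s in subject))
    norm_n = math.sqrt(sum(n * n for n in nuisance))
    return (dot / (norm_s * norm_n)) ** 2


# Toy 2-D features (hypothetical values).
penalty_orthogonal = orthogonality_penalty([1.0, 0.0], [0.0, 1.0])
penalty_parallel = orthogonality_penalty([1.0, 2.0], [2.0, 4.0])
```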

CL Jul 21, 2025
BEnchmarking LLMs for Ophthalmology (BELO) for Ophthalmological Knowledge and Reasoning

Sahana Srinivasan, Xuguang Ai, Thaddaeus Wai Soon Lo et al.

Current benchmarks evaluating large language models (LLMs) in ophthalmology are limited in scope and disproportionately prioritise accuracy. We introduce BELO (BEnchmarking LLMs for Ophthalmology), a standardized and comprehensive evaluation benchmark developed through multiple rounds of expert checking by 13 ophthalmologists. BELO assesses ophthalmology-related clinical accuracy and reasoning quality. Using keyword matching and a fine-tuned PubMedBERT model, we curated ophthalmology-specific multiple-choice questions (MCQs) from diverse medical datasets (BCSC, MedMCQA, MedQA, BioASQ, and PubMedQA). The dataset underwent multiple rounds of expert checking. Duplicate and substandard questions were systematically removed. Ten ophthalmologists refined the explanations of each MCQ's correct answer. This was further adjudicated by three senior ophthalmologists. To illustrate BELO's utility, we evaluated six LLMs (OpenAI o1, o3-mini, GPT-4o, DeepSeek-R1, Llama-3-8B, and Gemini 1.5 Pro) using accuracy, macro-F1, and five text-generation metrics (ROUGE-L, BERTScore, BARTScore, METEOR, and AlignScore). In a further evaluation involving human experts, two ophthalmologists qualitatively reviewed 50 randomly selected outputs for accuracy, comprehensiveness, and completeness. BELO consists of 900 high-quality, expert-reviewed questions aggregated from five sources: BCSC (260), BioASQ (10), MedMCQA (572), MedQA (40), and PubMedQA (18). A public leaderboard has been established to promote transparent evaluation and reporting. Importantly, the BELO dataset will remain a hold-out, evaluation-only benchmark to ensure fair and reproducible comparisons of future models.
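Accuracy and macro-F1, two of the metrics named in the abstract, can be computed for multiple-choice answers as in the sketch below. It uses the standard definitions; the function names and the toy answer lists are hypothetical and are not taken from the BELO evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of questions answered correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy gold and predicted options for four MCQs (hypothetical data).
gold = ["A", "B", "A", "C"]
pred = ["A", "A", "A", "C"]
acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
```

Macro-F1 weights every answer option equally, so a model that never predicts a rare option is penalized even when its overall accuracy is high.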

CV Mar 25, 2021
Generative-Adversarial-Networks-based Ghost Recognition

Yuchen He, Yibing Chen, Sheng Luo et al.

Target recognition techniques play an important role in many fields. However, current methods based on target image information are affected by image quality and by the time cost of image reconstruction. In this paper, we propose a novel imaging-free target recognition method combining ghost imaging (GI) and generative adversarial networks (GAN). Following the mechanism of GI, a sequence of random speckle patterns is used to illuminate the target, and a bucket detector without spatial resolution receives the echo signal. The bucket signal sequence formed after continuous detections is arranged into a bucket signal array, which serves as the training sample for the GAN. A conditional GAN is then used to learn the mapping between the bucket signal array and the target category. In practical application, the speckle sequence from the training step is used to illuminate the target, and the resulting bucket signal array is input to the GAN for recognition. The proposed method can mitigate the problems of conventional recognition methods that are based on target image information, and it provides a degree of robustness to turbulence. Extensive experiments show that the proposed method achieves promising performance.
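The bucket signal in ghost imaging can be sketched as follows. This assumes the usual idealized model in which each bucket value is the total intensity of one speckle pattern weighted by the target's reflectivity or transmission; the function names, the 2x2 target, and the seed are hypothetical, and detector noise and turbulence are ignored.

```python
import random


def bucket_signals(target, speckles):
    """Return one bucket value per speckle pattern.

    Each value is the sum over all pixels of speckle intensity times target
    reflectivity, which is what a detector without spatial resolution records.
    """
    return [
        sum(s * t for s_row, t_row in zip(pattern, target) for s, t in zip(s_row, t_row))
        for pattern in speckles
    ]


def random_speckles(n_patterns, height, width, seed=0):
    """Generate a reproducible sequence of random speckle patterns."""
    rng = random.Random(seed)
    return [
        [[rng.random() for _ in range(width)] for _ in range(height)]
        for _ in range(n_patterns)
    ]


# Toy 2x2 binary target illuminated by 8 random patterns.
target = [[1, 0], [0, 1]]
signal_array = bucket_signals(target, random_speckles(8, 2, 2))
```

The resulting list plays the role of the bucket signal array that the abstract feeds to the classifier in place of a reconstructed image.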

NA Aug 22, 2015
HFVS: An Arbitrary High Order Flux Vector Splitting Method

Yibing Chen, Song Jiang, Na Liu

In this paper, a new scheme of arbitrary high-order accuracy in both space and time is proposed to solve hyperbolic conservation laws. Based on the idea of the flux vector splitting (FVS) scheme, we split all the space and time derivatives in the Taylor expansion of the numerical flux into two parts: one part with positive eigenvalues and another part with negative eigenvalues. Following a Lax-Wendroff procedure, all the time derivatives are then replaced by space derivatives, and the space derivatives are calculated from the WENO reconstruction polynomial. One of the most important advantages of this new scheme is that it is easy to implement. In addition, it should be pointed out that the procedure for calculating the space and time derivatives in the numerical flux can be used as a building block to extend current first-order schemes to very high-order accuracy in both space and time. Numerous numerical tests for linear and nonlinear hyperbolic conservation laws demonstrate that the new scheme is robust and can achieve high-order accuracy in both space and time.
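The splitting by eigenvalue sign can be illustrated with the first-order building block that the abstract proposes to extend. The sketch below assumes linear advection u_t + a u_x = 0 on a periodic grid, where the single eigenvalue a is split into a+ = max(a, 0) and a- = min(a, 0). It is not the HFVS scheme itself (no Taylor expansion, Lax-Wendroff procedure, or WENO reconstruction), and the name `fvs_step` is hypothetical.

```python
def fvs_step(u, a, dx, dt):
    """One first-order flux-vector-splitting step for u_t + a*u_x = 0 (periodic).

    The flux a*u is split into a right-going part a_plus*u, taken from the left
    cell, and a left-going part a_minus*u, taken from the right cell.
    """
    n = len(u)
    a_plus, a_minus = max(a, 0.0), min(a, 0.0)
    # flux[i] is the numerical flux at the interface between cells i and i+1.
    flux = [a_plus * u[i] + a_minus * u[(i + 1) % n] for i in range(n)]
    # Conservative update; flux[i - 1] wraps around for i = 0 (periodic grid).
    return [u[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


# With a*dt/dx = 1 the profile is shifted by exactly one cell per step.
shifted_right = fvs_step([0.0, 1.0, 0.0, 0.0], a=1.0, dx=1.0, dt=1.0)
shifted_left = fvs_step([0.0, 1.0, 0.0, 0.0], a=-1.0, dx=1.0, dt=1.0)
```

Because the update is written in flux-difference form, the sum of the cell values is conserved for any stable choice of dt.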