Yuhe Wang

CV
9 papers
83 citations
Novelty 54%
AI Score 46

9 Papers

CL · Feb 5, 2023 · Code
Meta-Learning Siamese Network for Few-Shot Text Classification

Chengcheng Han, Yuhe Wang, Yingnan Fu et al. · pku

Few-shot learning has been used to tackle the problem of label scarcity in text classification, and meta-learning based methods such as prototypical networks (PROTO) have been shown to be effective. Despite the success of PROTO, three main problems remain: (1) it ignores the randomness of the sampled support sets when computing prototype vectors; (2) it disregards the importance of labeled samples; and (3) it constructs meta-tasks in a purely random manner. In this paper, we propose a Meta-Learning Siamese Network, namely Meta-SN, to address these issues. Specifically, instead of computing prototype vectors from the sampled support sets, Meta-SN utilizes external knowledge (e.g., class names and descriptive texts) for class labels, which is encoded as the low-dimensional embeddings of prototype vectors. In addition, Meta-SN presents a novel sampling strategy for constructing meta-tasks, which gives higher sampling probabilities to hard-to-classify samples. Extensive experiments are conducted on six benchmark datasets to show the clear superiority of Meta-SN over other state-of-the-art models. For reproducibility, all the datasets and code are provided at https://github.com/hccngu/Meta-SN.
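For readers unfamiliar with the PROTO baseline that this abstract critiques, the following is a minimal sketch of prototype-based classification: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. It is not the Meta-SN method itself. The 2-D embeddings, the class labels, and the function names `prototype` and `classify` are hypothetical and chosen only for illustration.

```python
import math

def prototype(vectors):
    """Mean of the support embeddings for one class (PROTO-style prototype)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def classify(query, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Hypothetical toy support set: two classes, two 2-D embeddings each.
support = {
    "sports": [[1.0, 0.0], [0.8, 0.2]],
    "finance": [[0.0, 1.0], [0.2, 0.8]],
}

prototypes = {label: prototype(vecs) for label, vecs in support.items()}
predicted = classify([0.9, 0.1], prototypes)
print(predicted)
```

Because each prototype here depends entirely on which support samples happen to be drawn, a different random support set would shift it, which is the first problem the abstract raises.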

CV · Aug 29, 2024 · Code
Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning Approach

Yifei Chen, Shenghao Zhu, Zhaojie Fang et al.

Alzheimer's Disease (AD) is a complex neurodegenerative disorder marked by memory loss, executive dysfunction, and personality changes. Early diagnosis is challenging due to subtle symptoms and varied presentations, and traditional unimodal diagnostic methods often lead to misdiagnosis because of their limited scope. This study introduces an advanced multimodal classification model that integrates clinical, cognitive, neuroimaging, and EEG data to enhance diagnostic accuracy. The model incorporates a feature tagger with a tabular data coding architecture and utilizes the TimesBlock module to capture intricate temporal patterns in electroencephalogram (EEG) data. By employing a Cross-modal Attention Aggregation module, the model effectively fuses Magnetic Resonance Imaging (MRI) spatial information with EEG temporal data, significantly improving the distinction between AD, Mild Cognitive Impairment, and Normal Cognition. In addition, we have constructed the first AD classification dataset that includes three modalities: EEG, MRI, and tabular data. Our innovative approach aims to facilitate early diagnosis and intervention, potentially slowing the progression of AD. The source code and our private ADMC dataset are available at https://github.com/JustlfC03/MSTNet.

NA · Oct 29, 2018
Generalized Multiscale Multicontinuum Model for Fractured Vuggy Carbonate Reservoirs

Min Wang, Siu Wun Cheung, Eric T. Chung et al.

Simulating flow in a highly heterogeneous reservoir with multiscale characteristics can be computationally demanding. To tackle this problem, we propose a numerical scheme coupling the Generalized Multiscale Finite Element Method (GMsFEM) with a triple-continuum model, aimed at a faster simulator framework that can explicitly represent the interactions among different continua. To further enrich the descriptive ability of our proposed model, we incorporate the Discrete Fracture Model (DFM) to model the local effects of discrete fractures. In the proposed model, GMsFEM, as an advanced model reduction technique, enables capturing the multiscale flow dynamics. This is accomplished by systematically generating an approximation space through solving a series of local snapshot and spectral problems. The resulting eigenfunctions can pass the local features to the global level when acting as basis functions in coarse problems. Our goal in this paper is to further improve the accuracy of flow simulation in complicated reservoirs, especially for the case when multiple discrete fractures are located in a single coarse neighborhood and multiscale finite element methods fail. Together with a detailed description of the model, several numerical experiments are conducted to confirm the success of our proposed method. A rigorous proof is also given from the standpoint of numerical analysis.

76.3 · NA · Apr 10
Solving and learning advective multiscale Darcian dynamics with the Neural Basis Method

Yuhe Wang, Min Wang

Physics-governed models are increasingly paired with machine learning for accelerated predictions, yet most "physics-informed" formulations treat the governing equations as a penalty loss whose scale and meaning are set by heuristic balancing. This blurs operator structure, thereby confounding solution approximation error with governing-equation enforcement error and making the solving and learning progress hard to interpret and control. Here we introduce the Neural Basis Method, a projection-based formulation that couples a predefined, physics-conforming neural basis space with an operator-induced residual metric to obtain a well-conditioned deterministic minimization. Stability and reliability then hinge on this metric: the residual is not merely an optimization objective but a computable certificate tied to approximation and enforcement, remaining stable under basis enrichment and yielding reduced coordinates that are learnable across parametric instances. We use advective multiscale Darcian dynamics as a concrete demonstration of this broader point. Our method produces accurate and robust solutions in single solves and enables fast and effective parametric inference with operator learning.

LG · Sep 4, 2024
Reservoir Static Property Estimation Using Nearest-Neighbor Neural Network

Yuhe Wang

This note presents an approach for estimating the spatial distribution of static properties in reservoir modeling using a nearest-neighbor neural network. The method leverages the strengths of neural networks in approximating complex, non-linear functions, particularly for tasks involving spatial interpolation. It incorporates a nearest-neighbor algorithm to capture local spatial relationships between data points and introduces randomization to quantify the uncertainty inherent in the interpolation process. This approach addresses the limitations of traditional geostatistical methods, such as Inverse Distance Weighting (IDW) and Kriging, which often fail to model the complex non-linear dependencies in reservoir data. By integrating spatial proximity and uncertainty quantification, the proposed method can improve the accuracy of static property predictions like porosity and permeability.
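As context for the traditional baselines named above, here is a minimal sketch of Inverse Distance Weighting (IDW), which estimates a value at a target location as a distance-weighted average of known samples. It illustrates the baseline only, not the nearest-neighbor neural network proposed in the note. The well coordinates, porosity values, default `power=2.0`, and the function name `idw` are assumptions made for illustration.

```python
import math

def idw(samples, target, power=2.0):
    """Inverse Distance Weighting estimate at `target`.

    `samples` is a list of ((x, y), value) pairs. Each weight is
    1 / distance**power, so nearer samples dominate the estimate.
    """
    numerator = 0.0
    denominator = 0.0
    for point, value in samples:
        distance = math.dist(point, target)
        if distance == 0.0:
            # The target coincides with a sample: return it exactly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical porosity measurements at two well locations.
wells = [((0.0, 0.0), 0.10), ((2.0, 0.0), 0.20)]
estimate = idw(wells, (1.0, 0.0))
print(estimate)
```

The weights depend only on distance, so the estimate is always a fixed weighted average of the observed values; this is the kind of limitation in modeling non-linear dependencies that the note attributes to traditional methods.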

LG · Aug 28, 2024
A Hybrid Framework for Spatial Interpolation: Merging Data-driven with Domain Knowledge

Cong Zhang, Shuyi Du, Hongqing Song et al.

Estimating spatially distributed information through the interpolation of scattered observation datasets often overlooks the critical role of domain knowledge in understanding spatial dependencies. Additionally, the features of these data sets are typically limited to the spatial coordinates of the scattered observation locations. In this paper, we propose a hybrid framework that integrates data-driven spatial dependency feature extraction with rule-assisted spatial dependency function mapping to augment domain knowledge. We demonstrate the superior performance of our framework in two comparative application scenarios, highlighting its ability to capture more localized spatial features in the reconstructed distribution fields. Furthermore, we underscore its potential to enhance nonlinear estimation capabilities through the application of transformed fuzzy rules and to quantify the inherent uncertainties associated with the observation data sets. Our framework introduces an innovative approach to spatial information estimation by synergistically combining observational data with rule-assisted domain knowledge.

CV · Jul 29, 2024
ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2

Wenjun Huang, Jiakai Pan, Jiahao Tang et al.

Multimodal Large Language Models (MLLMs) have attracted much attention for their versatility. However, traditional Transformer architectures incur significant overhead due to their quadratic computational complexity. To address this issue, we introduce ML-Mamba, a multimodal language model that utilizes the recent and efficient Mamba-2 model for inference. Mamba-2 is known for its linear scalability and fast processing of long sequences. We replace the Transformer-based backbone with a pre-trained Mamba-2 model and explore methods for integrating 2D visual selective scanning mechanisms into multimodal learning, while also trying various visual encoders and Mamba-2 model variants. Our extensive experiments on various multimodal benchmarks demonstrate the competitive performance of ML-Mamba and highlight the potential of state space models in multimodal tasks. In summary: (1) we empirically explore how to effectively apply the 2D vision selective scan mechanism for multimodal learning, and we propose a novel multimodal connector called the Mamba-2 Scan Connector (MSC), which enhances representational capabilities; (2) ML-Mamba achieves performance comparable to state-of-the-art methods such as TinyLaVA and MobileVLM v2 through its linear sequential modeling, while offering faster inference speed; (3) compared to multimodal models utilizing Mamba-1, the Mamba-2-based ML-Mamba exhibits superior inference performance and effectiveness.

LG · Jul 16, 2024
Are Linear Regression Models White Box and Interpretable?

Ahmed M Salih, Yuhe Wang

Explainable artificial intelligence (XAI) is a set of tools and algorithms that are applied to or embedded in machine learning models to understand and interpret them. They are recommended especially for complex or advanced models, including deep neural networks, because these models are not interpretable from a human point of view. On the other hand, simple models, including linear regression, are easy to implement, have lower computational complexity, and produce output that is easy to visualize. The common notion in the literature is that simple models, including linear regression, are "white box" because they are more interpretable and easier to understand. This is based on the idea that linear regression models have several favorable properties, including showing the effect of each feature in the model and whether it affects the model output positively or negatively. Moreover, the uncertainty of the model can be measured or estimated using the confidence interval. However, we argue that this perception is not accurate and that linear regression models are neither easy to interpret nor easy to understand when common XAI metrics and the challenges they may face are considered. These include linearity, local explanation, multicollinearity, covariates, normalization, uncertainty, feature contribution, and fairness. Consequently, we recommend that so-called simple models be treated the same as complex models when it comes to explainability and interpretability.
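One of the listed challenges, multicollinearity, can be illustrated with a small sketch: when two features are perfectly collinear, very different coefficient vectors fit the data equally well, so the sign and size of a single coefficient say little about that feature's effect. This is an illustrative toy constructed here, not an experiment from the paper. The data, the candidate coefficients, and the function names `predict` and `sse` are all hypothetical.

```python
def predict(coefs, rows):
    """Linear model without intercept: dot product of coefficients and features."""
    return [sum(c * x for c, x in zip(coefs, row)) for row in rows]

def sse(coefs, rows, targets):
    """Sum of squared errors of the linear model on the data."""
    return sum((p - t) ** 2 for p, t in zip(predict(coefs, rows), targets))

# Hypothetical data in which the second feature duplicates the first.
rows = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
targets = [2.0, 4.0, 6.0]

# Three coefficient vectors that tell contradictory stories about each feature.
candidates = [(2.0, 0.0), (0.0, 2.0), (5.0, -3.0)]
errors = [sse(c, rows, targets) for c in candidates]
print(errors)
```

All three candidates fit the targets exactly, yet the second feature appears irrelevant, solely responsible, or negatively associated depending on which solution is reported.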

CV · Nov 27, 2025
IE-SRGS: An Internal-External Knowledge Fusion Framework for High-Fidelity 3D Gaussian Splatting Super-Resolution

Xiang Feng, Tieshi Zhong, Shuo Chang et al.

Reconstructing high-resolution (HR) 3D Gaussian Splatting (3DGS) models from low-resolution (LR) inputs remains challenging due to the lack of fine-grained textures and geometry. Existing methods typically rely on pre-trained 2D super-resolution (2DSR) models to enhance textures, but suffer from 3D Gaussian ambiguity arising from cross-view inconsistencies and domain gaps inherent in 2DSR models. We propose IE-SRGS, a novel 3DGS SR paradigm that addresses this issue by jointly leveraging the complementary strengths of external 2DSR priors and internal 3DGS features. Specifically, we use 2DSR and depth estimation models to generate HR images and depth maps as external knowledge, and employ multi-scale 3DGS models to produce cross-view consistent, domain-adaptive counterparts as internal knowledge. A mask-guided fusion strategy is introduced to integrate these two sources and synergistically exploit their complementary strengths, effectively guiding the 3D Gaussian optimization toward high-fidelity reconstruction. Extensive experiments on both synthetic and real-world benchmarks show that IE-SRGS consistently outperforms state-of-the-art methods in both quantitative accuracy and visual fidelity.