LG Mar 28, 2023
Transformer and Snowball Graph Convolution Learning for Brain functional network Classification
Jinlong Hu, Yangmin Huang, Shoubin Dong
Advanced deep learning methods, especially graph neural networks (GNNs), are increasingly expected to learn from brain functional network data and predict brain disorders. In this paper, we proposed novel Transformer and snowball encoding networks (TSEN) for brain functional network classification, which introduced a Transformer architecture with graph snowball connections into GNNs for learning whole-graph representations. TSEN combined graph snowball connections with a graph Transformer through snowball encoding layers, which enhanced its ability to capture multi-scale information and global patterns of brain functional networks. TSEN also introduced snowball graph convolution as the position embedding in the Transformer structure, a simple yet effective method for capturing local patterns naturally. We evaluated the proposed model on two large-scale brain functional network datasets, for autism spectrum disorder and major depressive disorder respectively, and the results demonstrated that TSEN outperformed state-of-the-art GNN models and graph-Transformer-based GNN models.
NC Mar 14
Fusion Learning from Dynamic Functional Connectivity: Combining the Amplitude and Phase of fMRI Signals to Identify Brain Disorders
Jinlong Hu, Jiatong Huang, Zijian Cai
Dynamic functional connectivity (dFC) derived from resting-state functional magnetic resonance imaging (fMRI) has been extensively utilized in brain science research. The sliding window correlation (SWC) method is a widely used approach for constructing dFC by computing correlation coefficients between amplitude time series of signals from pairs of brain regions. In this study, we propose an integrated approach that incorporates both amplitude and phase information of fMRI signals to improve the detection of brain disorders. Specifically, we introduce a multi-scale fusion learning framework, namely MSFL, which leverages two complementary dFC features derived from SWC and phase synchronization (PS). Here, SWC captures amplitude correlations, while PS measures phase coherence within dFC. We evaluated the efficacy of MSFL in classifying autism spectrum disorder and major depressive disorder using two publicly available datasets: ABIDE I and REST-meta-MDD, respectively. The results indicate that MSFL significantly outperforms existing comparative models. Moreover, we performed model explanation analysis using the SHAP framework, which showed that both types of dFC features from SWC and PS contribute to detecting brain disorders.
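The sliding window correlation step described above can be illustrated with a minimal sketch. It assumes each brain region is given as a plain list of amplitude samples, and the function names, the `window` and `step` parameters, and the handling of constant windows are illustrative choices rather than details taken from the paper. It covers only the SWC feature, not the phase synchronization feature or the MSFL model.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant window is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def sliding_window_correlation(signals, window, step=1):
    """Build one region-by-region correlation matrix per time window.

    signals: list of R time series (one per brain region), each of length T.
    Returns a list of R x R matrices (nested lists), ordered by window start.
    """
    n_regions = len(signals)
    n_time = len(signals[0])
    matrices = []
    for start in range(0, n_time - window + 1, step):
        end = start + window
        # Diagonal is set to 1.0 by convention (self-correlation).
        mat = [[1.0] * n_regions for _ in range(n_regions)]
        for i in range(n_regions):
            for j in range(i + 1, n_regions):
                r = pearson(signals[i][start:end], signals[j][start:end])
                mat[i][j] = mat[j][i] = r
        matrices.append(mat)
    return matrices
```

Each returned matrix is one snapshot of connectivity; the sequence of snapshots is what the abstract refers to as dFC built from amplitude correlations.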
LG Jan 26
GCFX: Generative Counterfactual Explanations for Deep Graph Models at the Model Level
Jinlong Hu, Jiacheng Liu
Deep graph learning models have demonstrated remarkable capabilities in processing graph-structured data and have been widely applied across various fields. However, their complex internal architectures and lack of transparency make it difficult to explain their decisions, resulting in opaque models that users find hard to understand and trust. In this paper, we explore model-level explanation techniques for deep graph learning models, aiming to provide users with a comprehensive understanding of the models' overall decision-making processes and underlying mechanisms. Specifically, we address the problem of counterfactual explanations for deep graph learning models by introducing a generative model-level counterfactual explanation approach called GCFX, which is based on deep graph generation. This approach generates a set of high-quality counterfactual explanations that reflect the model's global predictive behavior by leveraging an enhanced deep graph generation framework and a global summarization algorithm. GCFX features an architecture that combines dual encoders, structure-aware taggers, and Message Passing Neural Network decoders, enabling it to accurately learn the true latent distribution of input data and generate high-quality, closely related counterfactual examples. Subsequently, a global counterfactual summarization algorithm selects the most representative and comprehensive explanations from numerous candidate counterfactuals, providing broad insights into the model's global predictive patterns. Experiments on a synthetic dataset and several real-world datasets demonstrate that GCFX outperforms existing methods in terms of counterfactual validity and coverage while maintaining low explanation costs, thereby offering crucial support for enhancing the practicality and trustworthiness of global counterfactual explanations.
CL Oct 13, 2025
WebRouter: Query-specific Router via Variational Information Bottleneck for Cost-sensitive Web Agent
Tao Li, Jinlong Hu, Yang Wang et al.
LLM-brained web agents offer powerful capabilities for web automation but face a critical cost-performance trade-off. The challenge is amplified by web agents' inherently complex prompts that include goals, action histories, and environmental states, leading to degraded LLM ensemble performance. To address this, we introduce WebRouter, a novel query-specific router trained from an information-theoretic perspective. Our core contribution is a cost-aware Variational Information Bottleneck (ca-VIB) objective, which learns a compressed representation of the input prompt while explicitly penalizing the expected operational cost. Experiments on five real-world websites from the WebVoyager benchmark show that WebRouter reduces operational costs by a striking 87.8% compared to a GPT-4o baseline, while incurring only a 3.8% accuracy drop.
NC May 2, 2023
BrainNPT: Pre-training of Transformer networks for brain network classification
Jinlong Hu, Yangmin Huang, Nan Wang et al.
Deep learning methods have advanced quickly in brain imaging analysis over the past few years, but they are usually restricted by limited labeled data. Pre-training models on unlabeled data has shown promising improvements in feature learning in many domains, including natural language processing and computer vision. However, this technique is under-explored in brain network analysis. In this paper, we focused on pre-training methods with Transformer networks to leverage existing unlabeled data for brain functional network classification. First, we proposed a Transformer-based neural network, named BrainNPT, for brain functional network classification. The proposed method leveraged a <cls> token as a classification embedding vector for the Transformer model to effectively capture the representation of the brain network. Second, we proposed a pre-training framework for the BrainNPT model to leverage unlabeled brain network data to learn the structure information of brain networks. The results of classification experiments demonstrated that the BrainNPT model without pre-training achieved the best performance compared with the state-of-the-art models, and the BrainNPT model with pre-training strongly outperformed the state-of-the-art models. The pre-trained BrainNPT model improved accuracy by 8.75% compared with the model without pre-training. We further compared the pre-training strategies, analyzed the influence of the parameters of the model, and interpreted the trained model.
LG Jan 22, 2022
A Multi-modal Fusion Framework Based on Multi-task Correlation Learning for Cancer Prognosis Prediction
Kaiwen Tan, Weixian Huang, Xiaofeng Liu et al.
Morphological attributes from histopathological images and molecular profiles from genomic data are important information to drive diagnosis, prognosis, and therapy of cancers. By integrating these heterogeneous but complementary data, many multi-modal methods have been proposed to study the complex mechanisms of cancers, and most of them achieve comparable or better results than previous single-modal methods. However, these multi-modal methods are restricted to a single task (e.g., survival analysis or grade classification), and thus neglect the correlation between different tasks. In this study, we present a multi-modal fusion framework based on multi-task correlation learning (MultiCoFusion) for survival analysis and cancer grade classification, which combines the power of multiple modalities and multiple tasks. Specifically, a pre-trained ResNet-152 and a sparse graph convolutional network (SGCN) are used to learn the representations of histopathological images and mRNA expression data respectively. Then these representations are fused by a fully connected neural network (FCNN), which is also a multi-task shared network. Finally, the results of survival analysis and cancer grade classification are output simultaneously. The framework is trained with an alternating scheme. We systematically evaluate our framework using glioma datasets from The Cancer Genome Atlas (TCGA). Results demonstrate that MultiCoFusion learns better representations than traditional feature extraction methods. With the help of multi-task alternating learning, even simple multi-modal concatenation can achieve better performance than other deep learning and traditional methods. Multi-task learning can improve the performance of multiple tasks, not just one of them, and it is effective on both single-modal and multi-modal data.
LG Oct 19, 2021
AEFE: Automatic Embedded Feature Engineering for Categorical Features
Zhenyuan Zhong, Jie Yang, Yacong Ma et al.
The challenge of solving data mining problems in e-commerce applications such as recommendation systems (RS) and click-through rate (CTR) prediction is how to make inferences by constructing combinatorial features from a large number of categorical features while preserving the interpretability of the method. In this paper, we propose Automatic Embedded Feature Engineering (AEFE), an automatic feature engineering framework for representing categorical features, which consists of various components including custom paradigm feature construction and multiple feature selection. By selecting the potential field pairs intelligently and generating a series of interpretable combinatorial features, our framework can provide a set of unseen generated features for enhancing model performance and then assist data analysts in discovering the feature importance for particular data mining tasks. Furthermore, AEFE is implemented in a distributed manner using task parallelism, data sampling, and a search schema based on Matrix Factorization field combination, to optimize the performance and enhance the efficiency and scalability of the framework. Experiments conducted on several typical e-commerce datasets indicate that our method outperforms classical machine learning models and state-of-the-art deep learning models.
CV Apr 13, 2021
MESD: Exploring Optical Flow Assessment on Edge of Motion Objects with Motion Edge Structure Difference
Bin Liao, Jinlong Hu
Optical flow estimation has been assessed in various applications. In this paper, we propose a novel method named motion edge structure difference (MESD) to assess estimation errors of optical flow fields on the edges of motion objects. We implement comparison experiments for MESD by evaluating five representative optical flow algorithms on four popular benchmarks: MPI Sintel, Middlebury, KITTI 2012 and KITTI 2015. Our experimental results demonstrate that MESD can reasonably and discriminatively assess estimation errors of optical flow fields on motion edges. The results indicate that MESD could be a supplementary metric to existing general assessment metrics for evaluating optical flow algorithms in related computer vision applications.
IR Feb 2, 2019
An end-to-end Generative Retrieval Method for Sponsored Search Engine -- Decoding Efficiently into a Closed Target Domain
Yijiang Lian, Zhijie Chen, Jinlong Hu et al.
In this paper, we present a generative retrieval method for sponsored search engines, which uses neural machine translation (NMT) to generate keywords directly from the query. This method is completely end-to-end, which skips the query rewriting and relevance judging phases of traditional retrieval systems. Unlike standard machine translation, the target space in the retrieval setting is a constrained closed set, where only committed keywords should be generated. We present a Trie-based pruning technique in beam search to address this problem. The biggest challenge in deploying this method into a real industrial environment is the latency impact of running the decoder. Self-normalized training coupled with Trie-based dynamic pruning dramatically reduces the inference time, yielding a speedup of more than 20 times. We also devise a mixed online-offline serving architecture to reduce the latency and CPU consumption. To encourage the NMT to generate new keywords uncovered by the existing system, training data is carefully selected. This model has been successfully applied in Baidu's commercial search engine as a supplementary retrieval branch, which has brought a remarkable revenue improvement of more than 10 percent.
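The idea of restricting beam search to a closed keyword set with a Trie can be sketched as follows. This is a toy illustration under stated assumptions, not the deployed system: `score_fn` is a hypothetical stand-in for the NMT decoder's log-probabilities, keywords are split on whitespace, and `END`, `build_trie`, and `constrained_beam_search` are names invented for this example. Self-normalized training and the serving architecture are not covered.

```python
import math

END = "<eos>"  # hypothetical end-of-keyword marker used inside the trie


def build_trie(keywords):
    """Nested-dict trie over whitespace tokens; END marks a complete keyword."""
    root = {}
    for keyword in keywords:
        node = root
        for token in keyword.split():
            node = node.setdefault(token, {})
        node[END] = {}
    return root


def constrained_beam_search(trie, score_fn, beam_size=2, max_len=8):
    """Beam search that only expands tokens permitted by the trie.

    score_fn(prefix_tokens, token) -> log-probability of `token` given the
    prefix; here it stands in for a real decoder.
    Returns finished keywords as (text, log_prob) pairs, best first.
    """
    beams = [((), 0.0, trie)]  # (tokens so far, cumulative log-prob, trie node)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, logp, node in beams:
            # Pruning step: only children of the current trie node are scored,
            # so no hypothesis can leave the closed keyword set.
            for token, child in node.items():
                new_logp = logp + score_fn(tokens, token)
                if token == END:
                    finished.append((" ".join(tokens), new_logp))
                else:
                    candidates.append((tokens + (token,), new_logp, child))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.sort(key=lambda f: f[1], reverse=True)
    return finished
```

Because every expansion follows a trie edge, each finished hypothesis is guaranteed to be one of the committed keywords, and the decoder only needs scores for the few allowed tokens at each step instead of the full vocabulary.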
IR Dec 10, 2018
Top-N-Rank: A Scalable List-wise Ranking Method for Recommender Systems
Junjie Liang, Jinlong Hu, Shoubin Dong et al.
We propose Top-N-Rank, a novel family of list-wise Learning-to-Rank models for reliably recommending the N top-ranked items. The proposed models optimize a variant of the widely used discounted cumulative gain (DCG) objective function which differs from DCG in two important aspects: (i) it limits the evaluation of DCG to the top N items in the ranked lists, thereby eliminating the impact of low-ranked items on the learned ranking function; and (ii) it incorporates weights that allow the model to leverage multiple types of implicit feedback with differing levels of reliability or trustworthiness. Because the resulting objective function is non-smooth and hence challenging to optimize, we consider two smooth approximations of the objective function, using the traditional sigmoid function and the rectified linear unit (ReLU). We propose a family of learning-to-rank algorithms (Top-N-Rank) that work with any smooth objective function. Then, a more efficient variant, Top-N-Rank.ReLU, is introduced, which effectively exploits the properties of the ReLU function to reduce the computational complexity of Top-N-Rank from quadratic to linear in the average number of items rated by users. The results of our experiments using two widely used benchmarks, namely the MovieLens data set and the Amazon Video Games data set, demonstrate that: (i) the 'top-N truncation' of the objective function substantially improves the ranking quality of the top N recommendations; (ii) using the ReLU for smoothing the objective function yields significant improvement in both ranking quality and runtime compared to using the sigmoid; and (iii) Top-N-Rank.ReLU substantially outperforms the well-performing list-wise ranking methods in terms of ranking quality.
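To make the 'top-N truncation' concrete, here is a minimal sketch of DCG evaluated only over the first N ranked positions, with optional per-item weights. The gain form `2**rel - 1`, the `log2(rank + 1)` discount, and the way the weights enter the sum are common textbook choices assumed for illustration; the abstract does not specify the exact objective, and the smooth sigmoid and ReLU approximations are not shown.

```python
import math


def dcg_at_n(relevances, n, weights=None):
    """Truncated DCG over the first n positions of a ranked list.

    relevances[i] is the relevance of the item placed at rank i + 1.
    weights[i] is an optional reliability weight for that item (assumed form).
    Items ranked below position n contribute nothing to the score.
    """
    if weights is None:
        weights = [1.0] * len(relevances)
    total = 0.0
    top = zip(relevances[:n], weights[:n])
    for rank, (rel, weight) in enumerate(top, start=1):
        total += weight * (2 ** rel - 1) / math.log2(rank + 1)
    return total
```

Calling `dcg_at_n` with the same top-N items but different lower-ranked items returns the same value, which is the property the abstract describes as eliminating the impact of low-ranked items.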