OCAug 4, 2022
Memetic algorithms for Spatial Partitioning problemsSubhodip Biswas, Fanglan Chen, Zhiqian Chen et al.
Spatial optimization problems (SOPs) are characterized by spatial relationships governing the decision variables, objectives, and/or constraint functions. In this article, we focus on a specific type of SOP called spatial partitioning, which is a combinatorial problem due to the presence of discrete spatial units. Exact optimization methods do not scale with the size of the problem, especially within practicable time limits. This motivated us to develop population-based metaheuristics for solving such SOPs. However, the search operators employed by these population-based methods are mostly designed for real-parameter continuous optimization problems. For adapting these methods to SOPs, we apply domain knowledge in designing spatially-aware search operators for efficiently searching through the discrete search space while preserving the spatial constraints. To this end, we put forward a simple yet effective algorithm called swarm-based spatial memetic algorithm (SPATIAL) and test it on the school (re)districting problem. Detailed experimental investigations are performed on real-world datasets to evaluate the performance of SPATIAL. Besides, ablation studies are performed to understand the role of the individual components of SPATIAL. Additionally, we discuss how SPATIAL~is helpful in the real-life planning process and its applicability to different scenarios and motivate future research directions.
SIAug 7, 2023Code
XFlow: Benchmarking Flow Behaviors over GraphsZijian Zhang, Zonghan Zhang, Zhiqian Chen
The occurrence of diffusion on a graph is a prevalent and significant phenomenon, as evidenced by the spread of rumors, influenza-like viruses, smart grid failures, and similar events. Comprehending the behaviors of flow is a formidable task, due to the intricate interplay between the distribution of seeds that initiate flow propagation, the propagation model, and the topology of the graph. The study of networks encompasses a diverse range of academic disciplines, including mathematics, physics, social science, and computer science. This interdisciplinary nature of network research is characterized by a high degree of specialization and compartmentalization, and the cooperation facilitated by them is inadequate. From a machine learning standpoint, there is a deficiency in a cohesive platform for assessing algorithms across various domains. One of the primary obstacles to current research in this field is the absence of a comprehensive curated benchmark suite to study the flow behaviors under network scenarios. To address this disparity, we propose the implementation of a novel benchmark suite that encompasses a variety of tasks, baseline models, graph datasets, and evaluation tools. In addition, we present a comprehensive analytical framework that offers a generalized approach to numerous flow-related tasks across diverse domains, serving as a blueprint and roadmap. Drawing upon the outcomes of our empirical investigation, we analyze the advantages and disadvantages of current foundational models, and we underscore potential avenues for further study. The datasets, code, and baseline models have been made available for the public at: https://github.com/XGraphing/XFlow
AIJun 8, 2022
Sampling-based techniques for designing school boundariesSubhodip Biswas, Fanglan Chen, Zhiqian Chen et al.
Recently, an increasing number of researchers, especially in the realm of political redistricting, have proposed sampling-based techniques to generate a subset of plans from the vast space of districting plans. These techniques have been increasingly adopted by U.S. courts of law and independent commissions as a tool for identifying partisan gerrymanders. Motivated by these recent developments, we develop a set of similar sampling techniques for designing school boundaries based on the flip proposal. Note that the flip proposal here refers to the change in the districting plan by a single assignment. These sampling-based techniques serve a dual purpose. They can be used as a baseline for comparing redistricting algorithms based on local search. Additionally, these techniques can help to infer the problem characteristics that may be further used for developing efficient redistricting methods. We empirically touch on both these aspects in regards to the problem of school redistricting.
LGNov 27, 2022
Deep representation learning: Fundamentals, Perspectives, Applications, and Open ChallengesKourosh T. Baghaei, Amirreza Payandeh, Pooya Fayyazsanavi et al.
Machine Learning algorithms have had a profound impact on the field of computer science over the past few decades. These algorithms performance is greatly influenced by the representations that are derived from the data in the learning process. The representations learned in a successful learning process should be concise, discrete, meaningful, and able to be applied across a variety of tasks. A recent effort has been directed toward developing Deep Learning models, which have proven to be particularly effective at capturing high-dimensional, non-linear, and multi-modal characteristics. In this work, we discuss the principles and developments that have been made in the process of learning representations, and converting them into desirable applications. In addition, for each framework or model, the key issues and open challenges, as well as the advantages, are examined.
CLMay 20
LamPO: A Lambda Style Policy Optimization for Reasoning Language ModelsZhe Yuan, Yipeng Zhou, Jinghan Li et al.
Reinforcement learning with verifiable rewards (RLVR) has become an effective paradigm for improving reasoning language models on tasks such as mathematics, coding, and scientific question answering. However, widely used group-relative objectives, such as GRPO, summarize each sampled group with scalar statistics and therefore discard fine-grained relational information among candidate responses. This weakens credit assignment under sparse outcome rewards, especially when multiple generated solutions differ only subtly in reasoning quality. We propose \textbf{LamPO}, a \textbf{Lambda-Style Policy Optimization} method that replaces scalar group advantages with a \emph{Pairwise Decomposed Advantage}. LamPO aggregates pairwise reward gaps within each response group and modulates each comparison by a confidence-aware weight computed from sequence log-probability differences, while retaining the critic-free and clipped-update structure of PPO-style optimization. When reference solutions are available, we further add a lightweight ROUGE-L-based dense auxiliary reward to reduce reward sparsity. Experiments on AIME24, AIME25, MATH-500, and GPQA-Diamond with Qwen3-1.7B, Qwen3-4B, and Phi-4-mini show that LamPO consistently improves over GRPO and recent RLVR variants, with more stable training dynamics and better sample efficiency.
CLMay 19
LambdaPO: A Lambda Style Policy Optimization for Reasoning Language ModelsZhe Yuan, Yipeng Zhou, Jinghan Li et al.
Group Relative Policy Optimization(GRPO) has become a cornerstone of modern reinforcement learning alignment, prized for its efficacy in foregoing an explicit value-critic by leveraging reward normalization across sampled trajectory cohorts. However, the method's reliance on a monolithic statistical baseline, such as the group mean, collapses the relational topology of the trajectory space into a single scalar, thereby erasing the fine-grained preference information essential for navigating complex, rank-sensitive reward landscapes. To address this issue, we introduce a novel framework, Lambda Policy Optimization (LambdaPO), that addresses this information-theoretic bottleneck by re-conceptualizing advantage estimation from a scalar value to a decomposed, pairwise preference structure. Specifically, the advantage for any given trajectory is formulated as the integrated sum of reward differentials against all peers in its cohort, where each pairwise comparison is dynamically attenuated by the policy's own probabilistic confidence in the established preference. To further mitigate the sparsity of binary outcome supervision, we augment the objective with a semantic density reward, derived from the precision-recall alignment between generated reasoning traces and ground-truth solutions. As a result, our method can mine more fine-grained optimization signals from a group of rollouts, guiding the LLM to a better optima. Experimental results across challenging math reasoning and question-answering tasks demonstrates that LambdaPO improves performance compared to the baseline methods.
IRApr 20
MasterSet: A Large-Scale Benchmark for Must-Cite Citation Recommendation in the AI/ML LiteratureMd Toyaha Rahman Ratul, Zhiqian Chen, Kaiqun Fu et al.
The explosive growth of AI and machine learning literature -- with venues like NeurIPS and ICLR now accepting thousands of papers annually -- has made comprehensive citation coverage increasingly difficult for researchers. While citation recommendation has been studied for over a decade, existing systems primarily focus on broad relevance rather than identifying the critical set of ``must-cite'' papers: direct experimental baselines, foundational methods, and core dependencies whose omission would misrepresent a contribution's novelty or undermine reproducibility. We introduce MasterSet, a large-scale benchmark specifically designed to evaluate must-cite recommendation in the AI/ML domain. MasterSet incorporates over 150,000 papers collected from official conference proceedings/websites of 15 leading venues, serving as a comprehensive candidate pool for retrieval. We annotate citations with a three-tier labeling scheme: (I) experimental baseline status, (II) core relevance (1--5 scale), and (III) intra-paper mention frequency. Our annotation pipeline leverages an LLM-based judge, validated by human experts on a stratified sample. The benchmark task requires retrieving must-cite papers from the candidate pool given only a query paper's title and abstract, evaluated by Recall@$K$. We establish baselines using sparse retrieval, dense scientific embeddings, and graph-based methods, demonstrating that must-cite retrieval remains a challenging open problem.
SIMar 25, 2024Code
Graph Bayesian Optimization for Multiplex Influence MaximizationZirui Yuan, Minglai Shao, Zhiqian Chen
Influence maximization (IM) is the problem of identifying a limited number of initial influential users within a social network to maximize the number of influenced users. However, previous research has mostly focused on individual information propagation, neglecting the simultaneous and interactive dissemination of multiple information items. In reality, when users encounter a piece of information, such as a smartphone product, they often associate it with related products in their minds, such as earphones or computers from the same brand. Additionally, information platforms frequently recommend related content to users, amplifying this cascading effect and leading to multiplex influence diffusion. This paper first formulates the Multiplex Influence Maximization (Multi-IM) problem using multiplex diffusion models with an information association mechanism. In this problem, the seed set is a combination of influential users and information. To effectively manage the combinatorial complexity, we propose Graph Bayesian Optimization for Multi-IM (GBIM). The multiplex diffusion process is thoroughly investigated using a highly effective global kernelized attention message-passing module. This module, in conjunction with Bayesian linear regression (BLR), produces a scalable surrogate model. A data acquisition module incorporating the exploration-exploitation trade-off is developed to optimize the seed set further. Extensive experiments on synthetic and real-world datasets have proven our proposed framework effective. The code is available at https://github.com/zirui-yuan/GBIM.
SIJul 16, 2022
Understanding Influence Maximization via Higher-Order DecompositionZonghan Zhang, Zhiqian Chen
Given its vast application on online social networks, Influence Maximization (IM) has garnered considerable attention over the last couple of decades. Due to the intricacy of IM, most current research concentrates on estimating the first-order contribution of the nodes to select a seed set, disregarding the higher-order interplay between different seeds. Consequently, the actual influence spread frequently deviates from expectations, and it remains unclear how the seed set quantitatively contributes to this deviation. To address this deficiency, this work dissects the influence exerted on individual seeds and their higher-order interactions utilizing the Sobol index, a variance-based sensitivity analysis. To adapt to IM contexts, seed selection is phrased as binary variables and split into distributions of varying orders. Based on our analysis with various Sobol indices, an IM algorithm dubbed SIM is proposed to improve the performance of current IM algorithms by over-selecting nodes followed by strategic pruning. A case study is carried out to demonstrate that the explanation of the impact effect can dependably identify the key higher-order interactions among seeds. SIM is empirically proved to be superior in effectiveness and competitive in efficiency by experiments on synthetic and real-world graphs.
LGMar 25, 2024Code
Multiple-Source Localization from a Single-Snapshot Observation Using Graph Bayesian OptimizationZonghan Zhang, Zijian Zhang, Zhiqian Chen
Due to the significance of its various applications, source localization has garnered considerable attention as one of the most important means to confront diffusion hazards. Multi-source localization from a single-snapshot observation is especially relevant due to its prevalence. However, the inherent complexities of this problem, such as limited information, interactions among sources, and dependence on diffusion models, pose challenges to resolution. Current methods typically utilize heuristics and greedy selection, and they are usually bonded with one diffusion model. Consequently, their effectiveness is constrained. To address these limitations, we propose a simulation-based method termed BOSouL. Bayesian optimization (BO) is adopted to approximate the results for its sample efficiency. A surrogate function models uncertainty from the limited information. It takes sets of nodes as the input instead of individual nodes. BOSouL can incorporate any diffusion model in the data acquisition process through simulations. Empirical studies demonstrate that its performance is robust across graph structures and diffusion models. The code is available at https://github.com/XGraph-Team/BOSouL.
AIMay 26, 2024
Network Interdiction Goes NeuralLei Zhang, Zhiqian Chen, Chang-Tien Lu et al.
Network interdiction problems are combinatorial optimization problems involving two players: one aims to solve an optimization problem on a network, while the other seeks to modify the network to thwart the first player's objectives. Such problems typically emerge in an attacker-defender context, encompassing areas such as military operations, disease spread analysis, and communication network management. The primary bottleneck in network interdiction arises from the high time complexity of using conventional exact solvers and the challenges associated with devising efficient heuristic solvers. GNNs, recognized as a cutting-edge methodology, have shown significant effectiveness in addressing single-level CO problems on graphs, such as the traveling salesman problem, graph matching, and graph edit distance. Nevertheless, network interdiction presents a bi-level optimization challenge, which current GNNs find difficult to manage. To address this gap, we represent network interdiction problems as Mixed-Integer Linear Programming (MILP) instances, then apply a multipartite GNN with sufficient representational capacity to learn these formulations. This approach ensures that our neural network is more compatible with the mathematical algorithms designed to solve network interdiction problems, resulting in improved generalization. Through two distinct tasks, we demonstrate that our proposed method outperforms theoretical baseline models and provides advantages over traditional exact solvers.
LGJul 18, 2022
Demystifying Graph Convolution with a Simple ConcatenationZhiqian Chen, Zonghan Zhang
Graph convolution (GConv) is a widely used technique that has been demonstrated to be extremely effective for graph learning applications, most notably node categorization. On the other hand, many GConv-based models do not quantify the effect of graph topology and node features on performance, and are even surpassed by some models that do not consider graph structure or node properties. We quantify the information overlap between graph topology, node features, and labels in order to determine graph convolution's representation power in the node classification task. In this work, we first determine the linear separability of graph convoluted features using analysis of variance. Mutual information is used to acquire a better understanding of the possible non-linear relationship between graph topology, node features, and labels. Our theoretical analysis demonstrates that a simple and efficient graph operation that concatenates only graph topology and node properties consistently outperforms conventional graph convolution, especially in the heterophily case. Extensive empirical research utilizing a synthetic dataset and real-world benchmarks demonstrates that graph concatenation is a simple but more flexible alternative to graph convolution.
LGAug 10, 2025
MOTGNN: Interpretable Graph Neural Networks for Multi-Omics Disease ClassificationTiantian Yang, Zhiqian Chen
Integrating multi-omics data, such as DNA methylation, mRNA expression, and microRNA (miRNA) expression, offers a comprehensive view of the biological mechanisms underlying disease. However, the high dimensionality and complex interactions among omics layers present major challenges for predictive modeling. We propose Multi-Omics integration with Tree-generated Graph Neural Network (MOTGNN), a novel and interpretable framework for binary disease classification. MOTGNN employs eXtreme Gradient Boosting (XGBoost) to perform omics-specific supervised graph construction, followed by modality-specific Graph Neural Networks (GNNs) for hierarchical representation learning, and a deep feedforward network for cross-omics integration. On three real-world disease datasets, MOTGNN outperforms state-of-the-art baselines by 5-10% in accuracy, ROC-AUC, and F1-score, and remains robust to severe class imbalance (e.g., 87.2% vs. 33.4% F1 on imbalanced data). The model maintains computational efficiency through sparse graphs (2.1-2.8 edges per node) and provides built-in interpretability, revealing both top-ranked biomarkers and the relative contributions of each omics modality. These results highlight MOTGNN's potential to improve both predictive accuracy and interpretability in multi-omics disease modeling.
LGFeb 28, 2024
Towards Interpreting Multi-Objective Feature AssociationsNisha Pillai, Ganga Gireesan, Michael J. Rothrock et al.
Understanding how multiple features are associated and contribute to a specific objective is as important as understanding how each feature contributes to a particular outcome. Interpretability of a single feature in a prediction may be handled in multiple ways; however, in a multi-objective prediction, it is difficult to obtain interpretability of a combination of feature values. To address this issue, we propose an objective specific feature interaction design using multi-labels to find the optimal combination of features in agricultural settings. One of the novel aspects of this design is the identification of a method that integrates feature explanations with global sensitivity analysis in order to ensure combinatorial optimization in multi-objective settings. We have demonstrated in our preliminary experiments that an approximate combination of feature values can be found to achieve the desired outcome using two agricultural datasets: one with pre-harvest poultry farm practices for multi-drug resistance presence, and one with post-harvest poultry farm practices for food-borne pathogens. In our combinatorial optimization approach, all three pathogens are taken into consideration simultaneously to account for the interaction between conditions that favor different types of pathogen growth. These results indicate that explanation-based approaches are capable of identifying combinations of features that reduce pathogen presence in fewer iterations than a baseline.
LGFeb 28, 2024
Deep Sensitivity Analysis for Objective-Oriented Combinatorial OptimizationGanga Gireesan, Nisha Pillai, Michael J Rothrock et al.
Pathogen control is a critical aspect of modern poultry farming, providing important benefits for both public health and productivity. Effective poultry management measures to reduce pathogen levels in poultry flocks promote food safety by lowering risks of food-borne illnesses. They also support animal health and welfare by preventing infectious diseases that can rapidly spread and impact flock growth, egg production, and overall health. This study frames the search for optimal management practices that minimize the presence of multiple pathogens as a combinatorial optimization problem. Specifically, we model the various possible combinations of management settings as a solution space that can be efficiently explored to identify configurations that optimally reduce pathogen levels. This design incorporates a neural network feedback-based method that combines feature explanations with global sensitivity analysis to ensure combinatorial optimization in multiobjective settings. Our preliminary experiments have promising results when applied to two real-world agricultural datasets. While further validation is still needed, these early experimental findings demonstrate the potential of the model to derive targeted feature interactions that adaptively optimize pathogen control under varying real-world constraints.
QMSep 19, 2025
TF-DWGNet: A Directed Weighted Graph Neural Network with Tensor Fusion for Multi-Omics Cancer Subtype ClassificationTiantian Yang, Zhiqian Chen
Integration and analysis of multi-omics data provide valuable insights for cancer subtype classification. However, such data are inherently heterogeneous, high-dimensional, and exhibit complex intra- and inter-modality dependencies. Recent advances in graph neural networks (GNNs) offer powerful tools for modeling such structure. Yet, most existing methods rely on prior knowledge or predefined similarity networks to construct graphs, which are often undirected or unweighted, failing to capture the directionality and strength of biological interactions. Interpretability at both the modality and feature levels also remains limited. To address these challenges, we propose TF-DWGNet, a novel Graph Neural Network framework that combines tree-based Directed Weighted graph construction with Tensor Fusion for multiclass cancer subtype classification. TF-DWGNet introduces two key innovations: a supervised tree-based approach for constructing directed, weighted graphs tailored to each omics modality, and a tensor fusion mechanism that captures unimodal, bimodal, and trimodal interactions using low-rank decomposition for efficiency. TF-DWGNet enables modality-specific representation learning, joint embedding fusion, and interpretable subtype prediction. Experiments on real-world cancer datasets show that TF-DWGNet consistently outperforms state-of-the-art baselines across multiple metrics and statistical tests. Moreover, it provides biologically meaningful insights by ranking influential features and modalities. These results highlight TF-DWGNet's potential for effective and interpretable multi-omics integration in cancer research.
LGJul 23, 2025
Filter-And-Refine: A MLLM Based Cascade System for Industrial-Scale Video Content ModerationZixuan Wang, Jinghao Shi, Hanzhong Liang et al.
Effective content moderation is essential for video platforms to safeguard user experience and uphold community standards. While traditional video classification models effectively handle well-defined moderation tasks, they struggle with complicated scenarios such as implicit harmful content and contextual ambiguity. Multimodal large language models (MLLMs) offer a promising solution to these limitations with their superior cross-modal reasoning and contextual understanding. However, two key challenges hinder their industrial adoption. First, the high computational cost of MLLMs makes full-scale deployment impractical. Second, adapting generative models for discriminative classification remains an open research problem. In this paper, we first introduce an efficient method to transform a generative MLLM into a multimodal classifier using minimal discriminative training data. To enable industry-scale deployment, we then propose a router-ranking cascade system that integrates MLLMs with a lightweight router model. Offline experiments demonstrate that our MLLM-based approach improves F1 score by 66.50% over traditional classifiers while requiring only 2% of the fine-tuning data. Online evaluations show that our system increases automatic content moderation volume by 41%, while the cascading deployment reduces computational cost to only 1.5% of direct full-scale deployment.
IRJun 30, 2025
Embedding-based Retrieval in Multimodal Content ModerationHanzhong Liang, Jinghao Shi, Xiang Shen et al.
Video understanding plays a fundamental role for content moderation on short video platforms, enabling the detection of inappropriate content. While classification remains the dominant approach for content moderation, it often struggles in scenarios requiring rapid and cost-efficient responses, such as trend adaptation and urgent escalations. To address this issue, we introduce an Embedding-Based Retrieval (EBR) method designed to complement traditional classification approaches. We first leverage a Supervised Contrastive Learning (SCL) framework to train a suite of foundation embedding models, including both single-modal and multi-modal architectures. Our models demonstrate superior performance over established contrastive learning methods such as CLIP and MoCo. Building on these embedding models, we design and implement the embedding-based retrieval system that integrates embedding generation and video retrieval to enable efficient and effective trend handling. Comprehensive offline experiments on 25 diverse emerging trends show that EBR improves ROC-AUC from 0.85 to 0.99 and PR-AUC from 0.35 to 0.95. Further online experiments reveal that EBR increases action rates by 10.32% and reduces operational costs by over 80%, while also enhancing interpretability and flexibility compared to classification-based solutions.
QMApr 29, 2024
Bayesian-Guided Generation of Synthetic Microbiomes with Minimized PathogenicityNisha Pillai, Bindu Nanduri, Michael J Rothrock et al.
Synthetic microbiomes offer new possibilities for modulating microbiota, to address the barriers in multidtug resistance (MDR) research. We present a Bayesian optimization approach to enable efficient searching over the space of synthetic microbiome variants to identify candidates predictive of reduced MDR. Microbiome datasets were encoded into a low-dimensional latent space using autoencoders. Sampling from this space allowed generation of synthetic microbiome signatures. Bayesian optimization was then implemented to select variants for biological screening to maximize identification of designs with restricted MDR pathogens based on minimal samples. Four acquisition functions were evaluated: expected improvement, upper confidence bound, Thompson sampling, and probability of improvement. Based on each strategy, synthetic samples were prioritized according to their MDR detection. Expected improvement, upper confidence bound, and probability of improvement consistently produced synthetic microbiome candidates with significantly fewer searches than Thompson sampling. By combining deep latent space mapping and Bayesian learning for efficient guided screening, this study demonstrated the feasibility of creating bespoke synthetic microbiomes with customized MDR profiles.
LGNov 9, 2021
Deep diffusion-based forecasting of COVID-19 by incorporating network-level mobility informationPadmaksha Roy, Shailik Sarkar, Subhodip Biswas et al.
Modeling the spatiotemporal nature of the spread of infectious diseases can provide useful intuition in understanding the time-varying aspect of the disease spread and the underlying complex spatial dependency observed in people's mobility patterns. Besides, the county level multiple related time series information can be leveraged to make a forecast on an individual time series. Adding to this challenge is the fact that real-time data often deviates from the unimodal Gaussian distribution assumption and may show some complex mixed patterns. Motivated by this, we develop a deep learning-based time-series model for probabilistic forecasting called Auto-regressive Mixed Density Dynamic Diffusion Network(ARM3Dnet), which considers both people's mobility and disease spread as a diffusion process on a dynamic directed graph. The Gaussian Mixture Model layer is implemented to consider the multimodal nature of the real-time data while learning from multiple related time series. We show that our model, when trained with the best combination of dynamic covariate features and mixture components, can outperform both traditional statistical and deep learning models in forecasting the number of Covid-19 deaths and cases at the county level in the United States.
LGJul 21, 2021
Bridging the Gap between Spatial and Spectral Domains: A Unified Framework for Graph Neural NetworksZhiqian Chen, Fanglan Chen, Lei Zhang et al.
Deep learning's performance has been extensively recognized recently. Graph neural networks (GNNs) are designed to deal with graph-structural data that classical deep learning does not easily manage. Since most GNNs were created using distinct theories, direct comparisons are impossible. Prior research has primarily concentrated on categorizing existing models, with little attention paid to their intrinsic connections. The purpose of this study is to establish a unified framework that integrates GNNs based on spectral graph and approximation theory. The framework incorporates a strong integration between spatial- and spectral-based GNNs while tightly associating approaches that exist within each respective domain.
LGFeb 27, 2020
Bridging the Gap between Spatial and Spectral Domains: A Survey on Graph Neural NetworksZhiqian Chen, Fanglan Chen, Lei Zhang et al.
Deep learning's success has been widely recognized in a variety of machine learning tasks, including image classification, audio recognition, and natural language processing. As an extension of deep learning beyond these domains, graph neural networks (GNNs) are designed to handle the non-Euclidean graph-structure which is intractable to previous deep learning techniques. Existing GNNs are presented using various techniques, making direct comparison and cross-reference more complex. Although existing studies categorize GNNs into spatial-based and spectral-based techniques, there hasn't been a thorough examination of their relationship. To close this gap, this study presents a single framework that systematically incorporates most GNNs. We organize existing GNNs into spatial and spectral domains, as well as expose the connections within each domain. A review of spectral graph theory and approximation theory builds a strong relationship across the spatial and spectral domains in further investigation.
DLMay 22, 2019
Patent Citation Dynamics Modeling via Multi-Attention Recurrent NetworksTaoran Ji, Zhiqian Chen, Nathan Self et al.
Modeling and forecasting forward citations to a patent is a central task for the discovery of emerging technologies and for measuring the pulse of inventive progress. Conventional methods for forecasting these forward citations cast the problem as analysis of temporal point processes which rely on the conditional intensity of previously received citations. Recent approaches model the conditional intensity as a chain of recurrent neural networks to capture memory dependency in hopes of reducing the restrictions of the parametric form of the intensity function. For the problem of patent citations, we observe that forecasting a patent's chain of citations benefits from not only the patent's history itself but also from the historical citations of assignees and inventors associated with that patent. In this paper, we propose a sequence-to-sequence model which employs an attention-of-attention mechanism to capture the dependencies of these multiple time sequences. Furthermore, the proposed model is able to forecast both the timestamp and the category of a patent's next citation. Extensive experiments on a large patent citation dataset collected from USPTO demonstrate that the proposed model outperforms state-of-the-art models at forward citation forecasting.
CRFeb 14, 2019
Estimating the Circuit Deobfuscating Runtime based on Graph Deep LearningZhiqian Chen, Gaurav Kolhe, Setareh Rafatirad et al.
Circuit obfuscation is a recently proposed defense mechanism to protect digital integrated circuits (ICs) from reverse engineering by using camouflaged gates i.e., logic gates whose functionality cannot be precisely determined by the attacker. There have been effective schemes such as satisfiability-checking (SAT)-based attacks that can potentially decrypt obfuscated circuits, called deobfuscation. Deobfuscation runtime could have a large span ranging from few milliseconds to thousands of years or more, depending on the number and layouts of the ICs and camouflaged gates. And hence accurately pre-estimating the deobfuscation runtime is highly crucial for the defenders to maximize it and optimize their defense. However, estimating the deobfuscation runtime is a challenging task due to 1) the complexity and heterogeneity of graph-structured circuit, 2) the unknown and sophisticated mechanisms of the attackers for deobfuscation. To address the above mentioned challenges, this work proposes the first machine-learning framework that predicts the deobfuscation runtime based on graph deep learning techniques. Specifically, we design a new model, ICNet with new input and convolution layers to characterize and extract graph frequencies from ICs, which are then integrated by heterogeneous deep fully-connected layers to obtain final output. ICNet is an end-to-end framework which can automatically extract the determinant features for deobfuscation runtime. Extensive experiments demonstrate its effectiveness and efficiency.
LGAug 30, 2018
Rational Neural Networks for Approximating Jump Discontinuities of Graph Convolution OperatorZhiqian Chen, Feng Chen, Rongjie Lai et al.
For node level graph encoding, a recent important state-of-art method is the graph convolutional networks (GCN), which nicely integrate local vertex features and graph topology in the spectral domain. However, current studies suffer from several drawbacks: (1) graph CNNs relies on Chebyshev polynomial approximation which results in oscillatory approximation at jump discontinuities; (2) Increasing the order of Chebyshev polynomial can reduce the oscillations issue, but also incurs unaffordable computational cost; (3) Chebyshev polynomials require degree $Ω$(poly(1/$ε$)) to approximate a jump signal such as $|x|$, while rational function only needs $\mathcal{O}$(poly log(1/$ε$))\cite{liang2016deep,telgarsky2017neural}. However, it's non-trivial to apply rational approximation without increasing computational complexity due to the denominator. In this paper, the superiority of rational approximation is exploited for graph signal recovering. RatioanlNet is proposed to integrate rational function and neural networks. We show that rational function of eigenvalues can be rewritten as a function of graph Laplacian, which can avoid multiplication by the eigenvector matrix. Focusing on the analysis of approximation on graph convolution operation, a graph signal regression task is formulated. Under graph signal regression task, its time complexity can be significantly reduced by graph Fourier transform. To overcome the local minimum problem of neural networks model, a relaxed Remez algorithm is utilized to initialize the weight parameters. Convergence rate of RatioanlNet and polynomial based methods on jump signal is analyzed for a theoretical guarantee. The extensive experimental results demonstrated that our approach could effectively characterize the jump discontinuities, outperforming competing methods by a substantial margin on both synthetic and real-world graphs.
LGJul 6, 2018
Distributed Self-Paced Learning in Alternating Direction Method of MultipliersXuchao Zhang, Liang Zhao, Zhiqian Chen et al.
Self-paced learning (SPL) mimics the cognitive process of humans, who generally learn from easy samples to hard ones. One key issue in SPL is the training process required for each instance weight depends on the other samples and thus cannot easily be run in a distributed manner in a large-scale dataset. In this paper, we reformulate the self-paced learning problem into a distributed setting and propose a novel Distributed Self-Paced Learning method (DSPL) to handle large-scale datasets. Specifically, both the model and instance weights can be optimized in parallel for each batch based on a consensus alternating direction method of multipliers. We also prove the convergence of our algorithm under mild conditions. Extensive experiments on both synthetic and real datasets demonstrate that our approach is superior to those of existing methods.
LGDec 5, 2017
Learning to Fuse Music Genres with Generative Adversarial Dual LearningZhiqian Chen, Chih-Wei Wu, Yen-Cheng Lu et al.
FusionGAN is a novel genre fusion framework for music generation that integrates the strengths of generative adversarial networks and dual learning. In particular, the proposed method offers a dual learning extension that can effectively integrate the styles of the given domains. To efficiently quantify the difference among diverse domains and avoid the vanishing gradient issue, FusionGAN provides a Wasserstein based metric to approximate the distance between the target domain and the existing domains. Adopting the Wasserstein distance, a new domain is created by combining the patterns of the existing domains using adversarial learning. Experimental results on public music datasets demonstrated that our approach could effectively merge two genres.
AIDec 5, 2017
Multimodal Storytelling via Generative Adversarial Imitation LearningZhiqian Chen, Xuchao Zhang, Arnold P. Boedihardjo et al.
Deriving event storylines is an effective summarization method to succinctly organize extensive information, which can significantly alleviate the pain of information overload. The critical challenge is the lack of widely recognized definition of storyline metric. Prior studies have developed various approaches based on different assumptions about users' interests. These works can extract interesting patterns, but their assumptions do not guarantee that the derived patterns will match users' preference. On the other hand, their exclusiveness of single modality source misses cross-modality information. This paper proposes a method, multimodal imitation learning via generative adversarial networks(MIL-GAN), to directly model users' interests as reflected by various data. In particular, the proposed model addresses the critical challenge by imitating users' demonstrated storylines. Our proposed model is designed to learn the reward patterns given user-provided storylines and then applies the learned policy to unseen data. The proposed approach is demonstrated to be capable of acquiring the user's implicit intent and outperforming competing methods by a substantial margin with a user study.