LGMar 19, 2023
An Efficient Subgraph GNN with Provable Substructure Counting PowerZuoyu Yan, Junru Zhou, Liangcai Gao et al. · pku
We investigate the enhancement of graph neural networks' (GNNs) representation power through their ability in substructure counting. Recent advances have seen the adoption of subgraph GNNs, which partition an input graph into numerous subgraphs, subsequently applying GNNs to each to augment the graph's overall representation. Despite their ability to identify various substructures, subgraph GNNs are hindered by significant computational and memory costs. In this paper, we tackle a critical question: Is it possible for GNNs to count substructures both \textbf{efficiently} and \textbf{provably}? Our approach begins with a theoretical demonstration that the distance to rooted nodes in subgraphs is key to boosting the counting power of subgraph GNNs. To avoid the need for repetitively applying GNN across all subgraphs, we introduce precomputed structural embeddings that encapsulate this crucial distance information. Experiments validate that our proposed model retains the counting power of subgraph GNNs while achieving significantly faster performance.
LGSep 10, 2023
Distance-Restricted Folklore Weisfeiler-Leman GNNs with Provable Cycle Counting PowerJunru Zhou, Jiarui Feng, Xiyuan Wang et al.
The ability of graph neural networks (GNNs) to count certain graph substructures, especially cycles, is important for the success of GNNs on a wide range of tasks. It has been recently used as a popular metric for evaluating the expressive power of GNNs. Many of the proposed GNN models with provable cycle counting power are based on subgraph GNNs, i.e., extracting a bag of subgraphs from the input graph, generating representations for each subgraph, and using them to augment the representation of the input graph. However, those methods require heavy preprocessing, and suffer from high time and memory costs. In this paper, we overcome the aforementioned limitations of subgraph GNNs by proposing a novel class of GNNs -- $d$-Distance-Restricted FWL(2) GNNs, or $d$-DRFWL(2) GNNs. $d$-DRFWL(2) GNNs use node pairs whose mutual distances are at most $d$ as the units for message passing to balance the expressive power and complexity. By performing message passing among distance-restricted node pairs in the original graph, $d$-DRFWL(2) GNNs avoid the expensive subgraph extraction operations in subgraph GNNs, making both the time and space complexity lower. We theoretically show that the discriminative power of $d$-DRFWL(2) GNNs strictly increases as $d$ increases. More importantly, $d$-DRFWL(2) GNNs have provably strong cycle counting power even with $d=2$: they can count all 3, 4, 5, 6-cycles. Since 6-cycles (e.g., benzene rings) are ubiquitous in organic molecules, being able to detect and count them is crucial for achieving robust and generalizable performance on molecular tasks. Experiments on both synthetic datasets and molecular datasets verify our theory. To the best of our knowledge, our model is the most efficient GNN model to date (both theoretically and empirically) that can count up to 6-cycles.
LGDec 10, 2025
Branching Strategies Based on Subgraph GNNs: A Study on Theoretical Promise versus Practical RealityJunru Zhou, Yicheng Wang, Pan Li
Graph Neural Networks (GNNs) have emerged as a promising approach for ``learning to branch'' in Mixed-Integer Linear Programming (MILP). While standard Message-Passing GNNs (MPNNs) are efficient, they theoretically lack the expressive power to fully represent MILP structures. Conversely, higher-order GNNs (like 2-FGNNs) are expressive but computationally prohibitive. In this work, we investigate Subgraph GNNs as a theoretical middle ground. Crucially, while previous work [Chen et al., 2025] demonstrated that GNNs with 3-WL expressive power can approximate Strong Branching, we prove a sharper result: node-anchored Subgraph GNNs whose expressive power is strictly lower than 3-WL [Zhang et al., 2023] are sufficient to approximate Strong Branching scores. However, our extensive empirical evaluation on four benchmark datasets reveals a stark contrast between theory and practice. While node-anchored Subgraph GNNs theoretically offer superior branching decisions, their $O(n)$ complexity overhead results in significant memory bottlenecks and slower solving times than MPNNs and heuristics. Our results indicate that for MILP branching, the computational cost of expressive GNNs currently outweighs their gains in decision quality, suggesting that future research must focus on efficiency-preserving expressivity.
LGOct 13, 2024
Towards Stable, Globally Expressive Graph Representations with Laplacian EigenvectorsJunru Zhou, Cai Zhou, Xiyuan Wang et al. · mit
Graph neural networks (GNNs) have achieved remarkable success in a variety of machine learning tasks over graph data. Existing GNNs usually rely on message passing, i.e., computing node representations by gathering information from the neighborhood, to build their underlying computational graphs. They are known fairly limited in expressive power, and often fail to capture global characteristics of graphs. To overcome the issue, a popular solution is to use Laplacian eigenvectors as additional node features, as they contain global positional information of nodes, and can serve as extra node identifiers aiding GNNs to separate structurally similar nodes. For such an approach, properly handling the orthogonal group symmetry among eigenvectors with equal eigenvalue is crucial for its stability and generalizability. However, using a naive orthogonal group invariant encoder for each separate eigenspace may not keep the full expressivity in the Laplacian eigenvectors. Moreover, computing such invariants inevitably entails a hard split of Laplacian eigenvalues according to their numerical identity, which suffers from great instability when the graph structure is perturbed. In this paper, we propose a novel method exploiting Laplacian eigenvectors to generate stable and globally expressive graph representations. The main difference from previous works is that (i) our method utilizes learnable orthogonal group invariant representations for each Laplacian eigenspace, based upon powerful orthogonal group equivariant neural network layers already well studied in the literature, and that (ii) our method deals with numerically close eigenvalues in a smooth fashion, ensuring its better robustness against perturbations. Experiments on various graph learning benchmarks witness the competitive performance of our method, especially its great potential to learn global properties of graphs.
CLMar 31, 2025
CrossFormer: Cross-Segment Semantic Fusion for Document SegmentationTongke Ni, Yang Fan, Junru Zhou et al.
Text semantic segmentation involves partitioning a document into multiple paragraphs with continuous semantics based on the subject matter, contextual information, and document structure. Traditional approaches have typically relied on preprocessing documents into segments to address input length constraints, resulting in the loss of critical semantic information across segments. To address this, we present CrossFormer, a transformer-based model featuring a novel cross-segment fusion module that dynamically models latent semantic dependencies across document segments, substantially elevating segmentation accuracy. Additionally, CrossFormer can replace rule-based chunk methods within the Retrieval-Augmented Generation (RAG) system, producing more semantically coherent chunks that enhance its efficacy. Comprehensive evaluations confirm CrossFormer's state-of-the-art performance on public text semantic segmentation datasets, alongside considerable gains on RAG benchmarks.
CLMay 20, 2021
Head-driven Phrase Structure Parsing in O($n^3$) Time ComplexityZuchao Li, Junru Zhou, Hai Zhao et al.
Constituent and dependency parsing, the two classic forms of syntactic parsing, have been found to benefit from joint training and decoding under a uniform formalism, Head-driven Phrase Structure Grammar (HPSG). However, decoding this unified grammar has a higher time complexity ($O(n^5)$) than decoding either form individually ($O(n^3)$) since more factors have to be considered during decoding. We thus propose an improved head scorer that helps achieve a novel performance-preserved parser in $O$($n^3$) time complexity. Furthermore, on the basis of this proposed practical HPSG parser, we investigated the strengths of HPSG-based parsing and explored the general method of training an HPSG-based parser from only a constituent or dependency annotations in a multilingual scenario. We thus present a more effective, more in-depth, and general work on HPSG parsing.
CLDec 27, 2020
SG-Net: Syntax Guided Transformer for Language RepresentationZhuosheng Zhang, Yuwei Wu, Junru Zhou et al.
Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity of effectively modeling the linguistic knowledge from the detail-riddled and lengthy texts and getting rid of the noises is essential to improve its performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into attention mechanisms for better linguistically motivated word representations. In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. Syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistics inspired representation. The proposed SG-Net is applied to typical Transformer encoders. Extensive experiments on popular benchmark tasks, including machine reading comprehension, natural language inference, and neural machine translation show the effectiveness of the proposed SG-Net design.
CLApr 28, 2020
Semantics-Aware Inferential Network for Natural Language UnderstandingShuailiang Zhang, Hai Zhao, Junru Zhou
For natural language understanding tasks, either machine reading comprehension or natural language inference, both semantics-aware and inference are favorable features of the concerned modeling for better understanding performance. Thus we propose a Semantics-Aware Inferential Network (SAIN) to meet such a motivation. Taking explicit contextualized semantics as a complementary input, the inferential module of SAIN enables a series of reasoning steps over semantic clues through an attention mechanism. By stringing these steps, the inferential network effectively learns to perform iterative reasoning which incorporates both explicit semantics and contextualized representations. In terms of well pre-trained language models as front-end encoder, our model achieves significant improvement on 11 tasks including machine reading comprehension and natural language inference.
CLNov 7, 2019
Dependency and Span, Cross-Style Semantic Role Labeling on PropBank and NomBankZuchao Li, Hai Zhao, Junru Zhou et al.
The latest developments in neural semantic role labeling (SRL) have shown great performance improvements with both the dependency and span formalisms/styles. Although the two styles share many similarities in linguistic meaning and computation, most previous studies focus on a single style. In this paper, we define a new cross-style semantic role label convention and propose a new cross-style joint optimization model designed around the most basic linguistic meaning of a semantic role, providing a solution to make the results of the two styles more comparable and allowing both formalisms of SRL to benefit from their natural connections in both linguistics and computation. Our model learns a general semantic argument structure and is capable of outputting in either style. Additionally, we propose a syntax-aided method to uniformly enhance the learning of both dependency and span representations. Experiments show that the proposed methods are effective on both span and dependency SRL benchmarks.
CLOct 31, 2019
LIMIT-BERT : Linguistic Informed Multi-Task BERTJunru Zhou, Zhuosheng Zhang, Hai Zhao et al.
In this paper, we present a Linguistic Informed Multi-Task BERT (LIMIT-BERT) for learning language representations across multiple linguistic tasks by Multi-Task Learning (MTL). LIMIT-BERT includes five key linguistic syntax and semantics tasks: Part-Of-Speech (POS) tags, constituent and dependency syntactic parsing, span and dependency semantic role labeling (SRL). Besides, LIMIT-BERT adopts linguistics mask strategy: Syntactic and Semantic Phrase Masking which mask all of the tokens corresponding to a syntactic/semantic phrase. Different from recent Multi-Task Deep Neural Networks (MT-DNN) (Liu et al., 2019), our LIMIT-BERT is linguistically motivated and learning in a semi-supervised method which provides large amounts of linguistic-task data as same as BERT learning corpus. As a result, LIMIT-BERT not only improves linguistic tasks performance but also benefits from a regularization effect and linguistic information that leads to more general representations to help adapt to new tasks and domains. LIMIT-BERT obtains new state-of-the-art or competitive results on both span and dependency semantic parsing on Propbank benchmarks and both dependency and constituent syntactic parsing on Penn Treebank.
CLAug 30, 2019
Parsing All: Syntax and Semantics, Dependencies and SpansJunru Zhou, Zuchao Li, Hai Zhao
Both syntactic and semantic structures are key linguistic contextual clues, in which parsing the latter has been well shown beneficial from parsing the former. However, few works ever made an attempt to let semantic parsing help syntactic parsing. As linguistic representation formalisms, both syntax and semantics may be represented in either span (constituent/phrase) or dependency, on both of which joint learning was also seldom explored. In this paper, we propose a novel joint model of syntactic and semantic parsing on both span and dependency representations, which incorporates syntactic information effectively in the encoder of neural network and benefits from two representation formalisms in a uniform way. The experiments show that semantics and syntax can benefit each other by optimizing joint objectives. Our single model achieves new state-of-the-art or competitive results on both span and dependency semantic parsing on Propbank benchmarks and both dependency and constituent syntactic parsing on Penn Treebank.
CLAug 18, 2019
Concurrent Parsing of Constituency and DependencyJunru Zhou, Shuailiang Zhang, Hai Zhao
Constituent and dependency representation for syntactic structure share a lot of linguistic and computational characteristics, this paper thus makes the first attempt by introducing a new model that is capable of parsing constituent and dependency at the same time, so that lets either of the parsers enhance each other. Especially, we evaluate the effect of different shared network components and empirically verify that dependency parsing may be much more beneficial from constituent parsing structure. The proposed parser achieves new state-of-the-art performance for both parsing tasks, constituent and dependency on PTB and CTB benchmarks.
CLAug 14, 2019
SG-Net: Syntax-Guided Machine Reading ComprehensionZhuosheng Zhang, Yuwei Wu, Junru Zhou et al.
For machine reading comprehension, the capacity of effectively modeling the linguistic knowledge from the detail-riddled and lengthy passages and getting ride of the noises is essential to improve its performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into attention mechanism for better linguistically motivated word representations. In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. Syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistics inspired representation. To verify its effectiveness, the proposed SG-Net is applied to typical pre-trained language model BERT which is right based on a Transformer encoder. Extensive experiments on popular benchmarks including SQuAD 2.0 and RACE show that the proposed SG-Net design helps achieve substantial performance improvement over strong baselines.
CLJul 5, 2019
Head-Driven Phrase Structure Grammar Parsing on Penn TreebankJunru Zhou, Hai Zhao
Head-driven phrase structure grammar (HPSG) enjoys a uniform formalism representing rich contextual syntactic and even semantic meanings. This paper makes the first attempt to formulate a simplified HPSG by integrating constituent and dependency formal representations into head-driven phrase structure. Then two parsing algorithms are respectively proposed for two converted tree representations, division span and joint span. As HPSG encodes both constituent and dependency structure information, the proposed HPSG parsers may be regarded as a sort of joint decoder for both types of structures and thus are evaluated in terms of extracted or converted constituent and dependency parsing trees. Our parser achieves new state-of-the-art performance for both parsing tasks on Penn Treebank (PTB) and Chinese Penn Treebank, verifying the effectiveness of joint learning constituent and dependency structures. In details, we report 96.33 F1 of constituent parsing and 97.20\% UAS of dependency parsing on PTB.