36.7LGMar 15
From Specification to Architecture: A Theory Compiler for Knowledge-Guided Machine LearningAsela Hevapathige, Yu Xia, Sachith Seneviratne et al.
Theory-guided machine learning has demonstrated that including authentic domain knowledge directly into model design improves performance, sample efficiency and out-of-distribution generalisation. Yet the process by which a formal domain theory is translated into architectural constraints remains entirely manual, specific to each domain formalism, and devoid of any formal correctness guarantee. This translation is non-transferable between domains, not verified, and does not scale. We propose the Theory Compiler: a system that accepts a typed, machine-readable domain theory as input and automatically produces an architecture whose function space is provably constrained to be consistent with that theory by construction, not by regularisation. We identify three foundational open problems whose resolution defines our research agenda: (1) designing a universal theory formalisation language with decidable type-checking; (2) constructing a compositionally correct compilation algorithm from theory primitives to architectural modules; and (3) establishing soundness and completeness criteria for formal verification. We further conjecture that compiled architectures match or exceed manually-designed counterparts in generalisation performance while requiring substantially less training data, a claim we ground in classical statistical learning theory. We argue that recent advances in formal machine learning theory, large language models, and the growth of an interdisciplinary research community have made this paradigm achievable for the first time.
LGMar 2
Invariant-Stratified Propagation for Expressive Graph Neural NetworksAsela Hevapathige, Ahad N. Zehmakan, Asiri Wijesinghe et al.
Graph Neural Networks (GNNs) face fundamental limitations in expressivity and capturing structural heterogeneity. Standard message-passing architectures are constrained by the 1-dimensional Weisfeiler-Leman (1-WL) test, unable to distinguish graphs beyond degree sequences, and aggregate information uniformly from neighbors, failing to capture how nodes occupy different structural positions within higher-order patterns. While methods exist to achieve higher expressivity, they incur prohibitive computational costs and lack unified frameworks for flexibly encoding diverse structural properties. To address these limitations, we introduce Invariant-Stratified Propagation (ISP), a framework comprising both a novel WL variant (ISP-WL) and its efficient neural network implementation (ISPGNN). ISP stratifies nodes according to graph invariants, processing them in hierarchical strata that reveal structural distinctions invisible to 1-WL. Through hierarchical structural heterogeneity encoding, ISP quantifies differences in nodes' structural positions within higher-order patterns, distinguishing interactions where participants occupy different roles from those with uniform participation. We provide formal theoretical analysis establishing enhanced expressivity beyond 1-WL, convergence guarantees, and inherent resistance to oversmoothing. Extensive experiments across graph classification, node classification, and influence estimation demonstrate consistent improvements over standard architectures and state-of-the-art expressive baselines.
LGNov 10, 2025
Beyond Fixed Depth: Adaptive Graph Neural Networks for Node Classification Under Varying HomophilyAsela Hevapathige, Asiri Wijesinghe, Ahad N. Zehmakan
Graph Neural Networks (GNNs) have achieved significant success in addressing node classification tasks. However, the effectiveness of traditional GNNs degrades on heterophilic graphs, where connected nodes often belong to different labels or properties. While recent work has introduced mechanisms to improve GNN performance under heterophily, certain key limitations still exist. Most existing models apply a fixed aggregation depth across all nodes, overlooking the fact that nodes may require different propagation depths based on their local homophily levels and neighborhood structures. Moreover, many methods are tailored to either homophilic or heterophilic settings, lacking the flexibility to generalize across both regimes. To address these challenges, we develop a theoretical framework that links local structural and label characteristics to information propagation dynamics at the node level. Our analysis shows that optimal aggregation depth varies across nodes and is critical for preserving class-discriminative information. Guided by this insight, we propose a novel adaptive-depth GNN architecture that dynamically selects node-specific aggregation depths using theoretically grounded metrics. Our method seamlessly adapts to both homophilic and heterophilic patterns within a unified model. Extensive experiments demonstrate that our approach consistently enhances the performance of standard GNN backbones across diverse benchmarks.
CLFeb 16
Beyond Translation: Evaluating Mathematical Reasoning Capabilities of LLMs in Sinhala and TamilSukumar Kishanthan, Kumar Thushalika, Buddhi Jayasekara et al.
Large language models (LLMs) demonstrate strong mathematical reasoning in English, but whether these capabilities reflect genuine multilingual reasoning or reliance on translation-based processing in low-resource languages like Sinhala and Tamil remains unclear. We examine this fundamental question by evaluating whether LLMs genuinely reason mathematically in these languages or depend on implicit translation to English-like representations. Using a taxonomy of six math problem types, from basic arithmetic to complex unit conflict and optimization problems, we evaluate four prominent large language models. To avoid translation artifacts that confound language ability with translation quality, we construct a parallel dataset where each problem is natively authored by fluent speakers with mathematical training in all three languages. Our analysis demonstrates that while basic arithmetic reasoning transfers robustly across languages, complex reasoning tasks show significant degradation in Tamil and Sinhala. The pattern of failures varies by model and problem type, suggesting that apparent multilingual competence may not reflect uniform reasoning capabilities across languages. These findings challenge the common assumption that models exhibiting strong multilingual performance can reason equally effectively across languages, and highlight the need for fine-grained, type-aware evaluation in multilingual settings.
LGDec 23, 2025
Orthogonal Activation with Implicit Group-Aware Bias Learning for Class ImbalanceSukumar Kishanthan, Asela Hevapathige
Class imbalance is a common challenge in machine learning and data mining, often leading to suboptimal performance in classifiers. While deep learning excels in feature extraction, its performance still deteriorates under imbalanced data. In this work, we propose a novel activation function, named OGAB, designed to alleviate class imbalance in deep learning classifiers. OGAB incorporates orthogonality and group-aware bias learning to enhance feature distinguishability in imbalanced scenarios without explicitly requiring label information. Our key insight is that activation functions can be used to introduce strong inductive biases that can address complex data challenges beyond traditional non-linearity. Our work demonstrates that orthogonal transformations can preserve information about minority classes by maintaining feature independence, thereby preventing the dominance of majority classes in the embedding space. Further, the proposed group-aware bias mechanism automatically identifies data clusters and adjusts embeddings to enhance class separability without the need for explicit supervision. Unlike existing approaches that address class imbalance through preprocessing data modifications or post-processing corrections, our proposed approach tackles class imbalance during the training phase at the embedding learning level, enabling direct integration with the learning process. We demonstrate the effectiveness of our solution on both real-world and synthetic imbalanced datasets, showing consistent performance improvements over both traditional and learnable activation functions.
LGFeb 8, 2025
Deep Learning Meets Oversampling: A Learning Framework to Handle Imbalanced ClassificationSukumar Kishanthan, Asela Hevapathige
Despite extensive research spanning several decades, class imbalance is still considered a profound difficulty for both machine learning and deep learning models. While data oversampling is the foremost technique to address this issue, traditional sampling techniques are often decoupled from the training phase of the predictive model, resulting in suboptimal representations. To address this, we propose a novel learning framework that can generate synthetic data instances in a data-driven manner. The proposed framework formulates the oversampling process as a composition of discrete decision criteria, thereby enhancing the representation power of the model's learning process. Extensive experiments on the imbalanced classification task demonstrate the superiority of our framework over state-of-the-art algorithms.
LGMar 3, 2025
Depth-Adaptive Graph Neural Networks via Learnable Bakry-'Emery CurvatureAsela Hevapathige, Ahad N. Zehmakan, Qing Wang
Graph Neural Networks (GNNs) have demonstrated strong representation learning capabilities for graph-based tasks. Recent advances on GNNs leverage geometric properties, such as curvature, to enhance its representation capabilities by modeling complex connectivity patterns and information flow within graphs. However, most existing approaches focus solely on discrete graph topology, overlooking diffusion dynamics and task-specific dependencies essential for effective learning. To address this, we propose integrating Bakry-Émery curvature, which captures both structural and task-driven aspects of information propagation. We develop an efficient, learnable approximation strategy, making curvature computation scalable for large graphs. Furthermore, we introduce an adaptive depth mechanism that dynamically adjusts message-passing layers per vertex based on its curvature, ensuring efficient propagation. Our theoretical analysis establishes a link between curvature and feature distinctiveness, showing that high-curvature vertices require fewer layers, while low-curvature ones benefit from deeper propagation. Extensive experiments on benchmark datasets validate the effectiveness of our approach, showing consistent performance improvements across diverse graph learning tasks.
LGAug 15, 2025
Graph Neural Diffusion via Generalized Opinion DynamicsAsela Hevapathige, Asiri Wijesinghe, Ahad N. Zehmakan
There has been a growing interest in developing diffusion-based Graph Neural Networks (GNNs), building on the connections between message passing mechanisms in GNNs and physical diffusion processes. However, existing methods suffer from three critical limitations: (1) they rely on homogeneous diffusion with static dynamics, limiting adaptability to diverse graph structures; (2) their depth is constrained by computational overhead and diminishing interpretability; and (3) theoretical understanding of their convergence behavior remains limited. To address these challenges, we propose GODNF, a Generalized Opinion Dynamics Neural Framework, which unifies multiple opinion dynamics models into a principled, trainable diffusion mechanism. Our framework captures heterogeneous diffusion patterns and temporal dynamics via node-specific behavior modeling and dynamic neighborhood influence, while ensuring efficient and interpretable message propagation even at deep layers. We provide a rigorous theoretical analysis demonstrating GODNF's ability to model diverse convergence configurations. Extensive empirical evaluations of node classification and influence estimation tasks confirm GODNF's superiority over state-of-the-art GNNs.
LGDec 16, 2024
DeepSN: A Sheaf Neural Framework for Influence MaximizationAsela Hevapathige, Qing Wang, Ahad N. Zehmakan
Influence maximization is key topic in data mining, with broad applications in social network analysis and viral marketing. In recent years, researchers have increasingly turned to machine learning techniques to address this problem. They have developed methods to learn the underlying diffusion processes in a data-driven manner, which enhances the generalizability of the solution, and have designed optimization objectives to identify the optimal seed set. Nonetheless, two fundamental gaps remain unsolved: (1) Graph Neural Networks (GNNs) are increasingly used to learn diffusion models, but in their traditional form, they often fail to capture the complex dynamics of influence diffusion, (2) Designing optimization objectives is challenging due to combinatorial explosion when solving this problem. To address these challenges, we propose a novel framework, DeepSN. Our framework employs sheaf neural diffusion to learn diverse influence patterns in a data-driven, end-to-end manner, providing enhanced separability in capturing diffusion characteristics. We also propose an optimization technique that accounts for overlapping influence between vertices, which helps to reduce the search space and identify the optimal seed set effectively and efficiently. Finally, we conduct extensive experiments on both synthetic and real-world datasets to demonstrate the effectiveness of our framework.
LGDec 14, 2023
Permutation-Invariant Graph Partitioning:How Graph Neural Networks Capture Structural Interactions?Asela Hevapathige, Qing Wang
Graph Neural Networks (GNNs) have paved the way for being a cornerstone in graph-related learning tasks. Yet, the ability of GNNs to capture structural interactions within graphs remains under-explored. In this work, we address this gap by drawing on the insight that permutation invariant graph partitioning enables a powerful way of exploring structural interactions. We establish theoretical connections between permutation invariant graph partitioning and graph isomorphism, and then propose Graph Partitioning Neural Networks (GPNNs), a novel architecture that efficiently enhances the expressive power of GNNs in learning structural interactions. We analyze how partitioning schemes and structural interactions contribute to GNN expressivity and their trade-offs with complexity. Empirically, we demonstrate that GPNNs outperform existing GNN models in capturing structural interactions across diverse graph benchmark tasks.
LGSep 23, 2025
Graph Neural Networks with Similarity-Navigated Probabilistic Feature CopyingAsela Hevapathige
Graph Neural Networks (GNNs) have demonstrated remarkable success across various graph-based tasks. However, they face some fundamental limitations: feature oversmoothing can cause node representations to become indistinguishable in deeper networks, they struggle to effectively manage heterogeneous relationships where connected nodes differ significantly, and they process entire feature vectors as indivisible units, which limits flexibility. We seek to address these limitations. We propose AxelGNN, a novel GNN architecture inspired by Axelrod's cultural dissemination model that addresses these limitations through a unified framework. AxelGNN incorporates similarity-gated probabilistic interactions that adaptively promote convergence or divergence based on node similarity, implements trait-level copying mechanisms for fine-grained feature aggregation at the segment level, and maintains global polarization to preserve node distinctiveness across multiple representation clusters. The model's bistable convergence dynamics naturally handle both homophilic and heterophilic graphs within a single architecture. Extensive experiments on node classification and influence estimation benchmarks demonstrate that AxelGNN consistently outperforms or matches state-of-the-art GNN methods across diverse graph structures with varying homophily-heterophily characteristics.
LGSep 8, 2025
AxelSMOTE: An Agent-Based Oversampling Algorithm for Imbalanced ClassificationSukumar Kishanthan, Asela Hevapathige
Class imbalance in machine learning poses a significant challenge, as skewed datasets often hinder performance on minority classes. Traditional oversampling techniques, which are commonly used to alleviate class imbalance, have several drawbacks: they treat features independently, lack similarity-based controls, limit sample diversity, and fail to manage synthetic variety effectively. To overcome these issues, we introduce AxelSMOTE, an innovative agent-based approach that views data instances as autonomous agents engaging in complex interactions. Based on Axelrod's cultural dissemination model, AxelSMOTE implements four key innovations: (1) trait-based feature grouping to preserve correlations; (2) a similarity-based probabilistic exchange mechanism for meaningful interactions; (3) Beta distribution blending for realistic interpolation; and (4) controlled diversity injection to avoid overfitting. Experiments on eight imbalanced datasets demonstrate that AxelSMOTE outperforms state-of-the-art sampling methods while maintaining computational efficiency.