CLJul 12, 2024
A Survey on Symbolic Knowledge Distillation of Large Language ModelsKamal Acharya, Alvaro Velasquez, Houbing Herbert Song
This survey paper delves into the emerging and critical area of symbolic knowledge distillation in Large Language Models (LLMs). As LLMs like Generative Pre-trained Transformer-3 (GPT-3) and Bidirectional Encoder Representations from Transformers (BERT) continue to expand in scale and complexity, the challenge of effectively harnessing their extensive knowledge becomes paramount. This survey concentrates on the process of distilling the intricate, often implicit knowledge contained within these models into a more symbolic, explicit form. This transformation is crucial for enhancing the interpretability, efficiency, and applicability of LLMs. We categorize the existing research based on methodologies and applications, focusing on how symbolic knowledge distillation can be used to improve the transparency and functionality of smaller, more efficient Artificial Intelligence (AI) models. The survey discusses the core challenges, including maintaining the depth of knowledge in a comprehensible format, and explores the various approaches and techniques that have been developed in this field. We identify gaps in current research and potential opportunities for future advancements. This survey aims to provide a comprehensive overview of symbolic knowledge distillation in LLMs, spotlighting its significance in the progression towards more accessible and efficient AI systems.
CRApr 26Code
CyberCane: Neuro-Symbolic RAG for Privacy-Preserving Phishing Detection with Formal Ontology ReasoningSafayat Bin Hakim, Aniqa Afzal, Qi Zhao et al.
Privacy-critical domains require phishing detection systems that satisfy contradictory constraints: near-zero false positives to prevent workflow disruption, transparent explanations for non-expert staff, strict regulatory compliance prohibiting sensitive data exposure to external APIs, and robustness against AI-generated attacks. Existing rule-based systems are brittle to novel campaigns, while LLM-based detectors violate privacy regulations through unredacted data transmission. We introduce CyberCane, a neuro-symbolic framework integrating deterministic symbolic analysis with privacy-preserving retrieval-augmented generation (RAG). Our dual-phase pipeline applies lightweight symbolic rules to email metadata, then escalates borderline cases to semantic classification via RAG with automated sensitive data redaction and retrieval from a phishing-only corpus. We further introduce PhishOnt, an OWL ontology enabling verifiable attack classification through formal reasoning chains. Evaluation on DataPhish2025 (12.3k emails; mixed human/LLM) and Nazario/SpamAssassin demonstrates a 78.6-point recall gain over symbolic-only detection on AI-generated threats, with precision exceeding 98% and FPR as low as 0.16%. Healthcare deployment projects a 542x ROI; tunable operating points support diverse risk tolerances, with open-source implementation at https://github.com/sbhakim/Cybercane.
AIJan 17, 2025Code
Topology-Driven Attribute Recovery for Attribute Missing Graph Learning in Social Internet of ThingsMengran Li, Junzhou Chen, Chenyun Yu et al.
With the advancement of information technology, the Social Internet of Things (SIoT) has fostered the integration of physical devices and social networks, deepening the study of complex interaction patterns. Text Attribute Graphs (TAGs) capture both topological structures and semantic attributes, enhancing the analysis of complex interactions within the SIoT. However, existing graph learning methods are typically designed for complete attributed graphs, and the common issue of missing attributes in Attribute Missing Graphs (AMGs) increases the difficulty of analysis tasks. To address this, we propose the Topology-Driven Attribute Recovery (TDAR) framework, which leverages topological data for AMG learning. TDAR introduces an improved pre-filling method for initial attribute recovery using native graph topology. Additionally, it dynamically adjusts propagation weights and incorporates homogeneity strategies within the embedding space to suit AMGs' unique topological structures, effectively reducing noise during information propagation. Extensive experiments on public datasets demonstrate that TDAR significantly outperforms state-of-the-art methods in attribute reconstruction and downstream tasks, offering a robust solution to the challenges posed by AMGs. The code is available at https://github.com/limengran98/TDAR.
AIJan 15, 2025Code
ANSR-DT: An Adaptive Neuro-Symbolic Learning and Reasoning Framework for Digital TwinsSafayat Bin Hakim, Muhammad Adil, Alvaro Velasquez et al.
In this paper, we propose an Adaptive Neuro-Symbolic Learning and Reasoning Framework for digital twin technology called ``ANSR-DT." Digital twins in industrial environments often struggle with interpretability, real-time adaptation, and human input integration. Our approach addresses these challenges by combining CNN-LSTM dynamic event detection with reinforcement learning and symbolic reasoning to enable adaptive intelligence with interpretable decision processes. This integration enhances environmental understanding while promoting continuous learning, leading to more effective real-time decision-making in human-machine collaborative applications. We evaluated ANSR-DT on synthetic industrial data, observing significant improvements over traditional approaches, with up to 99.5% accuracy for dynamic pattern recognition. The framework demonstrated superior adaptability with extended reinforcement learning training, improving explained variance from 0.447 to 0.547. Future work aims at scaling to larger datasets to test rule management beyond the current 14 rules. Our open-source implementation promotes reproducibility and establishes a foundation for future research in adaptive, interpretable digital twins for industrial applications.
CRSep 28, 2024
Decoding Android Malware with a Fraction of Features: An Attention-Enhanced MLP-SVM ApproachSafayat Bin Hakim, Muhammad Adil, Kamal Acharya et al.
The escalating sophistication of Android malware poses significant challenges to traditional detection methods, necessitating innovative approaches that can efficiently identify and classify threats with high precision. This paper introduces a novel framework that synergistically integrates an attention-enhanced Multi-Layer Perceptron (MLP) with a Support Vector Machine (SVM) to make Android malware detection and classification more effective. By carefully analyzing a mere 47 features out of over 9,760 available in the comprehensive CCCS-CIC-AndMal-2020 dataset, our MLP-SVM model achieves an impressive accuracy over 99% in identifying malicious applications. The MLP, enhanced with an attention mechanism, focuses on the most discriminative features and further reduces the 47 features to only 14 components using Linear Discriminant Analysis (LDA). Despite this significant reduction in dimensionality, the SVM component, equipped with an RBF kernel, excels in mapping these components to a high-dimensional space, facilitating precise classification of malware into their respective families. Rigorous evaluations, encompassing accuracy, precision, recall, and F1-score metrics, confirm the superiority of our approach compared to existing state-of-the-art techniques. The proposed framework not only significantly reduces the computational complexity by leveraging a compact feature set but also exhibits resilience against the evolving Android malware landscape.
AIMay 4
ANNEAL: Adapting LLM Agents via Governed Symbolic Patch LearningSafayat Bin Hakim, Keyan Guo, Wenkai Tan et al.
LLM-based agents can recover from individual execution errors, yet they repeatedly fail on the same fault when the underlying process knowledge--operator schemas, preconditions, and constraints--remains unrepaired. Existing self-evolving approaches address this gap by updating prompts, memory, or model weights, but none directly repair the symbolic structures that encode how tasks are executed, and few provide the governance guarantees required for safe deployment. We introduce ANNEAL, a neuro-symbolic agent that converts recurring failures into governed symbolic edits of a process knowledge graph without modifying foundation model weights. Its core mechanism, Failure-Driven Knowledge Acquisition (FDKA), localizes the responsible operator, synthesizes a typed patch through constrained LLM generation, and validates the proposal via multi-dimensional scoring, symbolic guardrails, and canary testing before commit. Every accepted edit carries full provenance and deterministic rollback capability. Across four domains and 27 multi-seed runs, ANNEAL is the only evaluated system that commits persistent structural repairs--strong baselines such as ReAct and Reflexion achieve high episodic recovery yet retain 72-100% holdout failure rates on recurring faults, whereas ANNEAL reduces these to 0% in the tested recurring-failure settings. Ablation confirms that removing FDKA eliminates all structural repairs and drops success rate by up to 26.7 percentage points. These results suggest that governed symbolic repair offers a complementary paradigm to weight-level and prompt-level adaptation for persistent fault elimination.
AIJan 6, 2024
A Survey on Verification and Validation, Testing and Evaluations of Neurosymbolic Artificial IntelligenceJustus Renkhoff, Ke Feng, Marc Meier-Doernberg et al.
Neurosymbolic artificial intelligence (AI) is an emerging branch of AI that combines the strengths of symbolic AI and sub-symbolic AI. A major drawback of sub-symbolic AI is that it acts as a "black box", meaning that predictions are difficult to explain, making the testing & evaluation (T&E) and validation & verification (V&V) processes of a system that uses sub-symbolic AI a challenge. Since neurosymbolic AI combines the advantages of both symbolic and sub-symbolic AI, this survey explores how neurosymbolic applications can ease the V&V process. This survey considers two taxonomies of neurosymbolic AI, evaluates them, and analyzes which algorithms are commonly used as the symbolic and sub-symbolic components in current applications. Additionally, an overview of current techniques for the T&E and V&V processes of these components is provided. Furthermore, it is investigated how the symbolic part is used for T&E and V&V purposes in current neurosymbolic applications. Our research shows that neurosymbolic AI as great potential to ease the T&E and V&V processes of sub-symbolic AI by leveraging the possibilities of symbolic AI. Additionally, the applicability of current T&E and V&V methods to neurosymbolic AI is assessed, and how different neurosymbolic architectures can impact these methods is explored. It is found that current T&E and V&V techniques are partly sufficient to test, evaluate, verify, or validate the symbolic and sub-symbolic part of neurosymbolic applications independently, while some of them use approaches where current T&E and V&V methods are not applicable by default, and adjustments or even new approaches are needed. Our research shows that there is great potential in using symbolic AI to test, evaluate, verify, or validate the predictions of a sub-symbolic model, making neurosymbolic AI an interesting research direction for safe, secure, and trustworthy AI.
LGJan 7, 2025
KGIF: Optimizing Relation-Aware Recommendations with Knowledge Graph Information FusionDong Hyun Jeon, Wenbo Sun, Houbing Herbert Song et al.
While deep-learning-enabled recommender systems demonstrate strong performance benchmarks, many struggle to adapt effectively in real-world environments due to limited use of user-item relationship data and insufficient transparency in recommendation generation. Traditional collaborative filtering approaches fail to integrate multifaceted item attributes, and although Factorization Machines account for item-specific details, they overlook broader relational patterns. Collaborative knowledge graph-based models have progressed by embedding user-item interactions with item-attribute relationships, offering a holistic perspective on interconnected entities. However, these models frequently aggregate attribute and interaction data in an implicit manner, leaving valuable relational nuances underutilized. This study introduces the Knowledge Graph Attention Network with Information Fusion (KGIF), a specialized framework designed to merge entity and relation embeddings explicitly through a tailored self-attention mechanism. The KGIF framework integrates reparameterization via dynamic projection vectors, enabling embeddings to adaptively represent intricate relationships within knowledge graphs. This explicit fusion enhances the interplay between user-item interactions and item-attribute relationships, providing a nuanced balance between user-centric and item-centric representations. An attentive propagation mechanism further optimizes knowledge graph embeddings, capturing multi-layered interaction patterns. The contributions of this work include an innovative method for explicit information fusion, improved robustness for sparse knowledge graphs, and the ability to generate explainable recommendations through interpretable path visualization.
LGMay 18, 2025
AFCL: Analytic Federated Continual Learning for Spatio-Temporal Invariance of Non-IID DataJianheng Tang, Huiping Zhuang, Jingyu He et al.
Federated Continual Learning (FCL) enables distributed clients to collaboratively train a global model from online task streams in dynamic real-world scenarios. However, existing FCL methods face challenges of both spatial data heterogeneity among distributed clients and temporal data heterogeneity across online tasks. Such data heterogeneity significantly degrades the model performance with severe spatial-temporal catastrophic forgetting of local and past knowledge. In this paper, we identify that the root cause of this issue lies in the inherent vulnerability and sensitivity of gradients to non-IID data. To fundamentally address this issue, we propose a gradient-free method, named Analytic Federated Continual Learning (AFCL), by deriving analytical (i.e., closed-form) solutions from frozen extracted features. In local training, our AFCL enables single-epoch learning with only a lightweight forward-propagation process for each client. In global aggregation, the server can recursively and efficiently update the global model with single-round aggregation. Theoretical analyses validate that our AFCL achieves spatio-temporal invariance of non-IID data. This ideal property implies that, regardless of how heterogeneous the data are distributed across local clients and online tasks, the aggregated model of our AFCL remains invariant and identical to that of centralized joint learning. Extensive experiments show the consistent superiority of our AFCL over state-of-the-art baselines across various benchmark datasets and settings.
LGMay 18, 2025
ACU: Analytic Continual Unlearning for Efficient and Exact Forgetting with Privacy PreservationJianheng Tang, Huiping Zhuang, Di Fang et al.
The development of artificial intelligence demands that models incrementally update knowledge by Continual Learning (CL) to adapt to open-world environments. To meet privacy and security requirements, Continual Unlearning (CU) emerges as an important problem, aiming to sequentially forget particular knowledge acquired during the CL phase. However, existing unlearning methods primarily focus on single-shot joint forgetting and face significant limitations when applied to CU. First, most existing methods require access to the retained dataset for re-training or fine-tuning, violating the inherent constraint in CL that historical data cannot be revisited. Second, these methods often suffer from a poor trade-off between system efficiency and model fidelity, making them vulnerable to being overwhelmed or degraded by adversaries through deliberately frequent requests. In this paper, we identify that the limitations of existing unlearning methods stem fundamentally from their reliance on gradient-based updates. To bridge the research gap at its root, we propose a novel gradient-free method for CU, named Analytic Continual Unlearning (ACU), for efficient and exact forgetting with historical data privacy preservation. In response to each unlearning request, our ACU recursively derives an analytical (i.e., closed-form) solution in an interpretable manner using the least squares method. Theoretical and experimental evaluations validate the superiority of our ACU on unlearning effectiveness, model fidelity, and system efficiency.
LGOct 21, 2024
TS-ACL: Closed-Form Solution for Time Series-oriented Continual LearningJiaxu Li, Kejia Fan, Songning Lai et al.
Time series classification underpins critical applications such as healthcare diagnostics and gesture-driven interactive systems in multimedia scenarios. However, time series class-incremental learning (TSCIL) faces two major challenges: catastrophic forgetting and intra-class variations. Catastrophic forgetting occurs because gradient-based parameter update strategies inevitably erase past knowledge. And unlike images, time series data exhibits subject-specific patterns, also known as intra-class variations, which refer to differences in patterns observed within the same class. While exemplar-based methods fail to cover diverse variation with limited samples, existing exemplar-free methods lack explicit mechanisms to handle intra-class variations. To address these two challenges, we propose TS-ACL, which leverages a gradient-free closed-form solution to avoid the catastrophic forgetting problem inherent in gradient-based optimization methods while simultaneously learning global distributions to resolve intra-class variations. Additionally, it provides privacy protection and efficiency. Extensive experiments on five benchmark datasets covering various sensor modalities and tasks demonstrate that TS-ACL achieves performance close to joint training on four datasets, outperforming existing methods and establishing a new state-of-the-art (SOTA) for TSCIL.
LGSep 29, 2025
Leveraging Vulnerabilities in Temporal Graph Neural Networks via Strategic High-Impact AssaultsDong Hyun Jeon, Lijing Zhu, Haifang Li et al.
Temporal Graph Neural Networks (TGNNs) have become indispensable for analyzing dynamic graphs in critical applications such as social networks, communication systems, and financial networks. However, the robustness of TGNNs against adversarial attacks, particularly sophisticated attacks that exploit the temporal dimension, remains a significant challenge. Existing attack methods for Spatio-Temporal Dynamic Graphs (STDGs) often rely on simplistic, easily detectable perturbations (e.g., random edge additions/deletions) and fail to strategically target the most influential nodes and edges for maximum impact. We introduce the High Impact Attack (HIA), a novel restricted black-box attack framework specifically designed to overcome these limitations and expose critical vulnerabilities in TGNNs. HIA leverages a data-driven surrogate model to identify structurally important nodes (central to network connectivity) and dynamically important nodes (critical for the graph's temporal evolution). It then employs a hybrid perturbation strategy, combining strategic edge injection (to create misleading connections) and targeted edge deletion (to disrupt essential pathways), maximizing TGNN performance degradation. Importantly, HIA minimizes the number of perturbations to enhance stealth, making it more challenging to detect. Comprehensive experiments on five real-world datasets and four representative TGNN architectures (TGN, JODIE, DySAT, and TGAT) demonstrate that HIA significantly reduces TGNN accuracy on the link prediction task, achieving up to a 35.55% decrease in Mean Reciprocal Rank (MRR) - a substantial improvement over state-of-the-art baselines. These results highlight fundamental vulnerabilities in current STDG models and underscore the urgent need for robust defenses that account for both structural and temporal dynamics.
CRSep 8, 2025
Neuro-Symbolic AI for Cybersecurity: State of the Art, Challenges, and OpportunitiesSafayat Bin Hakim, Muhammad Adil, Alvaro Velasquez et al.
Traditional Artificial Intelligence (AI) approaches in cybersecurity exhibit fundamental limitations: inadequate conceptual grounding leading to non-robustness against novel attacks; limited instructibility impeding analyst-guided adaptation; and misalignment with cybersecurity objectives. Neuro-Symbolic (NeSy) AI has emerged with the potential to revolutionize cybersecurity AI. However, there is no systematic understanding of this emerging approach. These hybrid systems address critical cybersecurity challenges by combining neural pattern recognition with symbolic reasoning, enabling enhanced threat understanding while introducing concerning autonomous offensive capabilities that reshape threat landscapes. In this survey, we systematically characterize this field by analyzing 127 publications spanning 2019-July 2025. We introduce a Grounding-Instructibility-Alignment (G-I-A) framework to evaluate these systems, focusing on both cyber defense and cyber offense across network security, malware analysis, and cyber operations. Our analysis shows advantages of multi-agent NeSy architectures and identifies critical implementation challenges including standardization gaps, computational complexity, and human-AI collaboration requirements that constrain deployment. We show that causal reasoning integration is the most transformative advancement, enabling proactive defense beyond correlation-based approaches. Our findings highlight dual-use implications where autonomous systems demonstrate substantial capabilities in zero-day exploitation while achieving significant cost reductions, altering threat dynamics. We provide insights and future research directions, emphasizing the urgent need for community-driven standardization frameworks and responsible development practices that ensure advancement serves defensive cybersecurity objectives while maintaining societal alignment.
LGAug 14, 2025
APFL: Analytic Personalized Federated Learning via Dual-Stream Least SquaresKejia Fan, Jianheng Tang, Zhirui Yang et al.
Personalized Federated Learning (PFL) has presented a significant challenge to deliver personalized models to individual clients through collaborative training. Existing PFL methods are often vulnerable to non-IID data, which severely hinders collective generalization and then compromises the subsequent personalization efforts. In this paper, to address this non-IID issue in PFL, we propose an Analytic Personalized Federated Learning (APFL) approach via dual-stream least squares. In our APFL, we use a foundation model as a frozen backbone for feature extraction. Subsequent to the feature extractor, we develop dual-stream analytic models to achieve both collective generalization and individual personalization. Specifically, our APFL incorporates a shared primary stream for global generalization across all clients, and a dedicated refinement stream for local personalization of each individual client. The analytical solutions of our APFL enable its ideal property of heterogeneity invariance, theoretically meaning that each personalized model remains identical regardless of how heterogeneous the data are distributed across all other clients. Empirical results across various datasets also validate the superiority of our APFL over state-of-the-art baselines, with advantages of at least 1.10%-15.45% in accuracy.
LGAug 6, 2025
FedHiP: Heterogeneity-Invariant Personalized Federated Learning Through Closed-Form SolutionsJianheng Tang, Zhirui Yang, Jingchao Wang et al.
Lately, Personalized Federated Learning (PFL) has emerged as a prevalent paradigm to deliver personalized models by collaboratively training while simultaneously adapting to each client's local applications. Existing PFL methods typically face a significant challenge due to the ubiquitous data heterogeneity (i.e., non-IID data) across clients, which severely hinders convergence and degrades performance. We identify that the root issue lies in the long-standing reliance on gradient-based updates, which are inherently sensitive to non-IID data. To fundamentally address this issue and bridge the research gap, in this paper, we propose a Heterogeneity-invariant Personalized Federated learning scheme, named FedHiP, through analytical (i.e., closed-form) solutions to avoid gradient-based updates. Specifically, we exploit the trend of self-supervised pre-training, leveraging a foundation model as a frozen backbone for gradient-free feature extraction. Following the feature extractor, we further develop an analytic classifier for gradient-free training. To support both collective generalization and individual personalization, our FedHiP scheme incorporates three phases: analytic local training, analytic global aggregation, and analytic local personalization. The closed-form solutions of our FedHiP scheme enable its ideal property of heterogeneity invariance, meaning that each personalized model remains identical regardless of how non-IID the data are distributed across all other clients. Extensive experiments on benchmark datasets validate the superiority of our FedHiP scheme, outperforming the state-of-the-art baselines by at least 5.79%-20.97% in accuracy.
AIJun 15, 2025
SymRAG: Efficient Neuro-Symbolic Retrieval Through Adaptive Query RoutingSafayat Bin Hakim, Muhammad Adil, Alvaro Velasquez et al.
Current Retrieval-Augmented Generation systems use uniform processing, causing inefficiency as simple queries consume resources similar to complex multi-hop tasks. We present SymRAG, a framework that introduces adaptive query routing via real-time complexity and load assessment to select symbolic, neural, or hybrid pathways. SymRAG's neuro-symbolic approach adjusts computational pathways based on both query characteristics and system load, enabling efficient resource allocation across diverse query types. By combining linguistic and structural query properties with system load metrics, SymRAG allocates resources proportional to reasoning requirements. Evaluated on 2,000 queries across HotpotQA (multi-hop reasoning) and DROP (discrete reasoning) using Llama-3.2-3B and Mistral-7B models, SymRAG achieves competitive accuracy (97.6--100.0% exact match) with efficient resource utilization (3.6--6.2% CPU utilization, 0.985--3.165s processing). Disabling adaptive routing increases processing time by 169--1151%, showing its significance for complex models. These results suggest adaptive computation strategies are more sustainable and scalable for hybrid AI systems that use dynamic routing and neuro-symbolic frameworks.