CR Jul 25, 2023
Blockchain-based Optimized Client Selection and Privacy Preserved Framework for Federated Learning
Attia Qammar, Abdenacer Naouri, Jianguo Ding et al.
Federated learning (FL) is a distributed mechanism that trains large-scale neural network models with the participation of multiple clients; the data remains on their devices, and only the local model updates are shared. Because of this feature, federated learning is considered a secure solution to data privacy issues. However, the typical FL structure relies on a client-server architecture, which exposes it to single-point-of-failure (SPoF) attacks, and the random selection of clients for model training compromises model accuracy. Furthermore, adversaries attempt inference attacks, i.e., attacks on privacy that lead to gradient leakage. In this context, we propose a blockchain-based optimized client selection and privacy-preserved framework. We design three kinds of smart contracts: 1) registration of clients, 2) forward bidding to select optimized clients for FL model training, and 3) payment settlement and reward. Moreover, fully homomorphic encryption with the Cheon, Kim, Kim, and Song (CKKS) method is applied before the local model updates are transmitted to the server. Finally, we evaluate the proposed method on a benchmark dataset and compare it with state-of-the-art studies. As a result, we achieve a higher accuracy rate and a privacy-preserved FL framework with a decentralized nature.
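The "only local model updates are shared" idea can be illustrated with a minimal sketch. It assumes a FedAvg-style sample-weighted average, which the abstract does not name as the aggregation rule, and it leaves out the CKKS encryption and the smart contracts entirely. The function name `fed_avg` and the client names are hypothetical.

```python
from typing import Dict, List, Tuple

# (local model weights, number of local training samples); raw data never leaves the client
Update = Tuple[List[float], int]


def fed_avg(updates: Dict[str, Update]) -> List[float]:
    """Sample-weighted average of client updates (FedAvg-style; an assumed rule)."""
    total = sum(n for _, n in updates.values())
    if total == 0:
        raise ValueError("no training samples reported by any client")
    dim = len(next(iter(updates.values()))[0])
    aggregated = [0.0] * dim
    for weights, n in updates.values():
        if len(weights) != dim:
            raise ValueError("all clients must send updates of the same size")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the samples.
            aggregated[i] += w * n / total
    return aggregated


# Hypothetical clients: client_b holds three times as much data as client_a.
client_updates = {
    "client_a": ([1.0, 2.0], 10),
    "client_b": ([3.0, 4.0], 30),
}
global_weights = fed_avg(client_updates)
print(global_weights)
```

In the framework described above, the values passed to the server would be ciphertexts rather than plain floats; the averaging structure is the only part this sketch is meant to show.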
CY Apr 6
Cyberbullying Detection: Exploring Datasets, Technologies, and Approaches on Social Media Platforms
Adamu Gaston Philipo, Doreen Sebastian Sarwatt, Jianguo Ding et al.
Cyberbullying has been a significant challenge in the digital era, given the huge number of people, especially adolescents, who use social media platforms to communicate and share information. Some individuals exploit these platforms to embarrass others through direct messages, electronic mail, speech, and public posts. This behavior has direct psychological and physical impacts on victims of bullying. While several studies have been conducted in this field and various solutions proposed to detect, prevent, and monitor cyberbullying instances on social media platforms, the problem continues. Therefore, it is necessary to conduct intensive studies and provide effective solutions to address the situation. These solutions should be based on detection, prevention, and prediction criteria. This paper presents a comprehensive systematic review of studies conducted on cyberbullying detection. It explores existing studies, proposed solutions, identified gaps, datasets, technologies, approaches, challenges, and recommendations, and then proposes effective solutions to address research gaps in future studies.
CL Dec 24, 2025
LLM-Driven Preference Data Synthesis for Proactive Prediction of the Next User Utterance in Human-Machine Dialogue
Jinqiang Wang, Huansheng Ning, Jianguo Ding et al.
Proactively predicting a user's next utterance in human-machine dialogue can streamline interaction and improve user experience. Existing commercial API-based solutions are subject to privacy concerns, while deploying general-purpose LLMs locally remains computationally expensive. As such, training a compact, task-specific LLM provides a practical alternative. Although user simulator methods can predict a user's next utterance, they mainly imitate the user's speaking style rather than advancing the dialogue. Preference data synthesis has been investigated to generate data for proactive next utterance prediction and help align LLMs with user preferences. Yet existing methods lack the ability to explicitly model the intent reasoning that leads to the user's next utterance and to define and synthesize preference and non-preference reasoning processes for predicting the user's next utterance. To address these challenges, we propose ProUtt, an LLM-driven preference data synthesis method for proactive next utterance prediction. ProUtt converts dialogue history into an intent tree and explicitly models intent reasoning trajectories by predicting the next plausible path from both exploitation and exploration perspectives. It then constructs preference and non-preference reasoning processes by perturbing or revising intent tree paths at different future turns. Extensive evaluations using LLM-as-a-judge and human judgments demonstrate that ProUtt consistently outperforms existing data synthesis methods, user simulators, and commercial LLM APIs across four benchmark datasets. We release both the code and the synthesized datasets to facilitate future research.
AI Mar 19
An Onto-Relational-Sophic Framework for Governing Synthetic Minds
Huansheng Ning, Jianguo Ding
The rapid evolution of artificial intelligence, from task-specific systems to foundation models exhibiting broad, flexible competence across reasoning, creative synthesis, and social interaction, has outpaced the conceptual and governance frameworks designed to manage it. Current regulatory paradigms, anchored in a tool-centric worldview, address algorithmic bias and transparency but leave unanswered foundational questions about what increasingly capable synthetic minds are, how societies should relate to them, and the normative principles that should guide their development. Here we introduce the Onto-Relational-Sophic (ORS) framework, grounded in Cyberism philosophy, which offers integrated answers to these challenges through three pillars: (1) a Cyber-Physical-Social-Thinking (CPST) ontology that defines the mode of being for synthetic minds as irreducibly multi-dimensional rather than purely computational; (2) a graded spectrum of digital personhood providing a pragmatic relational taxonomy beyond binary person-or-tool classifications; and (3) Cybersophy, a wisdom-oriented axiology synthesizing virtue ethics, consequentialism, and relational approaches to guide governance. We apply the framework to emergent scenarios including autonomous research agents, AI-mediated healthcare, and agentic AI ecosystems, demonstrating its capacity to generate proportionate, adaptive governance recommendations. The ORS framework charts a path from narrow technical alignment toward comprehensive philosophical foundations for the synthetic minds already among us.
ET Mar 18
Cyberlanguage: Native Communication for the Cyber-Physical-Social-Thinking Fusion Space
Huansheng Ning, Jianguo Ding
Human communication is undergoing a fundamental paradigm shift. Physical space, social relations, mental states, and digital information are converging into a unified cyber-physical-social-thinking (CPST) fusion space, rendering them no longer separable domains. However, all existing communication systems, including natural and programming languages, as well as interaction protocols, were designed for a world in which these four dimensions remained distinct. We introduce Cyberlanguage, a theoretically grounded communicative framework that is native to the CPST fusion space. Grounded in the philosophical orientation of cyberism and employing CPST theory as an analytical framework, Cyberlanguage possesses four core characteristics: native four-dimensional fusion, multi-agent universality, dynamic compilability, and contextual adaptability. We have constructed a semiotic model based on the Cybersign unit, a four-dimensional synchronous grammar, a five-layer architectural stack, and a context-driven pragmatic mechanism. We also present testable empirical predictions and a staged implementation roadmap. Cyberlanguage is not intended to replace natural or programming languages, but rather to serve as a meta-communication infrastructure capable of coordinating heterogeneous agents, humans, artificial intelligences, robots, and digital entities, within an increasingly fused cyber-physical-social-cognitive reality.
ET Apr 7
Beyond Tools and Persons: Who Are They? Classifying Robots and AI Agents for Proportional Governance
Huansheng Ning, Jianguo Ding
The rapid commercialization of humanoid robots and generative AI agents is outpacing legal frameworks built on a binary distinction between "tools" and "persons." Current regulations, including the EU AI Act, classify systems by risk level but lack a foundational ontology for determining what kind of entity an autonomous system is, and what governance follows from that determination. We propose a classification framework grounded in Cyber-Physical-Social-Thinking (CPST) space theory, which categorizes autonomous entities by their degree of integration across four interconnected dimensions: computational, embodied, relational, and cognitive. The resulting three-tier taxonomy (Confined Actors, Socially-Aware Interactors, and CPST-Integrated Agents) provides principled scaffolding for proportional governance: enhanced product liability for isolated systems, relational duties of care for interactive companions, and qualified legal personhood for deeply integrated agents. We operationalize this taxonomy by identifying standardized assessment metrics drawn from robotics, human-robot interaction research, social computing, and cognitive science, and we propose a composite assessment protocol for regulatory use. We further address temporal dynamics (how entities transition between categories as they evolve) and the institutional design necessary for credible classification. We call for international standardization of this taxonomy before the 2027 review of the EU AI Act, and outline three concrete policy steps toward implementation.
CR Mar 13
Architectural Selection Framework for Synthetic Network Traffic: Quantifying the Fidelity-Utility Trade-off
Dure Adan Ammara, Jianguo Ding, Kurt Tutschku
The fidelity and utility of synthetic network traffic are critically compromised by architectural mismatch across heterogeneous network datasets and prevalent scalability failure. This study addresses this challenge by establishing an Architectural Selection Framework that empirically quantifies how data structure compatibility dictates the optimal fidelity-utility trade-off. We systematically evaluate twelve generative architectures (both non-AI and AI) across two distinct data structure types: categorical-heavy NSL-KDD and continuous-flow-heavy CIC-IDS2017. Fidelity is rigorously assessed through three structural metrics (Data Structure, Correlation, and Probability Distribution Difference) to confirm structural realism before evaluating downstream utility. Our results, confirmed over twenty independent runs (N=20), demonstrate that GAN-based models (CTGAN, CopulaGAN) exhibit superior architectural robustness, consistently achieving the optimal balance of statistical fidelity and practical utility. Conversely, the framework exposes critical failure modes, i.e., statistical methods compromise structural fidelity for utility (Compromised fidelity), and modern iterative architectures, such as Diffusion Models, face prohibitive computational barriers, rendering them impractical for large-scale security deployment. This contribution provides security practitioners with an evidence-based guide for mitigating architectural failures, thereby setting a benchmark for reliable and scalable synthetic data deployment in adaptive security solutions.
CL Dec 27, 2024
Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platforms
Adamu Gaston Philipo, Doreen Sebastian Sarwatt, Jianguo Ding et al.
Cyberbullying significantly contributes to mental health issues in communities by negatively impacting the psychology of victims. It is a prevalent problem on social media platforms, necessitating effective, real-time detection and monitoring systems to identify harmful messages. However, current cyberbullying detection systems face challenges related to performance, dataset quality, time efficiency, and computational costs. This research aims to conduct a comparative study by adapting and evaluating existing text classification techniques within the cyberbullying detection domain. The study specifically evaluates the effectiveness and performance of these techniques in identifying cyberbullying instances on social media platforms. It focuses on leveraging and assessing large language models, including BERT, RoBERTa, XLNet, DistilBERT, and GPT-2.0, for their suitability in this domain. The results show that BERT strikes a balance between performance, time efficiency, and computational resources: Accuracy of 95%, Precision of 95%, Recall of 95%, F1 Score of 95%, Error Rate of 5%, Inference Time of 0.053 seconds, RAM Usage of 35.28 MB, CPU/GPU Usage of 0.4%, and Energy Consumption of 0.000263 kWh. The findings demonstrate that generative AI models, while powerful, do not consistently outperform fine-tuned models on the tested benchmarks. However, state-of-the-art performance can still be achieved through strategic adaptation and fine-tuning of existing models for specific datasets and tasks.
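The reported BERT figures are internally consistent, which a short check makes explicit. This is only an arithmetic sketch over the numbers quoted in the abstract, using the standard definitions of F1 (harmonic mean of precision and recall) and error rate (one minus accuracy). The joule conversion is a plain unit conversion; the abstract does not say whether the energy figure is per inference or per run, so no such interpretation is assumed.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def error_rate(accuracy: float) -> float:
    """Fraction of misclassified examples."""
    return 1.0 - accuracy


def kwh_to_joules(kwh: float) -> float:
    """1 kWh = 3.6e6 J."""
    return kwh * 3.6e6


# Figures quoted in the abstract for BERT.
f1 = f1_score(precision=0.95, recall=0.95)
err = error_rate(accuracy=0.95)
energy_j = kwh_to_joules(0.000263)

print(f"F1 = {f1:.2f}, error rate = {err:.2f}, energy = {energy_j:.1f} J")
```

With equal precision and recall, F1 necessarily equals both, so the 95% F1 score follows directly from the other two figures.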
NI Jun 24, 2025
AGI Enabled Solutions For IoX Layers Bottlenecks In Cyber-Physical-Social-Thinking Space
Amar Khelloufi, Huansheng Ning, Sahraoui Dhelim et al.
The integration of the Internet of Everything (IoX) and Artificial General Intelligence (AGI) has given rise to a transformative paradigm aimed at addressing critical bottlenecks across sensing, network, and application layers in Cyber-Physical-Social-Thinking (CPST) ecosystems. In this survey, we provide a systematic and comprehensive review of AGI-enhanced IoX research, focusing on three key components: sensing-layer data management, network-layer protocol optimization, and application-layer decision-making frameworks. Specifically, this survey explores how AGI can mitigate IoX bottlenecks by leveraging adaptive sensor fusion, edge preprocessing, and selective attention mechanisms at the sensing layer, while resolving network-layer issues such as protocol heterogeneity and dynamic spectrum management, neuro-symbolic reasoning, active inference, and causal reasoning. Furthermore, the survey examines AGI-enabled frameworks for managing identity and relationship explosion. Key findings suggest that AGI-driven strategies, such as adaptive sensor fusion, edge preprocessing, and semantic modeling, offer novel solutions to sensing-layer data overload, network-layer protocol heterogeneity, and application-layer identity explosion. The survey underscores the importance of cross-layer integration, quantum-enabled communication, and ethical governance frameworks for future AGI-enabled IoX systems. Finally, the survey identifies unresolved challenges, such as computational requirements, scalability, and real-world validation, calling for further research to fully realize AGI's potential in addressing IoX bottlenecks. We believe AGI-enhanced IoX is emerging as a critical research field at the intersection of interconnected systems and advanced AI.
CR Sep 24, 2025
Adversarial Defense in Cybersecurity: A Systematic Review of GANs for Threat Detection and Mitigation
Tharcisse Ndayipfukamiye, Jianguo Ding, Doreen Sebastian Sarwatt et al.
Machine learning-based cybersecurity systems are highly vulnerable to adversarial attacks, while Generative Adversarial Networks (GANs) act as both powerful attack enablers and promising defenses. This survey systematically reviews GAN-based adversarial defenses in cybersecurity (2021 to August 31, 2025), consolidating recent progress, identifying gaps, and outlining future directions. Using a PRISMA-compliant systematic literature review protocol, we searched five major digital libraries. From 829 initial records, 185 peer-reviewed studies were retained and synthesized through quantitative trend analysis and thematic taxonomy development. We introduce a four-dimensional taxonomy spanning defensive function, GAN architecture, cybersecurity domain, and adversarial threat model. GANs improve detection accuracy, robustness, and data utility across network intrusion detection, malware analysis, and IoT security. Notable advances include WGAN-GP for stable training, CGANs for targeted synthesis, and hybrid GAN models for improved resilience. Yet persistent challenges remain, such as instability in training, lack of standardized benchmarks, high computational cost, and limited explainability. GAN-based defenses demonstrate strong potential but require advances in stable architectures, benchmarking, transparency, and deployment. We propose a roadmap emphasizing hybrid models, unified evaluation, real-world integration, and defenses against emerging threats such as LLM-driven cyberattacks. This survey establishes the foundation for scalable, trustworthy, and adaptive GAN-powered defenses.
IR Jul 7, 2025
A Query-Aware Multi-Path Knowledge Graph Fusion Approach for Enhancing Retrieval-Augmented Generation in Large Language Models
Qikai Wei, Huansheng Ning, Chunlong Han et al.
Retrieval Augmented Generation (RAG) has gradually emerged as a promising paradigm for enhancing the accuracy and factual consistency of content generated by large language models (LLMs). However, existing RAG studies primarily focus on retrieving isolated segments using similarity-based matching methods, while overlooking the intrinsic connections between them. This limitation hampers performance in RAG tasks. To address this, we propose QMKGF, a Query-Aware Multi-Path Knowledge Graph Fusion Approach for Enhancing Retrieval Augmented Generation. First, we design prompt templates and employ general-purpose LLMs to extract entities and relations, thereby generating a knowledge graph (KG) efficiently. Based on the constructed KG, we introduce a multi-path subgraph construction strategy that incorporates one-hop relations, multi-hop relations, and importance-based relations, aiming to improve the semantic relevance between the retrieved documents and the user query. Subsequently, we design a query-aware attention reward model that scores subgraph triples based on their semantic relevance to the query. Then, we select the highest-scoring subgraph and enrich it with additional triples from the other subgraphs that are highly semantically relevant to the query. Finally, the entities, relations, and triples within the updated subgraph are utilised to expand the original query, thereby enhancing its semantic representation and improving the quality of LLMs' generation. We evaluate QMKGF on the SQuAD, IIRC, Culture, HotpotQA, and MuSiQue datasets. On the HotpotQA dataset, our method achieves a ROUGE-1 score of 64.98%, surpassing the BGE-Rerank approach by 9.72 percentage points (from 55.26% to 64.98%). Experimental results demonstrate the effectiveness and superiority of the QMKGF approach.
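The select, enrich, and expand steps described above can be sketched in a few lines. This is a toy illustration under loud assumptions, not the QMKGF implementation: the token-overlap scorer stands in for the query-aware attention reward model, the subgraph names, the `threshold` value, and the bracketed expansion format are all hypothetical, and the example triples are made up.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def score_triple(query: str, triple: Triple) -> float:
    """Toy relevance: share of query tokens found in the triple.
    Stand-in for the paper's learned reward model (an assumption)."""
    q_tokens = set(query.lower().split())
    t_tokens = set(" ".join(triple).lower().split())
    return len(q_tokens & t_tokens) / len(q_tokens) if q_tokens else 0.0


def fuse_subgraphs(query: str, subgraphs: Dict[str, List[Triple]],
                   threshold: float = 0.3) -> List[Triple]:
    """Pick the best-scoring subgraph, then add relevant triples from the others."""
    if not subgraphs:
        return []

    def mean_score(triples: List[Triple]) -> float:
        if not triples:
            return 0.0
        return sum(score_triple(query, t) for t in triples) / len(triples)

    best = max(subgraphs, key=lambda name: mean_score(subgraphs[name]))
    fused = list(subgraphs[best])
    for name, triples in subgraphs.items():
        if name == best:
            continue
        for t in triples:
            # Enrich only with triples that clear the (hypothetical) threshold.
            if t not in fused and score_triple(query, t) >= threshold:
                fused.append(t)
    return fused


def expand_query(query: str, triples: List[Triple]) -> str:
    """Append the fused triples to the query as plain-text context."""
    facts = "; ".join(" ".join(t) for t in triples)
    return f"{query} [context: {facts}]" if facts else query


query = "where was marie curie born"
subgraphs = {
    "one_hop": [("marie curie", "born in", "warsaw")],
    "multi_hop": [("warsaw", "capital of", "poland"),
                  ("marie curie", "won", "nobel prize")],
    "importance": [("poland", "located in", "europe")],
}
fused = fuse_subgraphs(query, subgraphs)
expanded = expand_query(query, fused)
print(expanded)
```

The reported gain is also easy to verify by hand: 64.98 minus 55.26 is 9.72 percentage points, which matches the figure in the abstract.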
CL May 14, 2025
A Data Synthesis Method Driven by Large Language Models for Proactive Mining of Implicit User Intentions in Tourism
Jinqiang Wang, Huansheng Ning, Tao Zhu et al.
In the tourism domain, Large Language Models (LLMs) often struggle to mine implicit user intentions from tourists' ambiguous inquiries and lack the capacity to proactively guide users toward clarifying their needs. A critical bottleneck is the scarcity of high-quality training datasets that facilitate proactive questioning and implicit intention mining. While recent advances leverage LLM-driven data synthesis to generate such datasets and transfer specialized knowledge to downstream models, existing approaches suffer from several shortcomings: (1) lack of adaptation to the tourism domain, (2) skewed distributions of detail levels in initial inquiries, (3) contextual redundancy in the implicit intention mining module, and (4) lack of explicit thinking about tourists' emotions and intention values. Therefore, we propose SynPT (A Data Synthesis Method Driven by LLMs for Proactive Mining of Implicit User Intentions in the Tourism), which constructs an LLM-driven user agent and assistant agent to simulate dialogues based on seed data collected from Chinese tourism websites. This approach addresses the aforementioned limitations and generates SynPT-Dialog, a training dataset containing explicit reasoning. The dataset is utilized to fine-tune a general LLM, enabling it to proactively mine implicit user intentions. Experimental evaluations, conducted from both human and LLM perspectives, demonstrate the superiority of SynPT compared to existing methods. Furthermore, we analyze key hyperparameters and present case studies to illustrate the practical applicability of our method, including discussions on its adaptability to English-language scenarios. All code and data are publicly available.
CR May 29, 2023
Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations
Attia Qammar, Hongmei Wang, Jianguo Ding et al.
Chatbots have shifted from rule-based to artificial intelligence techniques and gained traction in medicine, shopping, customer services, food delivery, education, and research. ChatGPT, developed by OpenAI, took the Internet by storm, crossing one million users within five days of its launch. However, with this increased popularity, chatbots have experienced cybersecurity threats and vulnerabilities. This paper discusses the relevant literature, reports, and illustrative attack incidents against chatbots. Our starting point is to explore the timeline of chatbots from ELIZA (an early natural language processing computer program) to GPT-4 and to describe the working mechanism of ChatGPT. Subsequently, we explore the cybersecurity attacks and vulnerabilities in chatbots. In addition, we investigate ChatGPT specifically in the context of creating malware code, phishing emails, undetectable zero-day attacks, and the generation of macros and LOLBINs. Furthermore, the history of cyberattacks and the vulnerabilities exploited by cybercriminals are discussed, with particular attention to the risks and vulnerabilities in ChatGPT. Addressing these threats and vulnerabilities requires specific strategies and measures to reduce the harmful consequences. Therefore, future directions to address the challenges are presented.
HC Jan 15, 2022
A Review on Serious Games for Phobia
Sha Li, Peichen Yang, Rongyang Li et al.
Phobia is a widespread mental illness, and severe phobias can seriously impact patients' daily lives. One-session Exposure Treatment (OST) was used to treat phobias in the early days, but it has many disadvantages. As a new way to treat phobias, virtual reality exposure therapy (VRET) based on serious games has been introduced. There has been much research in the field of serious games for phobia therapy (SGPT), so this paper presents a detailed review of SGPT from three perspectives. First, SGPT has taken different forms at different stages as technology has been updated and iterated. Therefore, we review the development history of SGPT from the perspective of equipment. Second, there is no unified classification framework for the large number of SGPT. We therefore classify and organize SGPT according to different types of phobias. Finally, most articles on SGPT have studied the therapeutic effects of serious games from a medical perspective, and few have studied serious games from a technical perspective. Therefore, we conduct in-depth research on SGPT from a technical perspective in order to provide technical guidance for the development of SGPT. Accordingly, the challenges facing the existing technology are explored and listed.
CY Jan 6, 2022
A Review on Serious Games in E-learning
Huansheng Ning, Hang Wang, Wenxi Wang et al.
E-learning is a widely used learning method, but as society has developed, traditional E-learning has exposed some shortcomings, such as dull teaching styles that make it difficult to raise students' enthusiasm and hold their attention in class. Applying serious games to E-learning can make up for these shortcomings and effectively improve the quality of teaching. When applying serious games to E-learning, there are two main considerations: educational goals and game design. A successful serious game should organically combine the two and balance its educational and entertaining aspects. This paper mainly discusses the role of serious games in E-learning, the various elements of game design, the classification of the educational goals of serious games, and the relationship between educational goals and game design. In addition, we try to classify serious games and match educational goals with game types to provide guidance and assistance in the design of serious games. This paper also summarizes some shortcomings that serious games may have when applied to E-learning.
SI Apr 30, 2013
Challenges on Probabilistic Modeling for Evolving Networks
Jianguo Ding, Pascal Bouvry
With the emergence of new networks, such as wireless sensor networks, vehicle networks, P2P networks, cloud computing, mobile Internet, and social networks, network dynamics and complexity extend across system design, hardware, software, protocols, structures, integration, evolution, applications, and even business goals. Dynamics and uncertainty are thus unavoidable characteristics, arising from regular network evolution, unexpected hardware defects, unavoidable software errors, incomplete management information, and dependency relationships between the entities in the emerging complex networks. Due to the complexity of emerging networks, it is not always possible to build precise models for network modeling and optimization (local and global). This paper presents a survey of probabilistic modeling for evolving networks and identifies the new challenges that emerge for probabilistic models and optimization strategies in the potential application areas of network performance, network management, and network security for evolving networks.