Christophe Cruz

AI
h-index: 26
16 papers
145 citations
Novelty: 25%
AI Score: 23

16 Papers

IR · Jun 12, 2023
Imbalanced Multi-label Classification for Business-related Text with Moderately Large Label Spaces

Muhammad Arslan, Christophe Cruz

In this study, we compare four methods for multi-label text classification on an imbalanced business dataset: fine-tuned BERT, Binary Relevance, Classifier Chains, and Label Powerset. Fine-tuned BERT outperforms the other three methods by a significant margin, achieving high accuracy, F1 score, precision, and recall. Binary Relevance also performs well on this dataset, while Classifier Chains and Label Powerset perform relatively poorly. These findings highlight the effectiveness of fine-tuned BERT for multi-label text classification and suggest that it may be a useful tool for businesses seeking to analyze complex and multifaceted texts.
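The three baselines differ mainly in how they turn label sets into training targets. The following is a minimal sketch of the Binary Relevance and Label Powerset transformations, not the authors' implementation; the function names, label names, and toy label sets are hypothetical, and no classifier is trained.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def binary_relevance_targets(
    label_sets: List[Set[str]], labels: List[str]
) -> Dict[str, List[int]]:
    """Binary Relevance: one independent 0/1 target vector per label."""
    return {label: [int(label in s) for s in label_sets] for label in labels}


def label_powerset_targets(
    label_sets: List[Set[str]],
) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Label Powerset: each distinct label combination becomes one class id."""
    class_ids: Dict[FrozenSet[str], int] = {}
    targets: List[int] = []
    for s in label_sets:
        key = frozenset(s)
        if key not in class_ids:
            class_ids[key] = len(class_ids)  # ids assigned in order of first appearance
        targets.append(class_ids[key])
    return targets, class_ids


# Hypothetical toy data: label sets for four business documents.
doc_labels = [{"finance", "merger"}, {"finance"}, {"merger", "finance"}, {"hiring"}]
all_labels = ["finance", "hiring", "merger"]

br_targets = binary_relevance_targets(doc_labels, all_labels)
lp_targets, lp_classes = label_powerset_targets(doc_labels)

print(br_targets["merger"])  # [1, 0, 1, 0]
print(lp_targets)            # [0, 1, 0, 2]
```

Classifier Chains extends Binary Relevance by feeding the predictions for earlier labels as extra features to later classifiers. Under Label Powerset, every rare label combination is its own class, which may be one factor in its weaker results on imbalanced data; the abstract itself does not state a cause.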

AI · Jul 4, 2023
Knowledge Graph for NLG in the context of conversational agents

Hussam Ghanem, Massinissa Atmani, Christophe Cruz

The use of knowledge graphs (KGs) enhances the accuracy and comprehensiveness of the responses provided by a conversational agent. Generating answers during a conversation amounts to generating text from these KGs, which is still regarded as a challenging task and has gained significant attention in recent years. In this document, we review different architectures used for knowledge graph-to-text generation, including Graph Neural Networks, the Graph Transformer, and linearization with seq2seq models. We discuss the advantages and limitations of each architecture and conclude that the choice of architecture depends on the specific requirements of the task at hand. We also highlight the importance of considering constraints such as execution time and model validity, particularly in the context of conversational agents. Based on these constraints and the availability of labeled data for the domains of DAVI, we choose seq2seq Transformer-based models (PLMs) for the knowledge graph-to-text generation task. In future work, we aim to refine benchmark datasets for KG-to-text generation with PLMs and to explore the emotional and multilingual dimensions. Overall, this review provides insights into the different approaches to knowledge graph-to-text generation and outlines future directions for research in this area.
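Linearization turns a graph into a flat token sequence that a seq2seq model can read. The sketch below is illustrative only: the `<H>`, `<R>`, `<T>` markers are one common convention, assumed here rather than taken from the paper, and the function name and toy triples are hypothetical.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def linearize(triples: Iterable[Triple]) -> str:
    """Flatten (head, relation, tail) triples into one marker-delimited string."""
    return " ".join(f"<H> {h} <R> {r} <T> {t}" for h, r, t in triples)


# Hypothetical toy graph.
graph = [("Dijon", "located_in", "France"), ("Dijon", "known_for", "mustard")]
source_text = linearize(graph)
print(source_text)
# <H> Dijon <R> located_in <T> France <H> Dijon <R> known_for <T> mustard
```

The resulting string would be used as the encoder input of a pretrained seq2seq model. Graph structure is preserved only implicitly through repeated entity names, which is the usual trade-off of linearization against graph-aware encoders.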

AI · Feb 16, 2025
Unlocking the Potential of Generative AI through Neuro-Symbolic Architectures: Benefits and Limitations

Oualid Bougzime, Samir Jabbar, Christophe Cruz et al.

Neuro-symbolic artificial intelligence (NSAI) represents a transformative approach in artificial intelligence (AI) by combining deep learning's ability to handle large-scale and unstructured data with the structured reasoning of symbolic methods. By leveraging their complementary strengths, NSAI enhances generalization, reasoning, and scalability while addressing key challenges such as transparency and data efficiency. This paper systematically studies diverse NSAI architectures, highlighting their unique approaches to integrating neural and symbolic components. It examines the alignment of contemporary AI techniques such as retrieval-augmented generation, graph neural networks, reinforcement learning, and multi-agent systems with NSAI paradigms. This study then evaluates these architectures against a comprehensive set of criteria, including generalization, reasoning capabilities, transferability, and interpretability, thereby providing a comparative analysis of their respective strengths and limitations. Notably, the Neuro > Symbolic < Neuro model consistently outperforms its counterparts across all evaluation metrics. This result aligns with state-of-the-art research that highlights the efficacy of such architectures in harnessing advanced technologies like multi-agent systems.

IR · Jan 6, 2025
Political Events using RAG with LLMs

Muhammad Arslan, Saba Munawar, Christophe Cruz

In the contemporary digital landscape, media content stands as the foundation for political news analysis, offering invaluable insights sourced from various channels like news articles, social media updates, speeches, and reports. Natural Language Processing (NLP) has revolutionized Political Information Extraction (IE), automating tasks such as Event Extraction (EE) from these diverse media outlets. While traditional NLP methods often necessitate specialized expertise to build rule-based systems or train machine learning models with domain-specific datasets, the emergence of Large Language Models (LLMs) driven by Generative Artificial Intelligence (GenAI) presents a promising alternative. These models offer accessibility, alleviating challenges associated with model construction from scratch and reducing the dependency on extensive datasets during the training phase, thus facilitating rapid implementation. However, challenges persist in handling domain-specific tasks, leading to the development of the Retrieval-Augmented Generation (RAG) framework. RAG enhances LLMs by integrating external data retrieval, enriching their contextual understanding, and expanding their knowledge base beyond pre-existing training data. To illustrate RAG's efficacy, we introduce the Political EE system, specifically tailored to extract political event information from news articles. Understanding these political insights is essential for remaining informed about the latest political advancements, whether on a national or global scale.

IR · Jan 6, 2025
Sustainable Digitalization of Business with Multi-Agent RAG and LLM

Muhammad Arslan, Saba Munawar, Christophe Cruz

Businesses heavily rely on data sourced from various channels like news articles, financial reports, and consumer reviews to drive their operations, enabling informed decision-making and identifying opportunities. However, traditional manual methods for data extraction are often time-consuming and resource-intensive, prompting the adoption of digital transformation initiatives to enhance efficiency. Yet, concerns persist regarding the sustainability of such initiatives and their alignment with the United Nations (UN)'s Sustainable Development Goals (SDGs). This research aims to explore the integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) as a sustainable solution for Information Extraction (IE) and processing. The research methodology involves reviewing existing solutions for business decision-making, noting that many systems require training new machine learning models, which are resource-intensive and have significant environmental impacts. Instead, we propose a sustainable business solution using pre-existing LLMs that can work with diverse datasets. We link domain-specific datasets to tailor LLMs to company needs and employ a Multi-Agent architecture to divide tasks such as information retrieval, enrichment, and classification among specialized agents. This approach optimizes the extraction process and improves overall efficiency. Through the utilization of these technologies, businesses can optimize resource utilization, improve decision-making processes, and contribute to sustainable development goals, thereby fostering environmental responsibility within the corporate sector.

CL · Feb 7, 2025
Enhancing Knowledge Graph Construction: Evaluating with Emphasis on Hallucination, Omission, and Graph Similarity Metrics

Hussam Ghanem, Christophe Cruz

Recent advancements in large language models have demonstrated significant potential in the automated construction of knowledge graphs from unstructured text. This paper builds upon our previous work [16], which evaluated various models using metrics like precision, recall, F1 score, triple matching, and graph matching, and introduces a refined approach to address the critical issues of hallucination and omission. We propose an enhanced evaluation framework incorporating BERTScore for graph similarity, setting a practical threshold of 95% for graph matching. Our experiments focus on the Mistral model, comparing its original and fine-tuned versions in zero-shot and few-shot settings. We further extend our experiments using examples from the KELM-sub training dataset, illustrating that the fine-tuned model significantly improves knowledge graph construction accuracy while reducing the exact hallucination and omission. However, our findings also reveal that the fine-tuned models perform worse in generalization tasks on the KELM-sub dataset. This study underscores the importance of comprehensive evaluation metrics in advancing the state-of-the-art in knowledge graph construction from textual data.
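Triple-level precision, recall, hallucination, and omission can be illustrated with exact set matching. This is a simplified sketch under stated assumptions, not the paper's framework: a hallucination is taken to be a generated triple absent from the reference, an omission a reference triple absent from the output, and the BERTScore-based graph similarity with its 95% threshold is not reproduced. All names and data are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def evaluate_triples(
    predicted: Iterable[Triple], reference: Iterable[Triple]
) -> Dict[str, object]:
    """Compare generated triples with reference triples by exact match."""
    pred, ref = set(predicted), set(reference)
    matched = pred & ref
    precision = len(matched) / len(pred) if pred else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "hallucinated": sorted(pred - ref),  # generated but not in the reference
        "omitted": sorted(ref - pred),       # in the reference but not generated
    }


# Hypothetical toy example.
reference = [("Paris", "capital_of", "France"), ("France", "continent", "Europe")]
predicted = [("Paris", "capital_of", "France"), ("Paris", "population", "2M")]

scores = evaluate_triples(predicted, reference)
print(scores["precision"], scores["recall"], scores["f1"])  # 0.5 0.5 0.5
```

Exact matching penalizes paraphrased but correct triples, which is the kind of limitation a similarity-based score with a threshold is meant to soften.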

AI · Dec 2, 2014
Semantic HMC for Big Data Analysis

Thomas Hassan, Rafael Peixoto, Christophe Cruz et al.

Analyzing Big Data can help corporations to improve their efficiency. In this work we present a new vision for deriving value from Big Data using a Semantic Hierarchical Multi-label Classification, called Semantic HMC, based on an unsupervised ontology learning process. We also propose a Semantic HMC process that uses scalable machine-learning techniques and rule-based reasoning.

AI · Jan 21, 2013
From 9-IM Topological Operators to Qualitative Spatial Relations using 3D Selective Nef Complexes and Logic Rules for bodies

Helmi Ben Hmida, Christophe Cruz, Frank Boochs et al.

This paper presents a method to compute topological relations automatically using SWRL rules. The computation is based on a Selective Nef Complex (Nef polyhedron) structure generated from a standard polyhedron. The Selective Nef Complex is a data model providing a set of binary Boolean operators, such as Union, Difference, Intersection, and Symmetric difference, and unary operators, such as Interior, Closure, and Boundary. In this work, these operators are used to compute topological relations between objects, as defined by the constraints of the 9-Intersection Model (9-IM) from Egenhofer. With the help of these constraints, we define a procedure to compute the topological relations on Nef polyhedra. These topological relations are Disjoint, Meets, Contains, Inside, Covers, CoveredBy, Equals, and Overlaps. They are defined in a top-level ontology with specific semantic properties such as Transitive, Symmetric, Asymmetric, Functional, Reflexive, and Irreflexive. The computed topological relations are stored in an OWL-DL ontology, which then allows inference over these new relations between objects. In addition, logic rules based on the Semantic Web Rule Language allow the definition of logic programs that specify which topological relations have to be computed on which kinds of objects with specific attributes. For instance, a "Building" that overlaps a "Railway" is a "RailStation".
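The mapping from 9-IM constraints to the eight named relations can be sketched as a lookup table. The matrices below are the standard 9-intersection patterns for two simple regions and are an assumption here, since the paper's own constraint set may differ. In the paper each entry would come from Nef polyhedron operators; in this sketch the matrices are simply given. The names `RELATIONS`, `classify`, and `converse` are hypothetical.

```python
from typing import Dict, Optional, Tuple

Row = Tuple[int, int, int]
Matrix = Tuple[Row, Row, Row]

# Rows: interior, boundary, exterior of A.
# Columns: interior, boundary, exterior of B.
# 1 = the intersection is non-empty, 0 = it is empty.
RELATIONS: Dict[Matrix, str] = {
    ((0, 0, 1), (0, 0, 1), (1, 1, 1)): "Disjoint",
    ((0, 0, 1), (0, 1, 1), (1, 1, 1)): "Meets",
    ((1, 1, 1), (0, 0, 1), (0, 0, 1)): "Contains",
    ((1, 0, 0), (1, 0, 0), (1, 1, 1)): "Inside",
    ((1, 1, 1), (0, 1, 1), (0, 0, 1)): "Covers",
    ((1, 0, 0), (1, 1, 0), (1, 1, 1)): "CoveredBy",
    ((1, 0, 0), (0, 1, 0), (0, 0, 1)): "Equals",
    ((1, 1, 1), (1, 1, 1), (1, 1, 1)): "Overlaps",
}


def classify(matrix: Matrix) -> Optional[str]:
    """Return the named relation for a 9-IM matrix, or None if it matches none."""
    return RELATIONS.get(matrix)


def converse(matrix: Matrix) -> Matrix:
    """Swap the roles of A and B by transposing the matrix."""
    return tuple(zip(*matrix))  # type: ignore[return-value]


a_contains_b: Matrix = ((1, 1, 1), (0, 0, 1), (0, 0, 1))
print(classify(a_contains_b))            # Contains
print(classify(converse(a_contains_b)))  # Inside
```

Transposing the matrix gives the relation seen from B's side, so Contains/Inside and Covers/CoveredBy are converse pairs, while Disjoint, Meets, Equals, and Overlaps are symmetric.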

CG · Jan 21, 2013
Integration of knowledge to support automatic object reconstruction from images and 3D data

Frank Boochs, Andreas Marbs, Hung Truong et al.

Object reconstruction is an important task in many fields of application, as it makes it possible to generate digital representations of our physical world that serve as a basis for analysis, planning, construction, visualization, or other aims. A reconstruction is normally based on reliable data (images or 3D point clouds, for example) capturing the object in its complete extent. This data then has to be compiled and analyzed in order to extract all necessary geometrical elements, which represent the object and form a digital copy of it. Traditional strategies are largely based on manual interaction and interpretation, because with increasing object complexity human understanding is indispensable for achieving acceptable and reliable results. But human interaction is time-consuming and expensive, which is why much research has already been invested in algorithmic support that speeds up the process and reduces the manual workload. At present, most such supporting algorithms are data-driven and concentrate on specific features of the objects that are accessible to numerical models. By means of these models, which normally represent geometrical features (flatness or roughness, for example) or physical features (color, texture), the data is classified and analyzed. This is successful for objects of low complexity, but reaches its limits as object complexity increases. Purely numerical strategies are then unable to model reality sufficiently. Therefore, the intention of our approach is to take human cognitive strategy as an example and to simulate extraction processes based on available human-defined knowledge of the objects of interest. Such processes introduce a semantic structure for the objects and guide the algorithms used to detect and recognize them, which yields higher effectiveness. Hence, our research proposes an approach that uses knowledge to guide the algorithms in 3D point cloud and image processing.

AI · Jan 21, 2013
Knowledge Base Approach for 3D Objects Detection in Point Clouds Using 3D Processing and Specialists Knowledge

Helmi Ben Hmida, Christophe Cruz, Frank Boochs et al.

This paper presents a knowledge-based approach to object detection that uses the OWL ontology language, the Semantic Web Rule Language (SWRL), and 3D processing built-ins to combine geometrical analysis of 3D point clouds with specialists' knowledge. Here, we share our experience of creating a 3D semantic facility model from unorganized 3D point clouds. The knowledge is expressed in OWL and used to define SWRL detection rules. In addition, combining 3D processing built-ins and topological built-ins in SWRL rules allows more flexible and intelligent detection and annotation of the objects contained in 3D point clouds. The WiDOP prototype takes a set of 3D point clouds as input and produces as output a populated ontology corresponding to an indexed scene visualized in VRML. The context of the study is the detection of railway objects in the Deutsche Bahn scene, such as signals, technical cupboards, and electric poles. The resulting enriched and populated ontology, which contains the annotations of objects in the point clouds, is used to feed a GIS or an IFC file for architectural purposes.

CG · Jan 21, 2013
Toward the Automatic Generation of a Semantic VRML Model from Unorganized 3D Point Clouds

Helmi Ben Hmida, Christophe Cruz, Christophe Nicolle et al.

This paper presents our experience of creating a 3D semantic facility model from unorganized 3D point clouds. A knowledge-based approach to object detection using the OWL ontology language is presented, and this knowledge is used to define SWRL detection rules. In addition, 3D processing built-ins and topological built-ins are combined in SWRL rules in order to join geometrical analysis of 3D point clouds with specialists' knowledge. This combination allows more flexible and intelligent detection and annotation of the objects contained in 3D point clouds. The WiDOP prototype takes a set of 3D point clouds as input and produces as output an indexed scene of colored objects visualized in VRML. The context of the study is the detection of railway objects in the Deutsche Bahn scene, such as signals, technical cupboards, and electric poles. The resulting enriched and populated domain ontology, which contains the annotations of objects in the point clouds, is used to feed a GIS.

CG · Jan 21, 2013
From 3D Point Clouds To Semantic Objects: An Ontology-Based Detection Approach

Helmi Ben Hmida, Christophe Cruz, Frank Boochs et al.

This paper presents a knowledge-based approach to object detection that uses the OWL ontology language, the Semantic Web Rule Language, and 3D processing built-ins to combine geometrical analysis of 3D point clouds with specialists' knowledge. This combination allows the detection and annotation of objects contained in point clouds. The context of the study is the detection of railway objects such as signals, technical cupboards, and electric poles. The resulting enriched and populated ontology, which contains the annotations of objects in the point clouds, is used to feed a GIS or an IFC file for architectural purposes.

IR · Jan 21, 2013
Ontology-based Recommender System of Economic Articles

David Werner, Christophe Cruz, Christophe Nicolle

Decision makers need economic information to drive their decisions. The company Actualis SARL specializes in the production and distribution of a press review about French regional economic actors. For a client, this economic review is a prospecting tool covering partners and competitors. To reduce the overload of useless information, the company is moving towards a customized review for each customer. Three issues arise in achieving this goal. First, how can elements in the text be identified in order to extract objects that match the recommendation criteria? Second, how should the structure of these objects, relationships, and articles be defined in order to provide a source of knowledge that the extraction process can use to produce new knowledge from articles? Third, how can feedback on customer experience be used to assess the quality of distributed information in real time and to improve the relevance of the recommendations? This paper presents a new type of recommendation based on the semantic description of both articles and user profiles.

CG · Jan 21, 2013
From Quantitative Spatial Operator to Qualitative Spatial Relation Using Constructive Solid Geometry, Logic Rules and Optimized 9-IM Model, A Semantic Based Approach

Helmi Ben Hmida, Christophe Cruz, Frank Boochs et al.

Constructive Solid Geometry (CSG) is a data model providing a set of binary Boolean operators such as Union, Difference, and Intersection. In this work, these operators are used to compute topological relations between objects, as defined by the constraints of the 9-Intersection Model (9-IM) from Egenhofer. With the help of these constraints, we define a procedure to compute the topological relations on CSG objects. These topological relations are Disjoint, Contains, Inside, Covers, CoveredBy, Equals, and Overlaps, and are defined in a top-level ontology with specific semantic properties such as Transitive, Symmetric, Asymmetric, Functional, Reflexive, and Irreflexive. The computed topological relations are stored in the ontology, which then allows inference over them. In addition, logic rules based on the Semantic Web Rule Language allow the definition of logic programs that specify which topological relations have to be computed on which kinds of objects. For instance, a "Building" that overlaps a "Railway" is a "RailStation".
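The closing example reads as a rule of the form Building(?b) ∧ Railway(?r) ∧ overlaps(?b, ?r) → RailStation(?b). The sketch below mimics that single rule in plain Python rather than in an actual rule engine or ontology; the function name, the dictionary-based fact store, and the individuals `b1`, `b2`, `r1` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str, str]  # (subject, topological relation, object)


def apply_rail_station_rule(
    types: Dict[str, Set[str]], relations: List[Relation]
) -> Dict[str, Set[str]]:
    """Add the class RailStation to every Building that overlaps a Railway."""
    inferred = {name: set(classes) for name, classes in types.items()}
    for subject, relation, obj in relations:
        if (
            relation == "Overlaps"
            and "Building" in types.get(subject, set())
            and "Railway" in types.get(obj, set())
        ):
            inferred.setdefault(subject, set()).add("RailStation")
    return inferred


# Hypothetical facts: two buildings, one railway, and precomputed relations.
types = {"b1": {"Building"}, "b2": {"Building"}, "r1": {"Railway"}}
relations = [("b1", "Overlaps", "r1"), ("b2", "Disjoint", "r1")]

result = apply_rail_station_rule(types, relations)
print(sorted(result["b1"]))  # ['Building', 'RailStation']
print(sorted(result["b2"]))  # ['Building']
```

The rule only fires on relations that have already been computed, which is why the abstract has the rules also decide which relations to compute for which kinds of objects.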

SE · Aug 8, 2012
Guidelines for a Dynamic Ontology - Integrating Tools of Evolution and Versioning in Ontology

Perrine Pittet, Christophe Nicolle, Christophe Cruz

Ontologies are built on systems that evolve conceptually over time. In addition, the techniques and languages for building ontologies evolve too. This has led to numerous studies in the field of ontology versioning and ontology evolution. This paper presents a new way to manage the lifecycle of an ontology that incorporates both versioning tools and an evolution process. This solution, called VersionGraph, is integrated into the source ontology from its creation so that the ontology can evolve and be versioned. Change management is strongly related to the model in which the ontology is represented. Therefore, we focus on the OWL language in order to take into account the impact of changes on the logical consistency of the ontology, as specified in OWL DL.