CL Feb 17, 2023
Complex QA and language models hybrid architectures, Survey
Xavier Daull, Patrice Bellot, Emmanuel Bruno et al.
This paper reviews the state of the art in large language model (LLM) architectures and strategies for "complex" question-answering, with a focus on hybrid architectures. LLM-based chatbot services have allowed anyone to grasp the potential of LLMs to solve many common problems, but users soon discovered their limitations for complex questions. Addressing more specific, complex questions (e.g., "What is the best mix of power-generation methods to reduce climate change?") often requires specialized architectures, domain knowledge, new skills, decomposition and multi-step resolution, deep reasoning, sensitive data protection, explainability, and human-in-the-loop processes. Therefore, we review: (1) necessary skills and tasks for handling complex questions and common LLM limits to overcome; (2) datasets, cost functions, and evaluation metrics for measuring and improving performance (e.g., accuracy, explainability, fairness, robustness, groundedness, faithfulness, toxicity); (3) families of solutions to overcome LLM limitations by (a) training and reinforcement, (b) hybridization, (c) prompting, (d) agentic architectures (agents, tools) and extended reasoning.
AI May 3, 2022
GRAPHYP: A Scientific Knowledge Graph with Manifold Subnetworks of Communities. Detection of Scholarly Disputes in Adversarial Information Routes
Renaud Fabre, Otmane Azeroual, Patrice Bellot et al.
The cognitive manifold of published content is currently expanding in all areas of science. However, Scientific Knowledge Graphs (SKGs) only provide poor pictures of the adversarial directions and scientific controversies that feed the production of knowledge. In this article, we tackle the understanding of the design of the information space of a cognitive representation of research activities, and of related bottlenecks that affect search interfaces, in the mapping of structured objects into graphs. We propose, with SKG GRAPHYP, a novel graph-designed geometric architecture which optimizes both the detection of the knowledge manifold of "cognitive communities" and the representation of alternative paths to adversarial answers to a research question, for instance in the context of academic disputes. With a methodology for designing "Manifold Subnetworks of Cognitive Communities", GRAPHYP provides a classification of distinct search paths in a research field. Users are detected from the variety of their search practices and classified into "Cognitive communities" from the analysis of the search history of their logs of scientific documentation. The manifold of practices is expressed from metrics of differentiated uses by triplets of nodes shaped into symmetrical graph subnetworks, with the following three parameters: Mass, Intensity, and Variety.
AI Dec 1, 2025
A Flexible Multi-Agent LLM-Human Framework for Fast Human Validated Tool Building
Daull Xavier, Patrice Bellot, Emmanuel Bruno et al.
We introduce CollabToolBuilder, a flexible multi-agent LLM framework with expert-in-the-loop (HITL) guidance that iteratively learns to create tools for a target goal, aligning with human intent and process while minimizing the time spent on task/domain adaptation and human feedback capture. The architecture generates and validates tools via four specialized agents (Coach, Coder, Critic, Capitalizer) using a reinforced dynamic prompt and systematic human feedback integration to reinforce each agent's role toward goals and constraints. This work is best viewed as a system-level integration and methodology combining multi-agent in-context learning, HITL controls, and reusable tool capitalization for complex iterative problems such as scientific document generation. We illustrate it with preliminary experiments (e.g., generating state-of-the-art research papers or patents given an abstract) and discuss its applicability to other iterative problem-solving tasks.
AI Jul 26, 2025
Matching Game Preferences Through Dialogical Large Language Models: A Perspective
Renaud Fabre, Daniel Egret, Patrice Bellot
This perspective paper explores the future potential of "conversational intelligence" by examining how Large Language Models (LLMs) could be combined with GRAPHYP's network system to better understand human conversations and preferences. Using recent research and case studies, we propose a conceptual framework that could make AI reasoning transparent and traceable, allowing humans to see and understand how AI reaches its conclusions. We present the conceptual perspective of "Matching Game Preferences through Dialogical Large Language Models (D-LLMs)," a proposed system that would allow multiple users to share their different preferences through structured conversations. This approach envisions personalizing LLMs by embedding individual user preferences directly into how the model makes decisions. The proposed D-LLM framework would require three main components: (1) reasoning processes that could analyze different search experiences and guide performance, (2) classification systems that would identify user preference patterns, and (3) dialogue approaches that could help humans resolve conflicting information. This perspective framework aims to create an interpretable AI system where users could examine, understand, and combine the different human preferences that influence AI responses, detected through GRAPHYP's search experience networks. The goal of this perspective is to envision AI systems that would not only provide answers but also show users how those answers were reached, making artificial intelligence more transparent and trustworthy for human decision-making.
IR Jan 22, 2020
Emotion and Sentiment Lexicon Impact on Sentiment Analysis Applied to Book Reviews
Patrice Bellot, Lerch Soëlie, Bruno Emmanuel et al.
Consumers are used to consulting reviews posted on the Internet before buying a product. However, the sheer number of reviews makes it difficult to gauge the overall opinion. Sentiment analysis makes it possible to detect the polarity (positive, negative, neutral) of an expressed opinion and thereby classify those reviews. Our purpose is to determine the influence of emotions on the polarity of book reviews. We define "bag-of-words" representation models of reviews that use a lexicon containing emotional (anticipation, sadness, fear, anger, joy, surprise, trust, disgust) and sentimental (positive, negative) words. This lexicon makes it possible to measure the types of emotion felt by readers. The supervised learning method used is a Random Forest. The application concerns reviews from the Amazon platform. Keywords: sentiment analysis, emotion analysis (text), sentiment polarity classification
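As a rough illustration of the lexicon-based bag-of-words representation described in this abstract, the sketch below counts, for one review, how many tokens fall into each emotion and sentiment category. The toy lexicon, the tokenizer, and the function name `lexicon_features` are hypothetical and are not taken from the paper; the resulting vector is the kind of input a Random Forest classifier could be trained on.

```python
import re
from collections import Counter

# The eight emotions and two sentiments named in the abstract.
CATEGORIES = [
    "anticipation", "sadness", "fear", "anger", "joy",
    "surprise", "trust", "disgust", "positive", "negative",
]

# Hypothetical toy lexicon: word -> set of categories (not the paper's lexicon).
TOY_LEXICON = {
    "love": {"joy", "positive"},
    "wonderful": {"joy", "positive"},
    "surprise": {"surprise"},
    "boring": {"negative"},
    "afraid": {"fear", "negative"},
}

def lexicon_features(review, lexicon=TOY_LEXICON):
    """Return one count per category, in the order given by CATEGORIES."""
    tokens = re.findall(r"[a-z']+", review.lower())
    counts = Counter()
    for token in tokens:
        for category in lexicon.get(token, ()):
            counts[category] += 1
    return [counts[category] for category in CATEGORIES]

features = lexicon_features("I love this book, a wonderful surprise")
print(dict(zip(CATEGORIES, features)))
```

Each review becomes a fixed-length vector regardless of its length, which is what makes it usable as input to a standard supervised classifier.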
IR May 23, 2018
A logical representation of Arabic questions toward automatic passage extraction from the Web
Patrice Bellot, Wided Bakari, Mahmoud Neji
With the rapid growth of Arabic electronic data on the web, information extraction, which is one of the major challenges of question-answering, is essentially used for building corpora of documents. Indeed, corpus building is a research topic currently addressed alongside other major conference themes in Natural Language Processing (NLP), such as Information Retrieval (IR), Question-Answering (QA), and Automatic Summarization (AS). Generally, a question-answering system provides various passages to answer the user's questions. To make these passages truly informative, the system needs access to an underlying knowledge base, which requires the construction of a corpus. The aim of our research is to build an Arabic question-answering system. Analyzing the question must be the first step. Next, it is essential to retrieve a passage from the web that can serve as an appropriate answer. In this paper, we propose a method to analyze the question and retrieve the answer passage in the Arabic language. For the question analysis, five factual question types are processed. Additionally, our purpose is to experiment with the generation of a logic representation from the declarative form of each question. Several studies dealing with logic approaches in question-answering for languages other than Arabic are discussed. This representation is very promising because it helps us later in the selection of a justifiable answer. The accuracy of questions correctly analyzed and translated into the logic form reached 64%. The automatically generated text passages achieved 87% accuracy and a c@1 score of 98%.
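For readers unfamiliar with the c@1 measure mentioned in this abstract, the sketch below uses its usual definition, c@1 = (n_R + n_U · n_R / n) / n, where n_R is the number of correctly answered questions, n_U the number of questions left unanswered, and n the total number of questions. The function name and the example counts are hypothetical and are not the paper's data.

```python
def c_at_1(n_correct, n_unanswered, n_total):
    """c@1: accuracy that gives partial credit for leaving questions unanswered.

    Each unanswered question is credited with the accuracy observed overall,
    so abstaining scores better than answering wrongly.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct < 0 or n_unanswered < 0 or n_correct + n_unanswered > n_total:
        raise ValueError("inconsistent counts")
    accuracy = n_correct / n_total
    return (n_correct + n_unanswered * accuracy) / n_total

# Hypothetical counts: 6 correct, 2 unanswered, 2 wrong out of 10 questions.
print(c_at_1(6, 2, 10))
```

When no question is left unanswered, c@1 reduces to plain accuracy, which is why the two scores can differ only for systems that abstain.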
CL Sep 27, 2017
A Preliminary Study for Building an Arabic Corpus of Pair Questions-Texts from the Web: AQA-Webcorp
Wided Bakari, Patrice Bellot, Mahmoud Neji
With the development of electronic media and the heterogeneity of Arabic data on the Web, the idea of building a clean corpus for certain applications of natural language processing, including machine translation, information retrieval, and question answering, has become more and more pressing. In this manuscript, we seek to create and develop our own corpus of question-text pairs. This corpus will then provide a better basis for our experimentation step. We therefore model its construction with a method for Arabic that retrieves texts from the web that could prove to be answers to our factual questions. To do this, we developed a Java script that extracts a list of HTML pages from a given query. We then clean these pages to obtain a database of texts and a corpus of question-text pairs. In addition, we give preliminary results of our proposed method. Some investigations into the construction of Arabic corpora are also presented in this document.
CL Oct 10, 2016
Supervised Term Weighting Metrics for Sentiment Analysis in Short Text
Hussam Hamdan, Patrice Bellot, Frederic Bechet
Term weighting metrics assign weights to terms in order to discriminate the important terms from the less crucial ones. Due to this characteristic, these metrics have attracted growing attention in text classification and, recently, in sentiment analysis. Using the weights given by such metrics could lead to a more accurate document representation, which may improve classification performance. While previous studies have focused on proposing or comparing different weighting metrics for two-class document-level sentiment analysis, this study proposes to analyse the results given by each metric in order to find out the characteristics of good and bad weighting metrics. We therefore present an empirical study of fifteen global supervised weighting metrics combined with four local weighting metrics adopted from information retrieval. We also analyse the behavior of each metric by observing how it distributes the terms, and deduce some characteristics that may distinguish good metrics from bad ones. The evaluation has been done using a Support Vector Machine on three different datasets: Twitter, restaurant reviews, and laptop reviews.
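To make the local/global distinction in this abstract concrete, the sketch below multiplies a local weight (raw term frequency in the document) by a global supervised weight computed from class labels. The global weight used here, a smoothed log-ratio of per-class document frequencies, is only an illustrative assumption and is not claimed to be one of the fifteen metrics studied in the paper; the function names and the tiny corpus are likewise hypothetical.

```python
import math
from collections import Counter

def global_weights(docs, labels):
    """Supervised global weight per term from a labeled two-class corpus.

    Assumed metric: log2 of the ratio between the smoothed document
    frequency rate in the positive class and in the negative class.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = df_pos if y == 1 else df_neg
        target.update(set(tokens))  # document frequency: count each term once
    vocab = set(df_pos) | set(df_neg)
    return {
        t: math.log2(((df_pos[t] + 1) / (n_pos + 1)) / ((df_neg[t] + 1) / (n_neg + 1)))
        for t in vocab
    }

def weigh(tokens, g):
    """Local weight (term frequency) times global weight; unseen terms get 0."""
    tf = Counter(tokens)
    return {t: tf[t] * g.get(t, 0.0) for t in tf}

docs = [["good", "food"], ["bad", "food"]]
labels = [1, 0]  # 1 = positive, 0 = negative
g = global_weights(docs, labels)
print(weigh(["good", "good", "food"], g))
```

Terms that occur equally often in both classes, such as "food" here, receive a weight of zero, while class-specific terms receive weights whose sign indicates the class they favor.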
CL Mar 4, 2016
Sentiment Analysis in Scholarly Book Reviews
Hussam Hamdan, Patrice Bellot, Frederic Bechet
So far, different studies have tackled sentiment analysis in several domains such as restaurant and movie reviews. However, this problem has not been studied in scholarly book reviews, which differ in terms of review style and size. In this paper, we propose to combine different features and present them to supervised classifiers that extract the opinion target expressions and detect their polarities in scholarly book reviews. We construct a labeled corpus of French book reviews for training and evaluating our methods. We also evaluate them on English restaurant reviews in order to measure their robustness across domains and languages. The evaluation shows that our methods are robust enough for both English restaurant reviews and French book reviews.