François Denis

LG
3papers
28citations
Novelty40%
AI Score39

3 Papers

59.6CLMar 25Code
Chronological Knowledge Retrieval: A Retrieval-Augmented Generation Approach to Construction Project Documentation

Ioannis-Aris Kostis, Natalia Sanchiz, Steeve De Schryver et al.

In large-scale construction projects, the continuous evolution of decisions generates extensive records, most often captured in meeting minutes. Since decisions may override previous ones, professionals often need to reconstruct the history of specific choices. Retrieving such information manually from raw archives is both labor-intensive and error-prone. From a user perspective, we address this challenge by enabling conversational access to the whole set of project meeting minutes. Professionals can pose natural-language questions and receive answers that are both semantically relevant and explicitly time-annotated, allowing them to follow the chronology of decisions. From a technical perspective, our solution employs a Retrieval-Augmented Generation (RAG) framework that integrates semantic search with large language models to ensure accurate and context-aware responses. We demonstrate the approach using an anonymized, industry-sourced dataset of meeting minutes from a completed construction project by a large company in Belgium. The dataset is annotated and enriched with expert-defined queries to support systematic evaluation. Both the dataset and the open-source implementation are made available to the community to foster further research on conversational access to time-annotated project documentation.

LGMar 17, 2014
Learning Negative Mixture Models by Tensor Decompositions

Guillaume Rabusseau, François Denis

This work considers the problem of estimating the parameters of negative mixture models, i.e. mixture models that possibly involve negative weights. The contributions of this paper are as follows. (i) We show that every rational probability distributions on strings, a representation which occurs naturally in spectral learning, can be computed by a negative mixture of at most two probabilistic automata (or HMMs). (ii) We propose a method to estimate the parameters of negative mixture models having a specific tensor structure in their low order observable moments. Building upon a recent paper on tensor decompositions for learning latent variable models, we extend this work to the broader setting of tensors having a symmetric decomposition with positive and negative weights. We introduce a generalization of the tensor power method for complex valued tensors, and establish theoretical convergence guarantees. (iii) We show how our approach applies to negative Gaussian mixture models, for which we provide some experiments.

LGDec 21, 2013
Dimension-free Concentration Bounds on Hankel Matrices for Spectral Learning

François Denis, Mattias Gybels, Amaury Habrard

Learning probabilistic models over strings is an important issue for many applications. Spectral methods propose elegant solutions to the problem of inferring weighted automata from finite samples of variable-length strings drawn from an unknown target distribution. These methods rely on a singular value decomposition of a matrix $H_S$, called the Hankel matrix, that records the frequencies of (some of) the observed strings. The accuracy of the learned distribution depends both on the quantity of information embedded in $H_S$ and on the distance between $H_S$ and its mean $H_r$. Existing concentration bounds seem to indicate that the concentration over $H_r$ gets looser with the size of $H_r$, suggesting to make a trade-off between the quantity of used information and the size of $H_r$. We propose new dimension-free concentration bounds for several variants of Hankel matrices. Experiments demonstrate that these bounds are tight and that they significantly improve existing bounds. These results suggest that the concentration rate of the Hankel matrix around its mean does not constitute an argument for limiting its size.