LG | Jun 20, 2022
Autoencoder-based Attribute Noise Handling Method for Medical Data
Thomas Ranvier, Haytham Elgazel, Emmanuel Coquery et al.
Medical datasets are particularly subject to attribute noise, that is, missing and erroneous values. Attribute noise is known to be largely detrimental to learning performance. To maximize future learning performance, it is essential to deal with attribute noise before any inference. We propose a simple autoencoder-based preprocessing method that can correct mixed-type tabular data corrupted by attribute noise. No other method currently exists to handle attribute noise in tabular data. We experimentally demonstrate that our method outperforms both state-of-the-art imputation methods and noise correction methods on several real-world medical datasets.
28.5 | DB | Mar 20
Condensed Representation for Snapshot-Based RDF Graphs
Jey Puget Gil, Emmanuel Coquery, John Samuel et al.
Evolving phenomena, often complex, can be represented using knowledge graphs, which can model heterogeneous data from multiple sources. A considerable number of sources delivering periodic updates to knowledge graphs in various domains are now openly available. The evolution of data is of interest to knowledge graph management systems, so it is crucial to organize these constantly evolving data to make them easily accessible and exploitable for analysis. In this article, we present and formalize the condensed representation of these evolving graphs and propose a new solution called QuaQue that allows querying across multiple versions of graphs. We also present the results of our benchmark comparing our solution against existing approaches.
19.4 | DB | Mar 19
QuaQue: Design and SQL Implementation of Condensed Algebra for Concurrent Versioning of Knowledge Graphs
Jey Puget Gil, Emmanuel Coquery, John Samuel et al.
The management of versioned knowledge graphs presents significant challenges, particularly in querying data across multiple versions efficiently. This paper introduces QuaQue, a key component of the ConVer-G system, which addresses this challenge by translating SPARQL (SPARQL Protocol and RDF Query Language) queries into SQL (Structured Query Language). QuaQue leverages a novel condensed algebra to operate on a relational model where versioning information is compactly stored using bitstrings. This approach allows for efficient querying of concurrent versions of knowledge graphs within a standard relational database system. We present the key concepts of our condensed algebra, detail the translation process from SPARQL algebra to SQL, and provide a comparative benchmark against a native RDF (Resource Description Framework) triple store, demonstrating the viability and performance benefits of our approach.
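The idea of storing versioning information compactly as bitstrings in a relational model can be illustrated with a minimal sketch. The code below is not QuaQue's actual schema or algebra; the table name `triple`, the column `versions`, and the helper functions are hypothetical names chosen for illustration. It assumes each triple is stored once, with an integer bitmask whose bit i is set when the triple is present in version i, and that a join of two triple patterns is valid in the intersection (bitwise AND) of their version masks.

```python
import sqlite3


def build_store(versions):
    """Load a list of versions (each a set of (s, p, o) triples) into SQLite.

    Hypothetical schema: one row per distinct triple, plus an integer bitmask
    whose bit i is set when the triple occurs in version i. SQLite integers
    are 64-bit signed, so this sketch only supports up to 63 versions.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triple ("
        " s TEXT, p TEXT, o TEXT, versions INTEGER,"
        " PRIMARY KEY (s, p, o))"
    )
    masks = {}
    for index, triples in enumerate(versions):
        for t in triples:
            masks[t] = masks.get(t, 0) | (1 << index)
    conn.executemany(
        "INSERT INTO triple VALUES (?, ?, ?, ?)",
        [(s, p, o, m) for (s, p, o), m in masks.items()],
    )
    return conn


def triples_in_version(conn, index):
    """Return the triples present in a single version, sorted."""
    rows = conn.execute(
        "SELECT s, p, o FROM triple"
        " WHERE (versions & ?) != 0 ORDER BY s, p, o",
        (1 << index,),
    )
    return rows.fetchall()


def versions_of(conn, s, p, o):
    """Return the list of version indices in which a triple occurs."""
    row = conn.execute(
        "SELECT versions FROM triple WHERE s = ? AND p = ? AND o = ?",
        (s, p, o),
    ).fetchone()
    mask = row[0] if row else 0
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def two_hop(conn, predicate):
    """Join two triple patterns (?a p ?b) and (?b p ?c) across all versions.

    The result carries the bitwise AND of both masks, i.e. the versions in
    which both triples hold; rows valid in no version are dropped.
    """
    rows = conn.execute(
        "SELECT t1.s, t2.o, t1.versions & t2.versions AS v"
        " FROM triple t1 JOIN triple t2 ON t1.o = t2.s"
        " WHERE t1.p = ? AND t2.p = ?"
        " AND (t1.versions & t2.versions) != 0"
        " ORDER BY t1.s, t2.o",
        (predicate, predicate),
    )
    return rows.fetchall()
```

With this layout, a single SQL query answers a pattern over every version at once, instead of running the same query once per snapshot. Whether this matches the operators of the condensed algebra described in the paper is an assumption.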