Javier D. Fernández

h-index: 24

Papers: 3

Citations: 90

Novelty: 62%

AI Score: 30

Ranked #137,328 of 194,257 authors (top 71%); #24,663 in CL (top 80%)

3 Papers

7.2 · CR · Jan 26, 2020

The SPECIAL-K Personal Data Processing Transparency and Compliance Platform

Sabrina Kirrane, Javier D. Fernández, Piero Bonatti et al.

The European General Data Protection Regulation (GDPR) brings new challenges for companies, which must ensure they have an appropriate legal basis for processing personal data and must provide transparency with respect to personal data processing and sharing within and between organisations. Additionally, when it comes to consent as a legal basis, companies need to ensure that they comply with usage constraints specified by data subjects. This paper presents the policy language and supporting ontologies and vocabularies, developed within the SPECIAL EU H2020 project, which can be used to represent data usage policies and data processing and sharing events. We introduce a concrete transparency and compliance architecture, referred to as SPECIAL-K, that can be used to automatically verify that data processing and sharing complies with the data subjects' consent. Our evaluation, based on a new compliance benchmark, shows the efficiency and scalability of the system with an increasing number of events and users.

4.9 · CL · Aug 19, 2019 · Code

Message Passing for Complex Question Answering over Knowledge Graphs

Svitlana Vakulenko, Javier David Fernandez Garcia, Axel Polleres et al.

Question answering over knowledge graphs (KGQA) has evolved from simple single-fact questions to complex questions that require graph traversal and aggregation. We propose a novel approach for complex KGQA that uses unsupervised message passing, which propagates confidence scores obtained by parsing an input question and matching terms in the knowledge graph to a set of possible answers. First, we identify entity, relationship, and class names mentioned in a natural language question, and map these to their counterparts in the graph. Then, the confidence scores of these mappings propagate through the graph structure to locate the answer entities. Finally, these are aggregated depending on the identified question type. This approach can be efficiently implemented as a series of sparse matrix multiplications mimicking joins over small local subgraphs. Our evaluation results show that the proposed approach outperforms the state of the art on the LC-QuAD benchmark. Moreover, we show that the performance of the approach depends only on the quality of the question interpretation results, i.e., given a correct relevance score distribution, our approach always produces a correct answer ranking. Our error analysis reveals correct answers missing from the benchmark dataset and inconsistencies in the DBpedia knowledge graph. Finally, we provide a comprehensive evaluation of the proposed approach accompanied by an ablation study and an error analysis, which showcase the pitfalls for each of the question answering components in more detail.
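The propagation step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy triples, the confidence values, and the names `build_adjacency` and `propagate` are all assumptions made for illustration, and the sparse matrices are emulated with nested dictionaries rather than a linear algebra library. It shows a single hop in which each matched entity sends its confidence, weighted by the confidence of the matched predicate, to its neighbours.

```python
from collections import defaultdict

# Toy knowledge graph; all entity and predicate names are hypothetical.
TRIPLES = [
    ("Vienna", "capitalOf", "Austria"),
    ("Graz", "locatedIn", "Austria"),
    ("Vienna", "locatedIn", "Austria"),
    ("Paris", "capitalOf", "France"),
]


def build_adjacency(triples):
    """Store one sparse adjacency matrix per predicate as
    {predicate: {subject: {object, ...}}}."""
    adj = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        adj[p][s].add(o)
    return adj


def propagate(adj, entity_scores, predicate_scores):
    """One hop of message passing along outgoing edges.

    Each matched subject passes on (entity confidence * predicate confidence).
    This equals a sparse vector-matrix product per predicate, summed over
    all matched predicates.
    """
    out = defaultdict(float)
    for p, p_conf in predicate_scores.items():
        for s, s_conf in entity_scores.items():
            for o in adj.get(p, {}).get(s, ()):
                out[o] += s_conf * p_conf
    return dict(out)


# Assumed output of the question interpretation step for a question such as
# "Which country is Vienna the capital of?" (confidence values are made up).
entity_scores = {"Vienna": 0.9, "Graz": 0.1}
predicate_scores = {"capitalOf": 0.8, "locatedIn": 0.2}

adj = build_adjacency(TRIPLES)
answer_scores = propagate(adj, entity_scores, predicate_scores)
ranking = sorted(answer_scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

In this toy setting only "Austria" receives any score, because "Paris" was not matched in the question. A multi-hop question would repeat `propagate` with the previous hop's output as the new `entity_scores`, and a final step would aggregate the ranked entities according to the question type.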

1.2 · DB · Oct 18, 2013

Compressed Vertical Partitioning for Full-In-Memory RDF Management

Sandra Álvarez-García, Nieves R. Brisaboa, Javier D. Fernández et al.

The Web of Data has been gaining momentum, leading to the publication of more and more semi-structured datasets following the RDF model, which is based on atomic triple units of subject, predicate, and object. Although it is a simple model, compression methods become necessary because datasets are growing ever larger and various scalability issues arise around their organization and storage. This requirement is more restrictive in RDF stores because efficient SPARQL resolution on the compressed RDF datasets is also required. This article introduces a novel RDF indexing technique (called k2-triples) supporting efficient SPARQL resolution in compressed space. k2-triples uses the predicate to vertically partition the dataset into disjoint subsets of pairs (subject, object), one per predicate. These subsets are represented as binary matrices in which 1-bits mean that the corresponding triple exists in the dataset. This model results in very sparse matrices, which are efficiently compressed using k2-trees. We enhance this model with two compact indexes listing the predicates related to each different subject and object, in order to address the specific weaknesses of vertically partitioned representations. The resulting technique not only achieves by far the most compressed representations, but also the best overall performance for RDF retrieval in our experiments. Our approach uses up to 10 times less space than a state-of-the-art baseline, and outperforms it by several orders of magnitude on the most basic query patterns. In addition, we optimize traditional join algorithms on k2-triples and define a novel one leveraging its specific features. Our experimental results show that our technique outperforms traditional vertical partitioning for join resolution, reporting the best numbers for joins in which the non-joined nodes are provided, and being competitive in the majority of the cases.
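The data layout described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the k2-tree compression is left out entirely, each binary matrix is stored as a plain set of (subject, object) cells, terms are kept as strings instead of dictionary-encoded integers, and the names `vertical_partition` and `match` are hypothetical. The sketch shows the vertical partitioning by predicate and how the two extra indexes (subject to predicates, object to predicates) narrow down which matrices need to be inspected when the predicate is unbound.

```python
from collections import defaultdict


def vertical_partition(triples):
    """Split triples by predicate and build the two auxiliary indexes.

    matrices[p] holds the (subject, object) cells that would be 1-bits in
    the binary matrix of predicate p (uncompressed in this sketch).
    """
    matrices = defaultdict(set)
    sp = defaultdict(set)  # subject -> predicates it appears with
    op = defaultdict(set)  # object  -> predicates it appears with
    for s, p, o in triples:
        matrices[p].add((s, o))
        sp[s].add(p)
        op[o].add(p)
    return matrices, sp, op


def match(matrices, sp, op, s=None, p=None, o=None):
    """Resolve a triple pattern; None marks an unbound variable."""
    if p is not None:
        candidates = {p} if p in matrices else set()
    elif s is not None and o is not None:
        candidates = sp.get(s, set()) & op.get(o, set())
    elif s is not None:
        candidates = sp.get(s, set())  # only matrices that mention s
    elif o is not None:
        candidates = op.get(o, set())  # only matrices that mention o
    else:
        candidates = set(matrices)
    results = []
    for pred in candidates:
        for subj, obj in matrices[pred]:
            if (s is None or subj == s) and (o is None or obj == o):
                results.append((subj, pred, obj))
    return sorted(results)


# Toy dataset with made-up terms.
triples = [("a", "knows", "b"), ("a", "likes", "c"), ("b", "knows", "c")]
matrices, sp, op = vertical_partition(triples)

by_subject = match(matrices, sp, op, s="a")
by_object = match(matrices, sp, op, o="c")
by_predicate = match(matrices, sp, op, p="knows")
```

Without the `sp` and `op` indexes, a pattern with an unbound predicate would have to scan every predicate matrix, which is the weakness of plain vertical partitioning that the abstract mentions.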