AI | Jul 28, 2021

Tab2Know: Building a Knowledge Base from Tables in Scientific Papers

arXiv:2107.13306v1 | 10.111 citations | h-index: 22 | Has Code

Originality: Incremental advance

AI Analysis

Tab2Know addresses the challenge of building a knowledge base from tables in scientific papers, which is useful to researchers and practitioners, but the contribution is incremental because it builds on existing methods for table interpretation.

The authors tackle the problem of automatically extracting knowledge from tables in scientific papers with Tab2Know, an end-to-end system that combines weakly supervised classifiers with logic-based reasoning to interpret tables and disambiguate entities. An empirical evaluation on a corpus of Computer Science papers shows satisfactory performance.

Tables in scientific papers contain a wealth of valuable knowledge for the scientific enterprise. To help the many of us who frequently consult this type of knowledge, we present Tab2Know, a new end-to-end system to build a Knowledge Base (KB) from tables in scientific papers. Tab2Know addresses the challenge of automatically interpreting the tables in papers and of disambiguating the entities that they contain. To solve these problems, we propose a pipeline that employs both statistical classifiers and logic-based reasoning. First, our pipeline applies weakly supervised classifiers to recognize the types of tables and columns, with the help of a data labeling system and an ontology specifically designed for our purpose. Then, logic-based reasoning is used to link equivalent entities (via sameAs links) in different tables. An empirical evaluation of our approach on a corpus of papers in the Computer Science domain showed satisfactory performance. This suggests that our approach is a promising step toward creating a large-scale KB of scientific knowledge.
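
The second stage, linking equivalent entities across tables with sameAs links, can be illustrated with a minimal sketch. The abstract does not give the actual rules Tab2Know uses, so everything here is an assumption made for illustration: the `Entity` fields, the column-type labels, the `normalize` helper, and the single matching rule (same column type and same normalized label). A union-find structure takes the transitive closure of whatever pairwise matches the rules produce.

```python
from __future__ import annotations

from collections import defaultdict
from typing import NamedTuple


class Entity(NamedTuple):
    table_id: str     # which table the cell came from
    column_type: str  # hypothetical output of a column-type classifier
    label: str        # raw cell text


def normalize(label: str) -> str:
    """Lowercase and drop non-alphanumerics so 'F1-Score' matches 'f1 score'."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def same_as_clusters(entities: list[Entity]) -> list[set[Entity]]:
    """Cluster entities under one illustrative rule (not the paper's rules):
    two entities are linked if they share a column type and a normalized label."""
    parent = {e: e for e in entities}

    def find(e: Entity) -> Entity:
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    def union(a: Entity, b: Entity) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Apply the rule: link each entity to the first one seen with the same key.
    first_seen: dict[tuple[str, str], Entity] = {}
    for e in entities:
        key = (e.column_type, normalize(e.label))
        if key in first_seen:
            union(first_seen[key], e)
        else:
            first_seen[key] = e

    # Collect clusters and keep only those that contain an actual link.
    clusters: dict[Entity, set[Entity]] = defaultdict(set)
    for e in entities:
        clusters[find(e)].add(e)
    return [members for members in clusters.values() if len(members) > 1]
```

With a single rule, a plain group-by on the key would give the same result. Union-find is used so that further hypothetical rules can call `union` on additional pairs and the clusters remain transitively closed.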
