IR · Jul 17, 2020

Augmented Understanding and Automated Adaptation of Curation Rules

arXiv:2007.08710v1 1.6

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of data curation for analysts and decision-makers, but it appears incremental as it builds on existing methods for automation.

The dissertation tackles the problem of error-prone and tedious data curation tasks by proposing automated techniques and systems to augment analysts, resulting in improved efficiency and reduced manual effort.

Over the past years, there have been many efforts to curate raw data and increase its added value. Data curation has been defined as the activities and processes an analyst undertakes to transform raw data into contextualized data and knowledge. It enables decision-makers and data analysts to extract value and derive insight from raw data. However, to curate raw data, an analyst needs to carry out various curation tasks, including extraction, linking, classification, and indexing, which are error-prone, tedious, and challenging. Moreover, deriving insight requires analysts to spend long periods of time scanning and analyzing curation environments. This problem is exacerbated when the curation environment is large and the analyst needs to curate a varied and comprehensive list of data. To address these challenges, in this dissertation we present techniques, algorithms, and systems for augmenting analysts in curation tasks. We (1) propose a feature-based, automated technique for curating raw data; (2) propose an autonomic approach for adapting data curation rules; (3) provide a solution that augments users in formulating their preferences while curating data in large-scale information spaces; and (4) implement a set of APIs for automating basic curation tasks, including named entity extraction, part-of-speech (POS) tagging, and classification.
