CL, CY
Mar 14, 2019

Interactive Concept Mining on Personal Data -- Bootstrapping Semantic Services

arXiv:1903.05872v1
5 citations
Originality: Incremental advance
AI Analysis

The paper addresses underperformance and clutter in semantic desktops for users managing personal information, but the advance is incremental because it builds on existing extraction tools.

The paper tackles the cold start problem in semantic services by proposing an interactive concept mining approach that extracts higher-level concepts from personal data, using user feedback to filter candidates and demonstrating a prototype.

Semantic services (e.g. Semantic Desktops) are still afflicted by a cold start problem: in the beginning, the user's personal information sphere, i.e. files, mails, bookmarks, etc., is not represented by the system. Information extraction tools used to kick-start the system typically create 1:1 representations of the different information items. Higher-level concepts, for example those found in file names, mail subjects or the content body of these items, are not extracted. Leaving these concepts out may lead to underperformance, while having too many of them (e.g. by making every found term a concept) will clutter the resulting knowledge graph with unhelpful relations. In this paper, we present an interactive concept mining approach that proposes concept candidates gathered by exploiting the given schemata of common personal information management applications and by analysing the personal information sphere using various metrics. To heed the subjective view of the user, a graphical user interface allows the user to easily rank and give feedback on proposed concept candidates, thus keeping only those actually considered relevant. A prototypical implementation demonstrates major steps of our approach.
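The propose-then-filter loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name its metrics or schemata, so the sketch assumes a single stand-in metric (the number of distinct items a term occurs in) and replaces the graphical feedback step with a plain set of rejected terms. The function names `tokenize`, `propose_candidates` and `apply_feedback`, the `min_items` threshold, and the sample items are all hypothetical.

```python
import re
from collections import Counter


def tokenize(name):
    """Split a file name or mail subject into lowercase word tokens."""
    # Break camelCase ("ProjectAlpha" -> "Project Alpha"), then split on
    # every run of non-letter characters and drop very short tokens.
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[^A-Za-z]+", spaced) if len(t) > 2]


def propose_candidates(items, min_items=2):
    """Rank terms by how many distinct items mention them (assumed metric)."""
    item_freq = Counter()
    for item in items:
        # Use a set so a term counts at most once per item.
        item_freq.update(set(tokenize(item)))
    ranked = [(term, n) for term, n in item_freq.items() if n >= min_items]
    # Most widespread terms first; ties broken alphabetically.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


def apply_feedback(candidates, rejected):
    """Keep only the candidates the user did not reject."""
    return [term for term, _ in candidates if term not in rejected]


# Hypothetical personal information items (file names and a mail subject).
items = [
    "ProjectAlpha_budget_2019.xlsx",
    "Re: Project Alpha kickoff",
    "alpha-budget-review.pdf",
    "holiday_photos.zip",
]

candidates = propose_candidates(items)
# The user rejects "project" as too generic to be a useful concept.
concepts = apply_feedback(candidates, rejected={"project"})
print(candidates)
print(concepts)
```

The `min_items` threshold reflects the trade-off named in the abstract: terms that appear in only one item (here the file extensions and words such as "kickoff") are never proposed, which limits clutter, while the feedback step lets the user discard candidates that are frequent but not personally relevant.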

