LGJun 20, 2023

SeFNet: Bridging Tabular Datasets with Semantic Feature Nets

arXiv:2306.11636v16 citationsh-index: 35
Originality Incremental advance
AI Analysis

This addresses the issue of limited contextual information in tabular datasets for machine learning practitioners, enabling better meta-learning and knowledge transfer, though it is incremental as it builds on existing ontology-based methods.

The paper tackles the problem of tabular datasets being treated as standalone tasks by proposing Semantic Feature Net (SeFNet) to capture semantic meanings of features using ontologies like SNOMED-CT, enabling a Dataset Ontology-based Semantic Similarity (DOSS) measure to share insights across predictive tasks in healthcare.

Machine learning applications cover a wide range of predictive tasks in which tabular datasets play a significant role. However, although they often address similar problems, tabular datasets are typically treated as standalone tasks. The possibilities of using previously solved problems are limited due to the lack of structured contextual information about their features and the lack of understanding of the relations between them. To overcome this limitation, we propose a new approach called Semantic Feature Net (SeFNet), capturing the semantic meaning of the analyzed tabular features. By leveraging existing ontologies and domain knowledge, SeFNet opens up new opportunities for sharing insights between diverse predictive tasks. One such opportunity is the Dataset Ontology-based Semantic Similarity (DOSS) measure, which quantifies the similarity between datasets using relations across their features. In this paper, we present an example of SeFNet prepared for a collection of predictive tasks in healthcare, with the features' relations derived from the SNOMED-CT ontology. The proposed SeFNet framework and the accompanying DOSS measure address the issue of limited contextual information in tabular datasets. By incorporating domain knowledge and establishing semantic relations between features, we enhance the potential for meta-learning and enable valuable insights to be shared across different predictive tasks.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes