LG | AI | LO | Feb 19, 2023

Greedy Discovery of Ordinal Factors

arXiv:2302.11554v1 | 4 citations | h-index: 53
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of navigating complex tagged datasets, but it is incremental in that it builds on existing methods from formal concept analysis.

The paper tackles the problem of discovering and analyzing structure in large tagged datasets. It proposes a greedy algorithm for ordinal factor analysis, which arranges tags in linear orders to represent the dataset and support relationship discovery, and evaluates the approach through case studies.

In large datasets, it is hard to discover and analyze structure. It is thus common to introduce tags or keywords for the items. In applications, such datasets are then filtered based on these tags. Still, even medium-sized datasets with only a few tags result in systems that are complex and hard for humans to navigate. In this work, we adopt the method of ordinal factor analysis to address this problem. An ordinal factor arranges a subset of the tags in a linear order based on their underlying structure. A complete ordinal factorization, which consists of such ordinal factors, precisely represents the original dataset. Based on such an ordinal factorization, we provide a way to discover and explain relationships between different items and attributes in the dataset. However, computing even a single ordinal factor of high cardinality is computationally expensive. We thus propose a greedy algorithm in this work. This algorithm extracts ordinal factors using existing fast algorithms developed in formal concept analysis. We then leverage the resulting factors to propose a comprehensive way to discover relationships in the dataset. Furthermore, we introduce a distance measure, based on the representation that emerges from the ordinal factorization, to discover similar items. To evaluate the method, we conduct a case study on different datasets.
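To make the idea of an ordinal factorization more concrete, the following is a minimal Python sketch under simplifying assumptions. It is not the algorithm from the paper, which relies on fast formal concept analysis algorithms that are not reproduced here. In this sketch, a factor is simply a chain of tags, and an item's position in a factor is the length of the longest chain prefix whose tags the item carries in full. The names `position`, `greedy_factor`, `greedy_factorization`, `represent`, and `distance`, the toy dataset, and the choice of a Manhattan distance on position vectors are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Set, Tuple

Context = Dict[str, Set[str]]  # item -> set of tags (toy stand-in for a formal context)


def position(tags: Set[str], chain: List[str]) -> int:
    """Length of the longest prefix of `chain` that is fully contained in `tags`."""
    pos = 0
    for tag in chain:
        if tag not in tags:
            break
        pos += 1
    return pos


def greedy_factor(context: Context, uncovered: Set[Tuple[str, str]]) -> List[str]:
    """Grow one chain of tags, always appending the tag that covers
    the most still-uncovered (item, tag) pairs. Simplified heuristic."""
    chain: List[str] = []
    alive = set(context)  # items that carry every tag in the chain so far
    all_tags = sorted(set().union(*context.values()))
    while True:
        best_tag, best_gain = None, 0
        for tag in all_tags:
            if tag in chain:
                continue
            gain = sum(1 for item in alive if (item, tag) in uncovered)
            if gain > best_gain:
                best_tag, best_gain = tag, gain
        if best_tag is None:  # no tag adds any new coverage
            return chain
        chain.append(best_tag)
        alive = {item for item in alive if best_tag in context[item]}


def greedy_factorization(context: Context) -> List[List[str]]:
    """Extract chains until every (item, tag) pair is covered by some chain."""
    uncovered = {(item, tag) for item, tags in context.items() for tag in tags}
    factors: List[List[str]] = []
    while uncovered:
        chain = greedy_factor(context, uncovered)
        for item, tags in context.items():
            for tag in chain[: position(tags, chain)]:
                uncovered.discard((item, tag))
        factors.append(chain)
    return factors


def represent(context: Context, factors: List[List[str]]) -> Dict[str, List[int]]:
    """Map each item to its vector of positions, one entry per factor."""
    return {
        item: [position(tags, chain) for chain in factors]
        for item, tags in context.items()
    }


def distance(a: List[int], b: List[int]) -> int:
    """Manhattan distance between position vectors (an assumed choice)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy dataset (hypothetical): items a-d with tags w-z.
context: Context = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"x"},
    "d": {"w"},
}
factors = greedy_factorization(context)
rep = represent(context, factors)
print(factors, rep, distance(rep["a"], rep["b"]))
```

Because every item's tag set can be recovered from its positions in the chains, the factorization in this sketch is complete in the sense described above: the chains together represent the original dataset exactly. Items that are close in the position vectors share long prefixes of the same chains, which is the intuition behind using such a representation to find similar items.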

