LG ML Dec 27, 2019

Efficient Data Analytics on Augmented Similarity Triplets

arXiv:1912.12064v3
8 citations
Originality: Incremental advance
AI Analysis

This work addresses data analysis challenges for scenarios where distance information is provided as triplets, offering incremental improvements in efficiency and scalability.

The paper tackles the problem of data analysis using similarity triplets by proposing triplets augmentation to infer hidden information and scalable algorithms that avoid kernel evaluations, demonstrating improved performance and robustness to noise.

Data analysis requires a pairwise proximity measure over objects. Recent work has extended this to situations where the distance information between objects is given as the results of comparing distances among three objects (triplets). Humans find such comparison tasks much easier than exact distance computation, and this kind of data can easily be obtained in large quantities via crowd-sourcing. In this work, we propose triplets augmentation, an efficient method to extend the triplets data by inferring the hidden implicit information from the existing data. Triplets augmentation improves the quality of kernel-based and kernel-free data analytics. We also propose a novel set of algorithms for common data analysis tasks based on triplets. These methods work directly with triplets and avoid kernel evaluations, and are thus scalable to big data. We demonstrate that our methods outperform the current best-known techniques and are robust to noisy data.
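
To make the idea of inferring implicit information concrete, here is a minimal sketch of one plausible form of augmentation. It rests on assumptions that the abstract does not state. A triplet `(a, b, c)` is taken to mean "a is closer to b than to c", and new triplets are inferred by transitivity for each anchor `a`. The function name `augment_triplets` and the data layout are hypothetical, and this is not claimed to be the authors' algorithm.

```python
from collections import defaultdict


def augment_triplets(triplets):
    """Return the input triplets plus those implied by transitivity.

    Assumption (not from the source): a triplet (a, b, c) encodes
    d(a, b) < d(a, c). For a fixed anchor a, (a, b, c) and (a, c, e)
    together imply (a, b, e).
    """
    # One directed graph per anchor: an edge b -> c means
    # "b is closer to the anchor than c".
    closer = defaultdict(lambda: defaultdict(set))
    for a, b, c in triplets:
        closer[a][b].add(c)

    augmented = set()
    for a, graph in closer.items():
        for start in list(graph):
            # Depth-first search collects every object reachable from
            # `start`, i.e. every object known to be farther from `a`.
            stack = list(graph[start])
            seen = set()
            while stack:
                node = stack.pop()
                # Skipping `start` guards against cycles in noisy data.
                if node in seen or node == start:
                    continue
                seen.add(node)
                stack.extend(graph.get(node, ()))
            augmented.update((a, start, far) for far in seen)
    return augmented


example = [("a", "b", "c"), ("a", "c", "d")]
result = augment_triplets(example)
```

In this example the two input triplets share the anchor `"a"`, so the sketch additionally infers `("a", "b", "d")`. Contradictory (cyclic) triplets are only guarded against, not resolved. A noise-robust method would need an explicit conflict-handling step.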

