LGCOMLMay 16, 2024

EKM: An exact, polynomial-time algorithm for the $K$-medoids problem

arXiv:2405.12237v12 citationsh-index: 2
Originality Highly original
AI Analysis

This solves a long-standing challenge in data analysis by providing the first provably polynomial-time exact algorithm for the widely used K-medoids problem.

The paper tackles the K-medoids clustering problem by presenting EKM, an exact algorithm with worst-case O(N^{K+1}) time complexity, which outperforms approximate methods and exponential-time solvers in experiments on real-world and synthetic datasets.

The $K$-medoids problem is a challenging combinatorial clustering task, widely used in data analysis applications. While numerous algorithms have been proposed to solve this problem, none of these are able to obtain an exact (globally optimal) solution for the problem in polynomial time. In this paper, we present EKM: a novel algorithm for solving this problem exactly with worst-case $O\left(N^{K+1}\right)$ time complexity. EKM is developed according to recent advances in transformational programming and combinatorial generation, using formal program derivation steps. The derived algorithm is provably correct by construction. We demonstrate the effectiveness of our algorithm by comparing it against various approximate methods on numerous real-world datasets. We show that the wall-clock run time of our algorithm matches the worst-case time complexity analysis on synthetic datasets, clearly outperforming the exponential time complexity of benchmark branch-and-bound based MIP solvers. To our knowledge, this is the first, rigorously-proven polynomial time, practical algorithm for this ubiquitous problem.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes