LGNov 3, 2024
Explainable Clustering Beyond Worst-Case GuaranteesMaximilian Fleissner, Maedeh Zarvandi, Debarghya Ghoshdastidar
We study the explainable clustering problem first posed by Moshkovitz, Dasgupta, Rashtchian, and Frost (ICML 2020). The goal of explainable clustering is to fit an axis-aligned decision tree with $K$ leaves and minimal clustering cost (where every leaf is a cluster). The fundamental theoretical question in this line of work is the \textit{price of explainability}, defined as the ratio between the clustering cost of the tree and the optimal cost. Numerous papers have provided worst-case guarantees on this quantity. For $K$-medians, it has recently been shown that the worst-case price of explainability is $Θ(\log K)$. While this settles the matter from a data-agnostic point of view, two important questions remain unanswered: Are tighter guarantees possible for well-clustered data? And can we trust decision trees to recover underlying cluster structures? In this paper, we place ourselves in a statistical setting of mixture models to answer both questions. We prove that better guarantees are indeed feasible for well-clustered data. Our algorithm takes as input a mixture model and constructs a tree in data-independent time. We then extend our analysis to kernel clustering, deriving new guarantees that significantly improve over existing worst-case bounds.
LGSep 29, 2025
Interpretable Kernel Representation Learning at Scale: A Unified Framework Utilizing Nyström ApproximationMaedeh Zarvandi, Michael Timothy, Theresa Wasserer et al.
Kernel methods provide a theoretically grounded framework for non-linear and non-parametric learning, with strong analytic foundations and statistical guarantees. Yet, their scalability has long been limited by prohibitive time and memory costs. While progress has been made in scaling kernel regression, no framework exists for scalable kernel-based representation learning, restricting their use in the era of foundation models where representations are learned from massive unlabeled data. We introduce KREPES -- a unified, scalable framework for kernel-based representation learning via Nyström approximation. KREPES accommodates a wide range of unsupervised and self-supervised losses, and experiments on large image and tabular datasets demonstrate its efficiency. Crucially, KREPES enables principled interpretability of the learned representations, an immediate benefit over deep models, which we substantiate through dedicated analysis.
CVJul 6, 2021
Stateless actor-critic for instance segmentation with high-level priorsPaul Hilt, Maedeh Zarvandi, Edgar Kaziakhmedov et al.
Instance segmentation is an important computer vision problem which remains challenging despite impressive recent advances due to deep learning-based methods. Given sufficient training data, fully supervised methods can yield excellent performance, but annotation of ground-truth data remains a major bottleneck, especially for biomedical applications where it has to be performed by domain experts. The amount of labels required can be drastically reduced by using rules derived from prior knowledge to guide the segmentation. However, these rules are in general not differentiable and thus cannot be used with existing methods. Here, we relax this requirement by using stateless actor critic reinforcement learning, which enables non-differentiable rewards. We formulate the instance segmentation problem as graph partitioning and the actor critic predicts the edge weights driven by the rewards, which are based on the conformity of segmented instances to high-level priors on object shape, position or size. The experiments on toy and real datasets demonstrate that we can achieve excellent performance without any direct supervision based only on a rich set of priors.