Franz J. Király

ML
16papers
570citations
Novelty46%
AI Score25

16 Papers

LGAug 19, 2014
The Algebraic Combinatorial Approach for Low-Rank Matrix Completion

Franz J. Király, Louis Theran, Ryota Tomioka

We present a novel algebraic combinatorial view on low-rank matrix completion based on studying relations between a few entries with tools from algebraic geometry and matroid theory. The intrinsic locality of the approach allows for the treatment of single entries in a closed theoretical and practical framework. More specifically, apart from introducing an algebraic combinatorial theory of low-rank matrix completion, we present probability-one algorithms to decide whether a particular entry of the matrix can be completed. We also describe methods to complete that entry from a few others, and to estimate the error which is incurred by any method completing that entry. Furthermore, we show how known results on matrix completion and their sampling assumptions can be related to our new perspective and interpreted in terms of a completability phase transition.

MLNov 30, 2012
Approximate Rank-Detecting Factorization of Low-Rank Tensors

Franz J. Király, Andreas Ziehe

We present an algorithm, AROFAC2, which detects the (CP-)rank of a degree 3 tensor and calculates its factorization into rank-one components. We provide generative conditions for the algorithm to work and demonstrate on both synthetic and real world data that AROFAC2 is a potentially outperforming alternative to the gold standard PARAFAC over which it has the advantages that it can intrinsically detect the true rank, avoids spurious components, and is stable with respect to outliers and non-Gaussian noise.

SEJan 13, 2021
Designing Machine Learning Toolboxes: Concepts, Principles and Patterns

Franz J. Király, Markus Löning, Anthony Blaom et al.

Machine learning (ML) and AI toolboxes such as scikit-learn or Weka are workhorses of contemporary data scientific practice -- their central role being enabled by usable yet powerful designs that allow to easily specify, train and validate complex modeling pipelines. However, despite their universal success, the key design principles in their construction have never been fully analyzed. In this paper, we attempt to provide an overview of key patterns in the design of AI modeling toolboxes, taking inspiration, in equal parts, from the field of software engineering, implementation patterns found in contemporary toolboxes, and our own experience from developing ML toolboxes. In particular, we develop a conceptual model for the AI/ML domain, with a new type system, called scientific types, at its core. Scientific types capture the scientific meaning of common elements in ML workflows based on the set of operations that we usually perform with them (i.e. their interface) and their statistical properties. From our conceptual analysis, we derive a set of design principles and patterns. We illustrate that our analysis can not only explain the design of existing toolboxes, but also guide the development of new ones. We intend our contribution to be a state-of-art reference for future toolbox engineers, a summary of best practices, a collection of ML design patterns which may become useful for future research, and, potentially, the first steps towards a higher-level programming paradigm for constructing AI.

COAug 18, 2020
mlr3proba: An R Package for Machine Learning in Survival Analysis

Raphael Sonabend, Franz J. Király, Andreas Bender et al.

As machine learning has become increasingly popular over the last few decades, so too has the number of machine learning interfaces for implementing these models. Whilst many R libraries exist for machine learning, very few offer extended support for survival analysis. This is problematic considering its importance in fields like medicine, bioinformatics, economics, engineering, and more. mlr3proba provides a comprehensive machine learning interface for survival analysis and connects with mlr3's general model tuning and benchmarking facilities to provide a systematic infrastructure for survival modeling and evaluation.

MLApr 18, 2020
Kernels for time series with irregularly-spaced multivariate observations

Ahmed Guecioueur, Franz J. Király

Time series are an interesting frontier for kernel-based methods, for the simple reason that there is no kernel designed to represent them and their unique characteristics in full generality. Existing sequential kernels ignore the time indices, with many assuming that the series must be regularly-spaced; some such kernels are not even psd. In this manuscript, we show that a "series kernel" that is general enough to represent irregularly-spaced multivariate time series may be built out of well-known "vector kernels". We also show that all series kernels constructed using our methodology are psd, and are thus widely applicable. We demonstrate this point by formulating a Gaussian process-based strategy - with our series kernel at its heart - to make predictions about test series when given a training set. We validate the strategy experimentally by estimating its generalisation error on multiple datasets and comparing it to relevant baselines. We also demonstrate that our series kernel may be used for the more traditional setting of time series classification, where its performance is broadly in line with alternative methods.

LGSep 17, 2019
sktime: A Unified Interface for Machine Learning with Time Series

Markus Löning, Anthony Bagnall, Sajaysurya Ganesh et al.

We present sktime -- a new scikit-learn compatible Python library with a unified interface for machine learning with time series. Time series data gives rise to various distinct but closely related learning tasks, such as forecasting and time series classification, many of which can be solved by reducing them to related simpler tasks. We discuss the main rationale for creating a unified interface, including reduction, as well as the design of sktime's core API, supported by a clear overview of common time series tasks and reduction approaches.

LGJan 11, 2019
Machine Learning Automation Toolbox (MLaut)

Viktor Kazakov, Franz J. Király

In this paper we present MLaut (Machine Learning AUtomation Toolbox) for the python data science ecosystem. MLaut automates large-scale evaluation and benchmarking of machine learning algorithms on a large number of datasets. MLaut provides a high-level workflow interface to machine algorithm algorithms, implements a local back-end to a database of dataset collections, trained algorithms, and experimental results, and provides easy-to-use interfaces to the scikit-learn and keras modelling libraries. Experiments are easy to set up with default settings in a few lines of code, while remaining fully customizable to the level of hyper-parameter tuning, pipeline composition, or deep learning architecture. As a principal test case for MLaut, we conducted a large-scale supervised classification study in order to benchmark the performance of a number of machine learning algorithms - to our knowledge also the first larger-scale study on standard supervised learning data sets to include deep learning algorithms. While corroborating a number of previous findings in literature, we found (within the limitations of our study) that deep neural networks do not perform well on basic supervised learning, i.e., outside the more specialized, image-, audio-, or text-based tasks.

APJul 4, 2018
Modeling outcomes of soccer matches

Alkeos Tsokos, Santhosh Narayanan, Ioannis Kosmidis et al.

We compare various extensions of the Bradley-Terry model and a hierarchical Poisson log-linear model in terms of their performance in predicting the outcome of soccer matches (win, draw, or loss). The parameters of the Bradley-Terry extensions are estimated by maximizing the log-likelihood, or an appropriately penalized version of it, while the posterior densities of the parameters of the hierarchical Poisson log-linear model are approximated using integrated nested Laplace approximations. The prediction performance of the various modeling approaches is assessed using a novel, context-specific framework for temporal validation that is found to deliver accurate estimates of the test error. The direct modeling of outcomes via the various Bradley-Terry extensions and the modeling of match scores using the hierarchical Poisson log-linear model demonstrate similar behavior in terms of predictive performance.

MLJan 2, 2018
Probabilistic supervised learning

Frithjof Gressmann, Franz J. Király, Bilal Mateen et al.

Predictive modelling and supervised learning are central to modern data science. With predictions from an ever-expanding number of supervised black-box strategies - e.g., kernel methods, random forests, deep learning aka neural networks - being employed as a basis for decision making processes, it is crucial to understand the statistical uncertainty associated with these predictions. As a general means to approach the issue, we present an overarching framework for black-box prediction strategies that not only predict the target but also their own predictions' uncertainty. Moreover, the framework allows for fair assessment and comparison of disparate prediction strategies. For this, we formally consider strategies capable of predicting full distributions from feature variables, so-called probabilistic supervised learning strategies. Our work draws from prior work including Bayesian statistics, information theory, and modern supervised machine learning, and in a novel synthesis leads to (a) new theoretical insights such as a probabilistic bias-variance decomposition and an entropic formulation of prediction, as well as to (b) new algorithms and meta-algorithms, such as composite prediction strategies, probabilistic boosting and bagging, and a probabilistic predictive independence test. Our black-box formulation also leads (c) to a new modular interface view on probabilistic supervised learning and a modelling workflow API design, which we have implemented in the newly released skpro machine learning toolbox, extending the familiar modelling interface and meta-modelling functionality of sklearn. The skpro package provides interfaces for construction, composition, and tuning of probabilistic supervised learning strategies, together with orchestration features for validation and comparison of any such strategy - be it frequentist, Bayesian, or other.

MLJan 27, 2017
Modelling Competitive Sports: Bradley-Terry-Élő Models for Supervised and On-Line Learning of Paired Competition Outcomes

Franz J. Király, Zhaozhi Qian

Prediction and modelling of competitive sports outcomes has received much recent attention, especially from the Bayesian statistics and machine learning communities. In the real world setting of outcome prediction, the seminal Élő update still remains, after more than 50 years, a valuable baseline which is difficult to improve upon, though in its original form it is a heuristic and not a proper statistical "model". Mathematically, the Élő rating system is very closely related to the Bradley-Terry models, which are usually used in an explanatory fashion rather than in a predictive supervised or on-line learning setting. Exploiting this close link between these two model classes and some newly observed similarities, we propose a new supervised learning framework with close similarities to logistic regression, low-rank matrix completion and neural networks. Building on it, we formulate a class of structured log-odds models, unifying the desirable properties found in the above: supervised probabilistic prediction of scores and wins/draws/losses, batch/epoch and on-line learning, as well as the possibility to incorporate features in the prediction, without having to sacrifice simplicity, parsimony of the Bradley-Terry models, or computational efficiency of Élő's original approach. We validate the structured log-odds modelling approach in synthetic experiments and English Premier League outcomes, where the added expressivity yields the best predictions reported in the state-of-art, close to the quality of contemporary betting odds.

APMay 5, 2015
Prediction and Quantification of Individual Athletic Performance

Duncan A. J. Blythe, Franz J. Király

We provide scientific foundations for athletic performance prediction on an individual level, exposing the phenomenology of individual athletic running performance in the form of a low-rank model dominated by an individual power law. We present, evaluate, and compare a selection of methods for prediction of individual running performance, including our own, \emph{local matrix completion} (LMC), which we show to perform best. We also show that many documented phenomena in quantitative sports science, such as the form of scoring tables, the success of existing prediction methods including Riegel's formula, the Purdy points scheme, the power law for world records performances and the broken power law for world record speeds may be explained on the basis of our findings in a unified way.

MLNov 28, 2014
Learning with Algebraic Invariances, and the Invariant Kernel Trick

Franz J. Király, Andreas Ziehe, Klaus-Robert Müller

When solving data analysis problems it is important to integrate prior knowledge and/or structural invariances. This paper contributes by a novel framework for incorporating algebraic invariance structure into kernels. In particular, we show that algebraic properties such as sign symmetries in data, phase independence, scaling etc. can be included easily by essentially performing the kernel trick twice. We demonstrate the usefulness of our theory in simulations on selected applications such as sign-invariant spectral clustering and underdetermined ICA.

MLFeb 1, 2014
Dual-to-kernel learning with ideals

Franz J. Király, Martin Kreuzer, Louis Theran

In this paper, we propose a theory which unifies kernel learning and symbolic algebraic methods. We show that both worlds are inherently dual to each other, and we use this duality to combine the structure-awareness of algebraic methods with the efficiency and generality of kernels. The main idea lies in relating polynomial rings to feature space, and ideals to manifolds, then exploiting this generative-discriminative duality on kernel matrices. We illustrate this by proposing two algorithms, IPCA and AVICA, for simultaneous manifold and feature learning, and test their accuracy on synthetic and real world data.

MLSep 12, 2013
Efficient Orthogonal Tensor Decomposition, with an Application to Latent Variable Model Learning

Franz J. Király

Decomposing tensors into orthogonal factors is a well-known task in statistics, machine learning, and signal processing. We study orthogonal outer product decompositions where the factors in the summands in the decomposition are required to be orthogonal across summands, by relating this orthogonal decomposition to the singular value decompositions of the flattenings. We show that it is a non-trivial assumption for a tensor to have such an orthogonal decomposition, and we show that it is unique (up to natural symmetries) in case it exists, in which case we also demonstrate how it can be efficiently and reliably obtained by a sequence of singular value decompositions. We demonstrate how the factoring algorithm can be applied for parameter identification in latent variable and mixture models.

MLFeb 21, 2013
Obtaining error-minimizing estimates and universal entry-wise error bounds for low-rank matrix completion

Franz J. Király, Louis Theran

We propose a general framework for reconstructing and denoising single entries of incomplete and noisy entries. We describe: effective algorithms for deciding if and entry can be reconstructed and, if so, for reconstructing and denoising it; and a priori bounds on the error of each entry, individually. In the noiseless case our algorithm is exact. For rank-one matrices, the new algorithm is fast, admits a highly-parallel implementation, and produces an error minimizing estimate that is qualitatively close to our theoretical and the state-of-the-are Nuclear Norm and OptSpace methods.

LGFeb 12, 2013
Coherence and sufficient sampling densities for reconstruction in compressed sensing

Franz J. Király, Louis Theran

We give a new, very general, formulation of the compressed sensing problem in terms of coordinate projections of an analytic variety, and derive sufficient sampling rates for signal reconstruction. Our bounds are linear in the coherence of the signal space, a geometric parameter independent of the specific signal and measurement, and logarithmic in the ambient dimension where the signal is presented. We exemplify our approach by deriving sufficient sampling densities for low-rank matrix completion and distance matrix completion which are independent of the true matrix.