Franz Kiraly

LG
5 papers
129 citations
Novelty 32%
AI Score 22

5 Papers

LG · Jul 23, 2020 · Code
MLJ: A Julia package for composable machine learning

Anthony D. Blaom, Franz Kiraly, Thibaut Lienart et al.

MLJ (Machine Learning in Julia) is an open-source software package providing a common interface for interacting with machine learning models written in Julia and other languages. It provides tools and meta-algorithms for selecting, tuning, evaluating, composing and comparing those models, with a focus on flexible model composition. In this design overview we detail chief novelties of the framework, together with the clear benefits of Julia over the dominant multi-language alternatives.

SE · Sep 7, 2020
distr6: R6 Object-Oriented Probability Distributions Interface in R

Raphael Sonabend, Franz Kiraly

distr6 is an object-oriented (OO) probability distributions interface leveraging the extensibility and scalability of R6, and the speed and efficiency of Rcpp. Over 50 probability distributions are currently implemented in the package with 'core' methods including density, distribution, and generating functions, and more 'exotic' ones including hazards and distribution function anti-derivatives. In addition to simple distributions, distr6 supports compositions such as truncation, mixtures, and product distributions. This paper presents the core functionality of the package and demonstrates examples for key use-cases. In addition, this paper provides a critical review of the object-oriented programming paradigms in R and describes some novel implementations for design patterns and core object-oriented features introduced by the package for supporting distr6 components.
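
The idea of composing a distribution object with a truncation wrapper can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce the distr6 API; the class names `Normal` and `Truncated` and their methods are hypothetical and chosen only for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Normal:
    """Hypothetical minimal distribution object (not the distr6 API)."""
    mean: float = 0.0
    sd: float = 1.0

    def pdf(self, x: float) -> float:
        z = (x - self.mean) / self.sd
        return math.exp(-0.5 * z * z) / (self.sd * math.sqrt(2.0 * math.pi))

    def cdf(self, x: float) -> float:
        z = (x - self.mean) / (self.sd * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))


@dataclass(frozen=True)
class Truncated:
    """Hypothetical wrapper restricting a base distribution to [lower, upper]."""
    base: Normal
    lower: float
    upper: float

    def _mass(self) -> float:
        # Probability the base distribution assigns to the truncation interval.
        return self.base.cdf(self.upper) - self.base.cdf(self.lower)

    def pdf(self, x: float) -> float:
        if x < self.lower or x > self.upper:
            return 0.0
        # Renormalise the base density so it integrates to one on the interval.
        return self.base.pdf(x) / self._mass()

    def cdf(self, x: float) -> float:
        if x <= self.lower:
            return 0.0
        if x >= self.upper:
            return 1.0
        return (self.base.cdf(x) - self.base.cdf(self.lower)) / self._mass()


truncated = Truncated(Normal(), lower=-1.0, upper=1.0)
```

The wrapper only relies on the base object exposing `pdf` and `cdf`, which is the kind of uniform interface that makes such compositions possible.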

CR · Aug 23, 2019
Design choices for productive, secure, data-intensive research at scale in the cloud

Diego Arenas, Jon Atkins, Claire Austin et al.

We present a policy and process framework for secure environments for productive data science research projects at scale, by combining prevailing data security threat and risk profiles into five sensitivity tiers, and, at each tier, specifying recommended policies for data classification, data ingress, software ingress, data egress, user access, user device control, and analysis environments. By presenting design patterns for security choices for each tier, and using software defined infrastructure so that a different, independent, secure research environment can be instantiated for each project appropriate to its classification, we hope to maximise researcher productivity and minimise risk, allowing research organisations to operate with confidence.

ML · Jun 11, 2014
Algebraic-Combinatorial Methods for Low-Rank Matrix Completion with Application to Athletic Performance Prediction

Duncan A. J. Blythe, Louis Theran, Franz Kiraly

This paper presents novel algorithms which exploit the intrinsic algebraic and combinatorial structure of the matrix completion task for estimating missing entries in the general low rank setting. For positive data, we achieve results outperforming the state-of-the-art nuclear norm, both in accuracy and computational efficiency, in simulations and in the task of predicting athletic performance from partially observed data.
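
As a toy illustration of the algebraic structure involved (not the paper's algorithm), consider the rank-one case: every 2x2 minor of a rank-one matrix vanishes, so a missing entry can be recovered from three observed entries that share its row and column. The function name below is hypothetical.

```python
def complete_rank1_entry(a_il: float, a_kj: float, a_kl: float) -> float:
    """Recover a_ij of a rank-one matrix from the other entries of a 2x2 submatrix.

    Rows (i, k) and columns (j, l) form the submatrix. Its determinant
    a_ij * a_kl - a_il * a_kj must be zero, which gives a_ij directly.
    """
    if a_kl == 0:
        raise ValueError("a_kl must be nonzero to solve for the missing entry")
    return a_il * a_kj / a_kl


# Rank-one matrix A = u v^T with u = (1, 2, 3) and v = (4, 5, 6).
u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
A = [[ui * vj for vj in v] for ui in u]

# Pretend A[0][0] is missing and recover it from rows (0, 1), columns (0, 1).
estimate = complete_rank1_entry(a_il=A[0][1], a_kj=A[1][0], a_kl=A[1][1])
```

With noisy data, several such submatrices would generally give different estimates that need to be combined; that step is left out of this sketch.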

LG · Jun 27, 2012
A Combinatorial Algebraic Approach for the Identifiability of Low-Rank Matrix Completion

Franz Kiraly, Ryota Tomioka

In this paper, we review the problem of matrix completion and expose its intimate relations with algebraic geometry, combinatorics and graph theory. We present the first necessary and sufficient combinatorial conditions for matrices of arbitrary rank to be identifiable from a set of matrix entries, yielding theoretical constraints and new algorithms for the problem of matrix completion. We conclude by algorithmically evaluating the tightness of the given conditions and algorithms for practically relevant matrix sizes, showing that the algebraic-combinatoric approach can lead to improvements over state-of-the-art matrix completion methods.
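
The link to graph theory is easiest to see in the rank-one special case, sketched here under stated assumptions rather than as the paper's general conditions. Assuming a generic rank-one matrix with no zero entries, an entry (i, j) is determined by the observed entries exactly when row i and column j lie in the same connected component of the bipartite graph whose edges are the observed positions. The function name is hypothetical.

```python
def identifiable_entries_rank1(n_rows, n_cols, observed):
    """Return positions determined by `observed` for a generic rank-one matrix.

    Rows are nodes 0..n_rows-1 and columns are nodes n_rows..n_rows+n_cols-1
    of a bipartite graph; each observed position (i, j) is an edge.
    """
    parent = list(range(n_rows + n_cols))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in observed:
        root_row, root_col = find(i), find(n_rows + j)
        if root_row != root_col:
            parent[root_row] = root_col

    return {
        (i, j)
        for i in range(n_rows)
        for j in range(n_cols)
        if find(i) == find(n_rows + j)
    }


observed = {(0, 0), (0, 1), (1, 1), (2, 2)}
identifiable = identifiable_entries_rank1(3, 3, observed)
```

In this example position (1, 0) is not observed but is identifiable, because row 1 and column 0 are connected through the observed entries (1, 1), (0, 1) and (0, 0). Higher ranks require the more involved conditions the abstract refers to.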