Michael Huth

CR
10papers
59citations
Novelty39%
AI Score22

10 Papers

LGJun 29, 2022
Extreme compression of sentence-transformer ranker models: faster inference, longer battery life, and less storage on edge devices

Amit Chaulwar, Lukas Malik, Maciej Krajewski et al.

Modern search systems use several large ranker models with transformer architectures. These models require large computational resources and are not suitable for usage on devices with limited computational resources. Knowledge distillation is a popular compression technique that can reduce the resource needs of such models, where a large teacher model transfers knowledge to a small student model. To drastically reduce memory requirements and energy consumption, we propose two extensions for a popular sentence-transformer distillation procedure: generation of an optimal size vocabulary and dimensionality reduction of the embedding dimension of teachers prior to distillation. We evaluate these extensions on two different types of ranker models. This results in extremely compressed student models whose analysis on a test dataset shows the significance and utility of our proposed extensions.

AIMay 15, 2017Code
Constrained Bayesian Networks: Theory, Optimization, and Applications

Paul Beaumont, Michael Huth

We develop the theory and practice of an approach to modelling and probabilistic inference in causal networks that is suitable when application-specific or analysis-specific constraints should inform such inference or when little or no data for the learning of causal network structure or probability values at nodes are available. Constrained Bayesian Networks generalize a Bayesian Network such that probabilities can be symbolic, arithmetic expressions and where the meaning of the network is constrained by finitely many formulas from the theory of the reals. A formal semantics for constrained Bayesian Networks over first-order logic of the reals is given, which enables non-linear and non-convex optimisation algorithms that rely on decision procedures for this logic, and supports the composition of several constrained Bayesian Networks. A non-trivial case study in arms control, where few or no data are available to assess the effectiveness of an arms inspection process, evaluates our approach. An open-access prototype implementation of these foundations and their algorithms uses the SMT solver Z3 as decision procedure, leverages an open-source package for Bayesian inference to symbolic computation, and is evaluated experimentally.

CRJul 28, 2021
Secure Bayesian Federated Analytics for Privacy-Preserving Trend Detection

Amit Chaulwar, Michael Huth

Federated analytics has many applications in edge computing, its use can lead to better decision making for service provision, product development, and user experience. We propose a Bayesian approach to trend detection in which the probability of a keyword being trendy, given a dataset, is computed via Bayes' Theorem; the probability of a dataset, given that a keyword is trendy, is computed through secure aggregation of such conditional probabilities over local datasets of users. We propose a protocol, named SAFE, for Bayesian federated analytics that offers sufficient privacy for production grade use cases and reduces the computational burden of users and an aggregator. We illustrate this approach with a trend detection experiment and discuss how this approach could be extended further to make it production-ready.

ITSep 20, 2020
Two and Three-Party Digital Goods Auctions: Scalable Privacy Analysis

Patrick Ah-Fat, Michael Huth

A digital goods auction is a type of auction where potential buyers bid the maximal price that they are willing to pay for a certain item, which a seller can produce at a negligible cost and in unlimited quantity. To maximise her benefits, the aim for the seller is to find the optimal sales price, which every buyer whose bid is not lower will pay. For fairness and privacy purposes, buyers may be concerned about protecting the confidentiality of their bids. Secure Multi-Party Computation is a domain of Cryptography that would allow the seller to compute the optimal sales price while guaranteeing that the bids remain secret. Paradoxically, as a function of the buyers' bids, the sales price inevitably reveals some private information. Generic frameworks and entropy-based techniques based on Quantitative Information Flow have been developed in order to quantify and restrict those leakages. Due to their combinatorial nature, these techniques do not scale to large input spaces. In this work, we aim at scaling those privacy analyses to large input spaces in the particular case of digital goods auctions. We derive closed-form formulas for the posterior min-entropy of private inputs in two and three-party auctions, which enables us to effectively quantify the information leaks for arbitrarily large input spaces. We also provide supportive experimental evidence that enables us to formulate a conjecture that would allow us to extend our results to any number of parties.

CRJun 5, 2019
Owner-centric sharing of physical resources, data, and data-driven insights in digital ecosystems

Kwok Cheung, Michael Huth, Laurence Kirk et al.

We are living in an age in which digitization will connect more and more physical assets with IT systems and where IoT endpoints will generate a wealth of valuable data. Companies, individual users, and organizations alike therefore have the need to control their own physical or non-physical assets and data sources. At the same time, they recognize the need for, and opportunity to, share access to such data and digitized physical assets. This paper sets out our technology vision for such sharing ecosystems, reports initial work in that direction, identifies challenges for realizing this vision, and seeks feedback and collaboration from the academic access-control community in that R\&D space.

CRJan 3, 2019
Scalable Information-Flow Analysis of Secure Three-Party Affine Computations

Patrick Ah-Fat, Michael Huth

Elaborate protocols in Secure Multi-party Computation enable several participants to compute a public function of their own private inputs while ensuring that no undesired information leaks about the private inputs, and without resorting to any trusted third party. However, the public output of the computation inevitably leaks some information about the private inputs. Recent works have introduced a framework and proposed some techniques for quantifying such information flow. Yet, owing to their complexity, those methods do not scale to practical situations that may involve large input spaces. The main contribution of the work reported here is to formally investigate the information flow captured by the min-entropy in the particular case of secure three-party computations of affine functions in order to make its quantification scalable to realistic scenarios. To this end, we mathematically derive an explicit formula for this entropy under uniform prior beliefs about the inputs. We show that this closed-form expression can be computed in time constant in the inputs sizes and logarithmic in the coefficients of the affine function. Finally, we formulate some theoretical bounds for this privacy leak in the presence of non-uniform prior beliefs.

CRMar 20, 2018
Ontology-Based Reasoning about the Trustworthiness of Cyber-Physical Systems

Marcello Balduccini, Edward Griffor, Michael Huth et al.

It has been challenging for the technical and regulatory communities to formulate requirements for trustworthiness of the cyber-physical systems (CPS) due to the complexity of the issues associated with their design, deployment, and operations. The US National Institute of Standards and Technology (NIST), through a public working group, has released a CPS Framework that adopts a broad and integrated view of CPS and positions trustworthiness among other aspects of CPS. This paper takes the model created by the CPS Framework and its further developments one step further, by applying ontological approaches and reasoning techniques in order to achieve greater understanding of CPS. The example analyzed in the paper demonstrates the enrichment of the original CPS model obtained through ontology and reasoning and its ability to deliver additional insights to the developers and operators of CPS.

CRMar 1, 2018
Optimal Accuracy-Privacy Trade-Off for Secure Multi-Party Computations

Patrick Ah-Fat, Michael Huth

The purpose of Secure Multi-Party Computation is to enable protocol participants to compute a public function of their private inputs while keeping their inputs secret, without resorting to any trusted third party. However, opening the public output of such computations inevitably reveals some information about the private inputs. We propose a measure generalising both Renyi entropy and g-entropy so as to quantify this information leakage. In order to control and restrain such information flows, we introduce the notion of function substitution which replaces the computation of a function that reveals sensitive information with that of an approximate function. We exhibit theoretical bounds for the privacy gains that this approach provides and experimentally show that this enhances the confidentiality of the inputs while controlling the distortion of computed output values. Finally, we investigate the inherent compromise between accuracy of computation and privacy of inputs and we demonstrate how to realise such optimal trade-offs.

AIFeb 4, 2017
Manyopt: An Extensible Tool for Mixed, Non-Linear Optimization Through SMT Solving

Andrea Callia D'Iddio, Michael Huth

Optimization of Mixed-Integer Non-Linear Programming (MINLP) supports important decisions in applications such as Chemical Process Engineering. But current solvers have limited ability for deductive reasoning or the use of domain-specific theories, and the management of integrality constraints does not yet exploit automated reasoning tools such as SMT solvers. This seems to limit both scalability and reach of such tools in practice. We therefore present a tool, ManyOpt, for MINLP optimization that enables experimentation with reduction techniques which transform a MINLP problem to feasibility checking realized by an SMT solver. ManyOpt is similar to the SAT solver ManySAT in that it runs a specified number of such reduction techniques in parallel to get the strongest result on a given MINLP problem. The tool is implemented in layers, which we may see as features and where reduction techniques are feature vectors. Some of these features are inspired by known MINLP techniques whereas others are novel and specific to SMT. Our experimental results on standard benchmarks demonstrate the benefits of this approach. The tool supports a variety of SMT solvers and is easily extensible with new features, courtesy of its layered structure. For example, logical formulas for deductive reasoning are easily added to constrain further the optimization of a MINLP problem of interest.

CRDec 1, 2016
Optimizing Governed Blockchains for Financial Process Authentications

Leif-Nissen Lundbaek, Andrea Callia D'Iddio, Michael Huth

We propose the formal study of governed blockchains that are owned and controlled by organizations and that neither create cryptocurrencies nor provide any incentives to solvers of cryptographic puzzles. We view such approaches as frameworks in which system parts, such as the cryptographic puzzle, may be instantiated with different technology. Owners of such a blockchain procure puzzle solvers as resources they control, and use a mathematical model to compute optimal parameters for the cryptographic puzzle mechanism or other parts of the blockchain. We illustrate this approach with a use case in which blockchains record hashes of financial process transactions to increase their trustworthiness and that of their audits. For Proof of Work as cryptographic puzzle, we develop a detailed mathematical model to derive MINLP optimization problems for computing optimal Proof of Work configuration parameters that trade off potentially conflicting aspects such as availability, resiliency, security, and cost in this governed setting. We demonstrate the utility of such a mining calculus by solving some instances of this problem. This experimental validation is strengthened by statistical experiments that confirm the validity of random variables used in formulating our mathematical model. We hope that our work may facilitate the creation of domain-specific blockchains for a wide range of applications such as trustworthy information in Internet of Things systems and bespoke improvements of legacy financial services.