Mikko Heikkilä

h-index: 8

13 papers

203 citations

Novelty: 50%

AI Score: 42

Ranked #58,813 of 194,257 authors (top 30%) · #13,367 in LG (top 33%)

13 Papers

4.6 · LG · Sep 23, 2022 · Code

Differentially private partitioned variational inference

Mikko A. Heikkilä, Matthew Ashman, Siddharth Swaroop et al.

Learning a privacy-preserving model from sensitive data which are distributed across multiple devices is an increasingly important problem. The problem is often formulated in the federated learning context, with the aim of learning a single global model while keeping the data distributed. Moreover, Bayesian learning is a popular approach for modelling, since it naturally supports reliable uncertainty estimates. However, Bayesian learning is generally intractable even with centralised non-private data and so approximation techniques such as variational inference are a necessity. Variational inference has recently been extended to the non-private federated learning setting via the partitioned variational inference algorithm. For privacy protection, the current gold standard is called differential privacy. Differential privacy guarantees privacy in a strong, mathematically clearly defined sense. In this paper, we present differentially private partitioned variational inference, the first general framework for learning a variational approximation to a Bayesian posterior distribution in the federated learning setting while minimising the number of communication rounds and providing differential privacy guarantees for data subjects. We propose three alternative implementations in the general framework, one based on perturbing local optimisation runs done by individual parties, and two based on perturbing updates to the global model (one using a version of federated averaging, the second one adding virtual parties to the protocol), and compare their properties both theoretically and empirically.

4.6 · LG · Jul 27, 2024

On Using Secure Aggregation in Differentially Private Federated Learning with Multiple Local Steps

Mikko A. Heikkilä

Federated learning is a distributed learning setting where the main aim is to train machine learning models without having to share raw data but only what is required for learning. To guarantee training data privacy and high-utility models, differential privacy and secure aggregation techniques are often combined with federated learning. However, with fine-grained protection granularities, e.g., with the common sample-level protection, the currently existing techniques generally require the parties to communicate for each local optimization step, if they want to fully benefit from the secure aggregation in terms of the resulting formal privacy guarantees. In this paper, we show how a simple new analysis allows the parties to perform multiple local optimization steps while still benefiting from using secure aggregation. We show that our analysis enables higher utility models with guaranteed privacy protection under a limited number of communication rounds.

12.2 · AS · Mar 8, 2024 · Code

Speech Robust Bench: A Robustness Benchmark For Speech Recognition

Muhammad A. Shah, David Solans Noguero, Mikko A. Heikkila et al.

As Automatic Speech Recognition (ASR) models become ever more pervasive, it is important to ensure that they make reliable predictions under corruptions present in the physical and digital world. We propose Speech Robust Bench (SRB), a comprehensive benchmark for evaluating the robustness of ASR models to diverse corruptions. SRB is composed of 114 input perturbations which simulate a heterogeneous range of corruptions that ASR models may encounter when deployed in the wild. We use SRB to evaluate the robustness of several state-of-the-art ASR models and observe that model size and certain modeling choices, such as the use of discrete representations or self-training, appear to be conducive to robustness. We extend this analysis to measure the robustness of ASR models on data from various demographic subgroups, namely English and Spanish speakers, and males and females. Our results revealed noticeable disparities in the model's robustness across subgroups. We believe that SRB will significantly facilitate future research towards robust ASR models, by making it easier to conduct comprehensive and comparable robustness evaluations.

15.7 · LG · Nov 19, 2024

Non-IID data in Federated Learning: A Survey with Taxonomy, Metrics, Methods, Frameworks and Future Directions

Daniel M. Jimenez G., David Solans, Mikko Heikkila et al.

Recent advances in machine learning have highlighted Federated Learning (FL) as a promising approach that enables multiple distributed users (so-called clients) to collectively train ML models without sharing their private data. While this privacy-preserving method shows potential, it struggles when data across clients is not independent and identically distributed (non-IID). The latter remains an unsolved challenge that can result in poorer model performance and slower training times. Despite the significance of non-IID data in FL, there is a lack of consensus among researchers about its classification and quantification. This technical survey aims to fill that gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics to quantify data heterogeneity. Additionally, we describe popular solutions to address non-IID data and standardized frameworks employed in FL with heterogeneous data. Based on our state-of-the-art survey, we present key lessons learned and suggest promising future research directions.

7.1 · LG · Oct 23, 2025

On Optimal Hyperparameters for Differentially Private Deep Transfer Learning

Aki Rehn, Linzh Zhao, Mikko A. Heikkilä et al.

Differentially private (DP) transfer learning, i.e., fine-tuning a pretrained model on private data, is the current state-of-the-art approach for training large models under privacy constraints. We focus on two key hyperparameters in this setting: the clipping bound $C$ and batch size $B$. We show a clear mismatch between the current theoretical understanding of how to choose an optimal $C$ (stronger privacy requires smaller $C$) and empirical outcomes (larger $C$ performs better under strong privacy), caused by changes in the gradient distributions. Assuming a limited compute budget (fixed epochs), we demonstrate that the existing heuristics for tuning $B$ do not work, while cumulative DP noise better explains whether smaller or larger batches perform better. We also highlight how the common practice of using a single $(C,B)$ setting across tasks can lead to suboptimal performance. We find that performance drops especially when moving between loose and tight privacy and between plentiful and limited compute, which we explain by analyzing clipping as a form of gradient re-weighting and examining cumulative DP noise.

9.4 · LG · Jun 2, 2025

Mitigating Disparate Impact of Differentially Private Learning through Bounded Adaptive Clipping

Linzh Zhao, Aki Rehn, Mikko A. Heikkilä et al.

Differential privacy (DP) has become an essential framework for privacy-preserving machine learning. Existing DP learning methods, however, often have disparate impacts on model predictions, e.g., for minority groups. Gradient clipping, which is often used in DP learning, can suppress larger gradients from challenging samples. We show that this problem is amplified by adaptive clipping, which will often shrink the clipping bound to tiny values to match a well-fitting majority, while significantly reducing the accuracy for others. We propose bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression. Our method improves the accuracy of the worst-performing class on average over 10 percentage points on skewed MNIST and Fashion MNIST compared to the unbounded adaptive clipping, and over 5 percentage points over constant clipping.
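The idea of flooring an adaptive clipping bound can be illustrated with a minimal sketch. It assumes a geometric quantile-tracking update for the bound and omits the DP noise that a private implementation would add to both the gradients and the quantile estimate. The function names, the learning rate, and the default values are hypothetical and are not taken from the paper.

```python
import math


def clip_norm(grad, bound):
    """Rescale grad so that its L2 norm is at most `bound`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, bound / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def update_bound(bound, grad_norms, target_quantile=0.5, lr=0.2, lower=0.1):
    """One adaptive-clipping step with a tunable lower bound (illustrative).

    The bound shrinks when more than `target_quantile` of the per-sample
    gradient norms already fall below it, and grows otherwise. The final
    max() is the "bounded" part: it stops the bound from collapsing.
    A private version would add noise to `frac_below`; omitted here.
    """
    frac_below = sum(n <= bound for n in grad_norms) / len(grad_norms)
    proposed = bound * math.exp(-lr * (frac_below - target_quantile))
    return max(lower, proposed)


# A well-fitting majority with tiny gradients drives the bound down,
# but never below the chosen floor.
bound = 1.0
for _ in range(200):
    bound = update_bound(bound, [0.001] * 10, lower=0.1)
```

Without the `max(lower, ...)` floor, the same loop would keep shrinking the bound towards zero, which is the gradient-suppression effect the abstract describes.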

16.0 · CR · Jun 1, 2021

Tight Accounting in the Shuffle Model of Differential Privacy

Antti Koskela, Mikko A. Heikkilä, Antti Honkela

The shuffle model of differential privacy is a novel distributed privacy model based on a combination of local privacy mechanisms and a secure shuffler. It has been shown that the additional randomisation provided by the shuffler improves privacy bounds compared to the purely local mechanisms. Accounting tight bounds, however, is complicated by the complexity brought by the shuffler. The recently proposed numerical techniques for evaluating $(\varepsilon,\delta)$-differential privacy guarantees have been shown to give tighter bounds than commonly used methods for compositions of various complex mechanisms. In this paper, we show how to obtain accurate bounds for adaptive compositions of general $\varepsilon$-LDP shufflers using the analysis by Feldman et al. (2021) and tight bounds for adaptive compositions of shufflers of $k$-randomised response mechanisms, using the analysis by Balle et al. (2019). We show how to speed up the evaluation of the resulting privacy loss distribution from $\mathcal{O}(n^2)$ to $\mathcal{O}(n)$, where $n$ is the number of users, without noticeable change in the resulting $\delta(\varepsilon)$-upper bounds. We also demonstrate looseness of the existing bounds and methods found in the literature, improving previous composition results significantly.
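For readers unfamiliar with the mechanism being analysed, the following is a minimal sketch of $k$-randomised response followed by a shuffle. It assumes the common $\varepsilon$-LDP parametrisation, in which the true value is reported with probability $e^{\varepsilon}/(e^{\varepsilon}+k-1)$; the cited analyses may parametrise the mechanism differently. The function names are hypothetical, and Python's `random` module stands in for the secure shuffler and for secure randomness.

```python
import math
import random


def k_randomised_response(value, k, epsilon, rng=random):
    """Report `value` in {0, ..., k-1} under epsilon-local DP.

    The true value is kept with probability e^eps / (e^eps + k - 1);
    otherwise one of the k - 1 other values is reported uniformly.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_truth:
        return value
    other = rng.randrange(k - 1)
    # Skip over the true value so that only the other values are drawn.
    return other if other < value else other + 1


def shuffled_reports(values, k, epsilon, rng=random):
    """Randomise each user's value locally, then permute the reports.

    The permutation hides which user sent which report. This is the step
    a secure shuffler performs; rng.shuffle is only a stand-in for it.
    """
    reports = [k_randomised_response(v, k, epsilon, rng) for v in values]
    rng.shuffle(reports)
    return reports
```

The sketch only produces the shuffled reports; computing the resulting $(\varepsilon,\delta)$ guarantee is the accounting problem the paper addresses and is not attempted here.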

15.4 · CR · Jul 10, 2020 · Code

Differentially private cross-silo federated learning

Mikko A. Heikkilä, Antti Koskela, Kana Shimizu et al.

Strict privacy is of paramount importance in distributed machine learning. Federated learning, with the main idea of communicating only what is needed for learning, has been recently introduced as a general approach for distributed learning to enhance learning and improve security. However, federated learning by itself does not guarantee any privacy for data subjects. To quantify and control how much privacy is compromised in the worst case, we can use differential privacy. In this paper we combine additively homomorphic secure summation protocols with differential privacy in the so-called cross-silo federated learning setting. The goal is to learn complex models like neural networks while guaranteeing strict privacy for the individual data subjects. We demonstrate that our proposed solutions give prediction accuracy that is comparable to the non-distributed setting, and are fast enough to enable learning models with millions of parameters in a reasonable time. To enable learning under strict privacy guarantees that need privacy amplification by subsampling, we present a general algorithm for oblivious distributed subsampling. However, we also argue that when malicious parties are present, a simple approach using distributed Poisson subsampling gives better privacy. Finally, we show that by leveraging random projections we can further scale up our approach to larger models while suffering only a modest performance loss.

12.6 · ML · Jan 29, 2019 · Code

Differentially Private Markov Chain Monte Carlo

Mikko A. Heikkilä, Joonas Jälkö, Onur Dikmen et al.

Recent developments in differentially private (DP) machine learning and DP Bayesian learning have enabled learning under strong privacy guarantees for the training data subjects. In this paper, we further extend the applicability of DP Bayesian learning by presenting the first general DP Markov chain Monte Carlo (MCMC) algorithm whose privacy guarantees are not subject to unrealistic assumptions on Markov chain convergence and that is applicable to posterior inference in arbitrary models. Our algorithm is based on a decomposition of the Barker acceptance test that allows evaluating the Rényi DP privacy cost of the accept-reject choice. We further show how to improve the DP guarantee through data subsampling and approximate acceptance tests.
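As background for the Barker acceptance test, here is a minimal non-private sketch. For a symmetric proposal, the test accepts a move with probability $1/(1+e^{-\Delta})$, where $\Delta$ is the log posterior ratio between the proposed and current states; equivalently, it accepts when $\Delta$ plus standard logistic noise is positive. The sketch shows only this plain test. The decomposition used in the paper to obtain a DP guarantee is not reproduced, and the function names are hypothetical.

```python
import math
import random


def barker_accept_prob(log_ratio):
    """Barker acceptance probability: the logistic sigmoid of the log ratio."""
    if log_ratio >= 0:
        return 1.0 / (1.0 + math.exp(-log_ratio))
    # Equivalent form that avoids overflow for very negative inputs.
    z = math.exp(log_ratio)
    return z / (1.0 + z)


def barker_accept(log_ratio, rng=random):
    """Accept iff log_ratio + V > 0, where V ~ Logistic(0, 1).

    V is drawn by inverse-CDF sampling; u == 0 is redrawn to avoid log(0).
    """
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    v = math.log(u / (1.0 - u))
    return log_ratio + v > 0
```

Writing the test as "log ratio plus noise exceeds zero" is what makes a noise-based view of the accept-reject step possible, since the randomness of the decision is isolated in a single additive term.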

3.3 · QM · Jan 29, 2019

Representation Transfer for Differentially Private Drug Sensitivity Prediction

Teppo Niinimäki, Mikko Heikkilä, Antti Honkela et al.

Motivation: Human genomic datasets often contain sensitive information that limits use and sharing of the data. In particular, simple anonymisation strategies fail to provide a sufficient level of protection for genomic data, because the data are inherently identifiable. Differentially private machine learning can help by guaranteeing that the published results do not leak too much information about any individual data point. Recent research has reached promising results on differentially private drug sensitivity prediction using gene expression data. Differentially private learning with genomic data is challenging because it is more difficult to guarantee the privacy in high dimensions. Dimensionality reduction can help, but if the dimension reduction mapping is learned from the data, then it needs to be differentially private too, which can carry a significant privacy cost. Furthermore, the selection of any hyperparameters (such as the target dimensionality) needs to also avoid leaking private information. Results: We study an approach that uses a large public dataset of similar type to learn a compact representation for differentially private learning. We compare three representation learning methods: variational autoencoders, PCA and random projection. We solve two machine learning tasks on gene expression of cancer cell lines: cancer type classification, and drug sensitivity prediction. The experiments demonstrate significant benefit from all representation learning methods with variational autoencoders providing the most accurate predictions most often. Our results significantly improve over previous state-of-the-art in accuracy of differentially private drug sensitivity prediction.

1.6 · RO · Jun 13, 2018

Kinematics and Dynamic Modeling of a Planar Hydraulic Elastomer Actuator

Mahdi Momeni Kelageri, Mikko Heikkila, Jarno Jokinen et al.

This paper presents modeling of a compliant 2D manipulator, a so-called soft hydraulic/fluidic elastomer actuator. Our focus is on fiber-Reinforced Fluidic Elastomer Actuators (RFEA) driven by a constant pressure hydraulic supply and modulated on/off valves. We present a model that provides not only the dynamic behavior of the system but also the kinematics of the actuator. In addition, the relation between the applied hydraulic pressure and the bending angle of the soft actuator, and thus its tip position, is formulated in a systematic way. We also present a steady state model that calculates the bending angle given the fluid pressure, which can be beneficial for finding the initial values of the parameters during the system identification process. Our experimental results verify and validate the performance of the proposed modeling approach both in transition and steady states. Due to its inherent simplicity, this model shall also be used in real-time control of the soft actuators.

1.6 · RO · Jun 13, 2018

Design, Fabrication and Control of an Hydraulic Elastomer Actuator

Mahdi Momeni Kelageri, Mikko Heikkila, Minna Poikelispaa et al.

This paper presents the design, fabrication and control of a compliant 2D manipulator, a so-called soft actuator. Our focus is on fiber-reinforced elastomer actuators driven by a constant pressure hydraulic supply and modulated on/off valves. For given diameters, we study the effect of four different elastomer materials and that of the number of reinforcement fiber turns on the forces generated by the actuator and the maximum bending angles. For the rest of the study, we use polydimethylsiloxane (PDMS) with 240 fiber turns per 170 mm length of actuator, which withstands the highest pressures and forces in our experiments. In the rest of the paper, we introduce two control methodologies. Firstly, we show that it is possible to control the pressure inside the tube reasonably accurately without measuring the pressure, by incorporating a simple linear tube model. This can be used, for example, in an inner-outer loop configuration with a PI position control to achieve high performance without the need for pressure measurement. Secondly, we experimentally show that a switching position control exhibits very good steady state accuracy and acceptable transient behavior. Actuator tip position is measured using an external vision system. Our experiments included performance analysis of our soft manipulator while freely moving as well as when carrying a load.

10.7 · ML · Mar 3, 2017 · Code

Differentially Private Bayesian Learning on Distributed Data

Mikko Heikkilä, Eemil Lagerspetz, Samuel Kaski et al.

Many applications of machine learning, for example in health care, would benefit from methods that can guarantee privacy of data subjects. Differential privacy (DP) has become established as a standard for protecting learning results. The standard DP algorithms require a single trusted party to have access to the entire data, which is a clear weakness. We consider DP Bayesian learning in a distributed setting, where each party only holds a single sample or a few samples of the data. We propose a learning strategy based on a secure multi-party sum function for aggregating summaries from data holders and the Gaussian mechanism for DP. Our method builds on an asymptotically optimal and practically efficient DP Bayesian inference with rapidly diminishing extra cost.
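The combination of a secure sum with the Gaussian mechanism can be sketched as follows. The sketch assumes that each of the $n$ parties adds Gaussian noise with variance $\sigma^2/n$, so that the aggregate carries noise of variance $\sigma^2$, and that the secure sum is realised with pairwise additive masks over fixed-point integers. Both choices are illustrative assumptions rather than the paper's protocol, all names and constants are hypothetical, and `random.Random` is not a cryptographically secure source of masks.

```python
import random

MODULUS = 2**61 - 1   # assumed modulus for the masked arithmetic
SCALE = 10**6         # assumed fixed-point precision (six decimals)


def encode(x):
    """Map a real value to a fixed-point residue modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(y):
    """Map a residue back to a real value, treating large residues as negative."""
    if y > MODULUS // 2:
        y -= MODULUS
    return y / SCALE


def masked_shares(values, rng):
    """Hide each value behind pairwise masks that cancel in the total."""
    shares = [encode(v) for v in values]
    n = len(shares)
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.randrange(MODULUS)
            shares[i] = (shares[i] + mask) % MODULUS
            shares[j] = (shares[j] - mask) % MODULUS
    return shares


def dp_secure_sum(values, sigma, rng):
    """Sum the parties' values with total Gaussian noise of std `sigma`."""
    n = len(values)
    local_sigma = sigma / n ** 0.5  # each party contributes variance sigma^2 / n
    noisy = [v + rng.gauss(0.0, local_sigma) for v in values]
    return decode(sum(masked_shares(noisy, rng)) % MODULUS)
```

The aggregator only ever sees the masked shares, each of which looks uniformly random on its own, while their sum reveals just the noisy total.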