Antti Koskela

CR
h-index: 14
25 papers
343 citations
Novelty: 51%
AI Score: 57

25 Papers

CR Jun 3
Accuracy-First Rényi Differential Privacy and Post-Processing Immunity

Ossi Räisä, Antti Koskela, Antti Honkela

The accuracy-first perspective of differential privacy addresses an important shortcoming by allowing a data analyst to adaptively adjust the quantitative privacy bound instead of sticking to a predetermined bound. Existing works on the accuracy-first perspective have neglected an important property of differential privacy known as post-processing immunity, which ensures that an adversary is not able to weaken the privacy guarantee by post-processing. We address this gap by determining which existing definitions in the accuracy-first perspective have post-processing immunity, and which do not. The only definition with post-processing immunity, pure ex-post privacy, lacks useful tools for practical problems, such as an ex-post analogue of the Gaussian mechanism, and an algorithm to check if the accuracy on a separate private validation set is high enough. To address this, we propose a new definition based on Rényi differential privacy that has post-processing immunity, and we develop basic theory and tools needed for practical applications. We demonstrate the practicality of our theory with applications to synthetic data generation and image classifier fine-tuning, where our algorithm successfully adjusts the privacy bound until an accuracy threshold is met on a private validation dataset.

CR May 12
$f$-Differential Privacy Filters: Validity and Approximate Solutions

Long Tran, Antti Koskela, Ossi Räisä et al.

Accounting for privacy loss under fully adaptive composition -- where mechanism choice and privacy parameters may depend on the history of prior outputs -- is a central challenge in differential privacy (DP). Here, privacy filters are stopping rules ensuring a prescribed global budget is not exceeded. A leading candidate for optimal filter design is $f$-DP, which characterizes the full extent of adversarial hypothesis testing and recovers $(\varepsilon,δ)$-DP through piece-wise linear trade-off functions, while enabling tight $(\varepsilon,δ)$-DP accounting in standard compositions via tensor products. Yet whether such filters can be correctly defined under $f$-DP remains unclear. We show that the natural $f$-DP filter -- tracking path-wise accumulating tensor products and stopping when the prescribed curve is crossed -- is fundamentally invalid, precluding the direct use of standard efficient numerical Fast-Fourier-Transform accounting in the fully adaptive setting. We characterize this failure, establishing necessary and sufficient conditions for the natural filter's validity. Furthermore, we prove a fully adaptive central limit theorem for $f$-DP, establishing Gaussian convergence of cumulative privacy losses under full adaptivity. As a demonstration, we construct a closed-form approximate GDP filter for subsampled Gaussian mechanisms that provably outperforms RDP-based accounting in asymptotic regimes ($q\ll 1$ and $q\approx 1$) without tracking the full trade-off function, demonstrating that the slack in RDP is not intrinsic to adaptive composition -- though CLT-based approximations are known to be optimistic at realistic subsampling rates, a limitation that remains an open challenge.

CR Sep 30, 2022
Individual Privacy Accounting with Gaussian Differential Privacy

Antti Koskela, Marlon Tobaben, Antti Honkela

Individual privacy accounting enables bounding differential privacy (DP) loss individually for each participant involved in the analysis. This can be informative as often the individual privacy losses are considerably smaller than those indicated by the DP bounds that are based on considering worst-case bounds at each data access. In order to account for the individual privacy losses in a principled manner, we need a privacy accountant for adaptive compositions of randomised mechanisms, where the loss incurred at a given data access is allowed to be smaller than the worst-case loss. This kind of analysis has been carried out for the Rényi differential privacy (RDP) by Feldman and Zrnic (2021), however not yet for the so-called optimal privacy accountants. We make first steps in this direction by providing a careful analysis using the Gaussian differential privacy which gives optimal bounds for the Gaussian mechanism, one of the most versatile DP mechanisms. This approach is based on determining a certain supermartingale for the hockey-stick divergence and on extending the Rényi divergence-based fully adaptive composition results by Feldman and Zrnic. We also consider measuring the individual $(\varepsilon,δ)$-privacy losses using the so-called privacy loss distributions. With the help of the Blackwell theorem, we can then make use of the RDP analysis to construct an approximative individual $(\varepsilon,δ)$-accountant.
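
As a rough illustration of why individual accounting can be informative, the following Python sketch applies the standard, non-adaptive Gaussian differential privacy rules: composing $\mu_i$-GDP mechanisms gives $\sqrt{\sum_i \mu_i^2}$-GDP, and a $\mu$-GDP guarantee converts to $δ(\varepsilon)=Φ(-\varepsilon/\mu+\mu/2)-e^{\varepsilon}Φ(-\varepsilon/\mu-\mu/2)$. The function names and the per-step values are assumptions made for this example, and the fully adaptive supermartingale analysis described in the abstract is not reproduced here.

```python
from math import erfc, exp, sqrt


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for numerical stability."""
    return 0.5 * erfc(-x / sqrt(2.0))


def gdp_compose(mus):
    """Compose mu_i-GDP mechanisms: the result is sqrt(sum mu_i^2)-GDP."""
    return sqrt(sum(m * m for m in mus))


def gdp_delta(mu: float, eps: float) -> float:
    """Convert a mu-GDP guarantee into delta as a function of eps."""
    return (std_normal_cdf(-eps / mu + mu / 2.0)
            - exp(eps) * std_normal_cdf(-eps / mu - mu / 2.0))


# Hypothetical numbers: 100 steps, worst-case mu = 0.1 per step,
# versus an individual whose per-step loss is only mu = 0.05.
worst_case_mu = gdp_compose([0.1] * 100)
individual_mu = gdp_compose([0.05] * 100)
worst_case_delta = gdp_delta(worst_case_mu, eps=1.0)
individual_delta = gdp_delta(individual_mu, eps=1.0)
```

In this toy setting the individual's total $\mu$ is half of the worst-case value, so the resulting $δ$ at the same $\varepsilon$ is smaller.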

LG Jan 27, 2023
Practical Differentially Private Hyperparameter Tuning with Subsampling

Antti Koskela, Tejas Kulkarni

Tuning the hyperparameters of differentially private (DP) machine learning (ML) algorithms often requires use of sensitive data and this may leak private information via hyperparameter values. Recently, Papernot and Steinke (2022) proposed a certain class of DP hyperparameter tuning algorithms, where the number of random search samples is randomized itself. Commonly, these algorithms still considerably increase the DP privacy parameter $\varepsilon$ over non-tuned DP ML model training and can be computationally heavy as evaluating each hyperparameter candidate requires a new training run. We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data for the hyperparameter tuning and by extrapolating the optimal values to a larger dataset. We provide a Rényi differential privacy analysis for the proposed method and experimentally show that it consistently leads to better privacy-utility trade-off than the baseline method by Papernot and Steinke.

NA Dec 29, 2018
Analysis of Krylov Subspace Approximation to Large Scale Differential Riccati Equations

Antti Koskela, Hermann Mena

We consider a Krylov subspace approximation method for the symmetric differential Riccati equation $\dot{X} = AX + XA^T + Q - XSX$, $X(0)=X_0$. The method we consider is based on projecting the large scale equation onto a Krylov subspace spanned by the matrix $A$ and the low rank factors of $X_0$ and $Q$. We prove that the method is structure preserving in the sense that it preserves two important properties of the exact flow, namely the positivity of the exact flow, and also the property of monotonicity. We also provide a theoretical a priori error analysis which shows a superlinear convergence of the method. This behavior is illustrated in the numerical experiments. Moreover, we derive an efficient a posteriori error estimate as well as discuss multiple time stepping combined with a cut of the rank of the numerical solution.

NA Feb 27, 2017
Disguised and new Quasi-Newton methods for nonlinear eigenvalue problems

Elias Jarlebring, Antti Koskela, Giampaolo Mele

In this paper we take a quasi-Newton approach to nonlinear eigenvalue problems (NEPs) of the type $M(λ)v=0$, where $M:\mathbb{C}\rightarrow\mathbb{C}^{n\times n}$ is a holomorphic function. We investigate which types of approximations of the Jacobian matrix lead to competitive algorithms, and provide convergence theory. The convergence analysis is based on theory for quasi-Newton methods and Keldysh's theorem for NEPs. We derive new algorithms and also show that several well-established methods for NEPs can be interpreted as quasi-Newton methods, and thereby provide insight to their convergence behavior. In particular, we establish quasi-Newton interpretations of Neumaier's residual inverse iteration and Ruhe's method of successive linear problems.

NA Dec 14, 2017
Krylov integrators for Hamiltonian systems

Antti Koskela

We consider Arnoldi-like processes to obtain symplectic subspaces for Hamiltonian systems. Large systems are locally approximated by ones living in low dimensional subspaces; we especially consider Krylov subspaces and some extensions. This is utilized in two ways: either the local low dimensional systems are solved numerically, or the subspace is used within a given numerical integrator (e.g., an exponential integrator) to approximate the necessary functions. In the former case one can expect excellent energy preservation. In the latter case this holds for linear systems. For some second order exponential integrators we consider, these two approaches are shown to be equivalent. In numerical experiments with nonlinear Hamiltonian problems their behaviour seems promising.

LG Jul 5, 2024
Convex Approximation of Two-Layer ReLU Networks for Hidden State Differential Privacy

Rob Romijnders, Antti Koskela

The hidden state threat model of differential privacy (DP) assumes that the adversary has access only to the final trained machine learning (ML) model, without seeing intermediate states during training. However, the current privacy analyses under this model are restricted to convex optimization problems, reducing their applicability to multi-layer neural networks, which are essential in modern deep learning applications. Notably, the most successful applications of the hidden state privacy analyses in classification tasks have only been for logistic regression models. We demonstrate that it is possible to privately train convex problems with privacy-utility trade-offs comparable to those of 2-layer ReLU networks trained with DP stochastic gradient descent (DP-SGD). This is achieved through a stochastic approximation of a dual formulation of the ReLU minimization problem, resulting in a strongly convex problem. This enables the use of existing hidden state privacy analyses and provides accurate privacy bounds also for the noisy cyclic mini-batch gradient descent (NoisyCGD) method with fixed disjoint mini-batches. Empirical results on benchmark classification tasks demonstrate that NoisyCGD can achieve privacy-utility trade-offs on par with DP-SGD applied to 2-layer ReLU networks.

NA Dec 12, 2017
On a generalization of the Bessel function Neumann expansion

Antti Koskela, Elias Jarlebring

The Bessel-Neumann expansion (of integer order) of a function $g:\mathbb{C}\rightarrow\mathbb{C}$ corresponds to representing $g$ as a linear combination of basis functions $ϕ_0,ϕ_1,\ldots$, i.e., $g(z)=\sum_{\ell = 0}^\infty w_\ell ϕ_\ell(z)$, where $ϕ_i(z)=J_i(z)$, $i=0,\ldots$, are the Bessel functions. In this work, we study an expansion for a more general class of basis functions. More precisely, we assume that the basis functions satisfy an infinite dimensional linear ordinary differential equation associated with a Hessenberg matrix, motivated by the fact that these basis functions occur in certain iterative methods. A procedure to compute the basis functions as well as the coefficients is proposed. Theoretical properties of the expansion are studied. We illustrate that non-standard basis functions can give faster convergence than the Bessel functions.

CR May 5
Membership Inference Attacks for Retrieval Based In-Context Learning for Document Question Answering

Tejas Kulkarni, Antti Koskela, Laith Zumot

We show that remotely hosted applications employing in-context learning, when augmented with a retrieval function to select in-context examples, can be vulnerable to membership-inference attacks even when the service provider and users are separate parties. We propose two black-box membership inference attacks that exploit query text prefixes to distinguish member from non-member inputs. The first attack uses a reference model to estimate an otherwise unavailable loss metric. The second attack improves upon it by eliminating the reference model and instead computing a membership statistic through a simple but novel weighted-averaging scheme. Our comprehensive empirical evaluations consider a stricter case in which the adversary has a paraphrased version of the text in the queries and show that our attacks can exhibit stronger resilience to paraphrasing and outperform three prior attacks in many cases with a small number of prefixes. We also adapt an existing ensemble prompting defense to our setting, demonstrating that it substantially mitigates the privacy leakage caused by our second attack.

ML Jun 7, 2019
Computing Tight Differential Privacy Guarantees Using FFT

Antti Koskela, Joonas Jälkö, Antti Honkela

Differentially private (DP) machine learning has recently become popular. The privacy loss of DP algorithms is commonly reported using $(\varepsilon,δ)$-DP. In this paper, we propose a numerical accountant for evaluating the privacy loss for algorithms with continuous one dimensional output. This accountant can be applied to the subsampled multidimensional Gaussian mechanism which underlies the popular DP stochastic gradient descent. The proposed method is based on a numerical approximation of an integral formula which gives the exact $(\varepsilon,δ)$-values. The approximation is carried out by discretising the integral and by evaluating discrete convolutions using the fast Fourier transform algorithm. We give both theoretical error bounds and numerical error estimates for the approximation. Experimental comparisons with state-of-the-art techniques demonstrate significant improvements in bound tightness and/or computation time. Python code for the method can be found on GitHub (https://github.com/DPBayes/PLD-Accountant/).

LG Nov 6, 2025
Differentially Private In-Context Learning with Nearest Neighbor Search

Antti Koskela, Tejas Kulkarni, Laith Zumot

Differentially private in-context learning (DP-ICL) has recently become an active research topic due to the inherent privacy risks of in-context learning. However, existing approaches overlook a critical component of modern large language model (LLM) pipelines: the similarity search used to retrieve relevant context data. In this work, we introduce a DP framework for in-context learning that integrates nearest neighbor search of relevant examples in a privacy-aware manner. Our method outperforms existing baselines by a substantial margin across all evaluated benchmarks, achieving more favorable privacy-utility trade-offs. To achieve this, we employ nearest neighbor retrieval from a database of context data, combined with a privacy filter that tracks the cumulative privacy cost of selected samples to ensure adherence to a central differential privacy budget. Experimental results on text classification and document question answering show a clear advantage of the proposed method over existing baselines.

LG Dec 31, 2023
Improving the Privacy and Practicality of Objective Perturbation for Differentially Private Linear Learners

Rachel Redberg, Antti Koskela, Yu-Xiang Wang

In the arena of privacy-preserving machine learning, differentially private stochastic gradient descent (DP-SGD) has outstripped the objective perturbation mechanism in popularity and interest. Though unrivaled in versatility, DP-SGD requires a non-trivial privacy overhead (for privately tuning the model's hyperparameters) and a computational complexity which might be extravagant for simple models such as linear and logistic regression. This paper revamps the objective perturbation mechanism with tighter privacy analyses and new computational tools that boost it to perform competitively with DP-SGD on unconstrained convex generalized linear problems.

CR Feb 9, 2024
Privacy Profiles for Private Selection

Antti Koskela, Rachel Redberg, Yu-Xiang Wang

Private selection mechanisms (e.g., Report Noisy Max, Sparse Vector) are fundamental primitives of differentially private (DP) data analysis with wide applications to private query release, voting, and hyperparameter tuning. Recent work (Liu and Talwar, 2019; Papernot and Steinke, 2022) has made significant progress in both generalizing private selection mechanisms and tightening their privacy analysis using modern numerical privacy accounting tools, e.g., Rényi DP. But Rényi DP is known to be lossy when $(ε,δ)$-DP is ultimately needed, and there is a trend to close the gap by directly handling privacy profiles, i.e., $δ$ as a function of $ε$ or its equivalent dual form known as $f$-DP. In this paper, we work out an easy-to-use recipe that bounds the privacy profiles of ReportNoisyMax and PrivateTuning using the privacy profiles of the base algorithms they corral. Numerically, our approach improves over the RDP-based accounting in all regimes of interest and leads to substantial benefits in end-to-end private learning experiments. Our analysis also suggests new distributions for randomizing the number of rounds, e.g., the binomial distribution, which lead to more substantial improvements in certain regimes.

SI May 9, 2025
On the Price of Differential Privacy for Spectral Clustering over Stochastic Block Models

Antti Koskela, Mohamed Seif, Andrea J. Goldsmith

We investigate privacy-preserving spectral clustering for community detection within stochastic block models (SBMs). Specifically, we focus on edge differential privacy (DP) and propose private algorithms for community recovery. Our work explores the fundamental trade-offs between the privacy budget and the accurate recovery of community labels. Furthermore, we establish information-theoretic conditions that guarantee the accuracy of our methods, providing theoretical assurances for successful community recovery under edge DP.

IT Oct 8, 2025
Spectral Graph Clustering under Differential Privacy: Balancing Privacy, Accuracy, and Efficiency

Mohamed Seif, Antti Koskela, H. Vincent Poor et al.

We study the problem of spectral graph clustering under edge differential privacy (DP). Specifically, we develop three mechanisms: (i) graph perturbation via randomized edge flipping combined with adjacency matrix shuffling, which enforces edge privacy while preserving key spectral properties of the graph. Importantly, shuffling considerably amplifies the guarantees: whereas flipping edges with a fixed probability alone provides only a constant epsilon edge DP guarantee as the number of nodes grows, the shuffled mechanism achieves (epsilon, delta) edge DP with parameters that tend to zero as the number of nodes increases; (ii) private graph projection with additive Gaussian noise in a lower-dimensional space to reduce dimensionality and computational complexity; and (iii) a noisy power iteration method that distributes Gaussian noise across iterations to ensure edge DP while maintaining convergence. Our analysis provides rigorous privacy guarantees and a precise characterization of the misclassification error rate. Experiments on synthetic and real-world networks validate our theoretical analysis and illustrate the practical privacy-utility trade-offs.
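
The randomized edge flipping in mechanism (i) can be sketched in Python as plain randomized response on the upper triangle of an adjacency matrix. The sketch assumes the textbook flip probability $1/(1+e^{\varepsilon})$, which gives epsilon edge DP for a single edge; the function names are illustrative, and the shuffling step and its amplification analysis from the abstract are not included.

```python
import math
import random


def flip_probability(eps: float) -> float:
    """Flip probability of binary randomized response with parameter eps."""
    return 1.0 / (1.0 + math.exp(eps))


def randomized_edge_flip(adj, eps: float, rng: random.Random):
    """Flip each potential edge of an undirected simple graph independently.

    `adj` is a symmetric 0/1 adjacency matrix given as a list of lists.
    Only the upper triangle is randomized, then mirrored, so the output
    stays symmetric with a zero diagonal.
    """
    n = len(adj)
    p = flip_probability(eps)
    noisy = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j]
            if rng.random() < p:
                bit = 1 - bit  # flip the edge indicator
            noisy[i][j] = noisy[j][i] = bit
    return noisy


# Hypothetical 4-node path graph 0-1-2-3.
graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
noisy_graph = randomized_edge_flip(graph, eps=1.0, rng=random.Random(0))
```

The ratio between keeping and flipping an edge indicator is $(1-p)/p = e^{\varepsilon}$, which is the source of the per-edge guarantee.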

DC Dec 11, 2024
Protecting Confidentiality, Privacy and Integrity in Collaborative Learning

Dong Chen, Alice Dethise, Istemi Ekin Akkus et al.

A collaboration between dataset owners and model owners is needed to facilitate effective machine learning (ML) training. During this collaboration, however, dataset owners and model owners want to protect the confidentiality of their respective assets (i.e., datasets, models and training code), with the dataset owners also caring about the privacy of individual users whose data is in their datasets. Existing solutions either provide limited confidentiality for models and training code, or suffer from privacy issues due to collusion. We present Citadel++, a collaborative ML training system designed to simultaneously protect the confidentiality of datasets, models and training code as well as the privacy of individual users. Citadel++ enhances differential privacy mechanisms to safeguard the privacy of individual user data while maintaining model utility. By employing Virtual Machine-level Trusted Execution Environments (TEEs) as well as the improved sandboxing and integrity mechanisms through OS-level techniques, Citadel++ effectively preserves the confidentiality of datasets, models and training code, and enforces our privacy mechanisms even when the models and training code have been maliciously designed. Our experiments show that Citadel++ provides model utility and performance while adhering to the confidentiality and privacy requirements of dataset owners and model owners, outperforming the state-of-the-art privacy-preserving training systems by up to 543x on CPU and 113x on GPU TEEs.

LG Jun 7, 2024
Auditing Differential Privacy Guarantees Using Density Estimation

Antti Koskela, Jafar Mohammadi

We present a novel method for accurately auditing the differential privacy (DP) guarantees of DP mechanisms. In particular, our solution is applicable to auditing DP guarantees of machine learning (ML) models. Previous auditing methods tightly capture the privacy guarantees of DP-SGD trained models in the white-box setting where the auditor has access to all intermediate models; however, the success of these methods depends on a priori information about the parametric form of the noise and the subsampling ratio used for sampling the gradients. We present a method that does not require such information and is agnostic to the randomization used for the underlying mechanism. Similarly to several previous DP auditing methods, we assume that the auditor has access to a set of independent observations from two one-dimensional distributions corresponding to outputs from two neighbouring datasets. Furthermore, our solution is based on a simple histogram-based density estimation technique to find lower bounds for the statistical distance between these distributions when measured using the hockey-stick divergence. We show that our approach also naturally generalizes the previously considered class of threshold membership inference auditing methods. We improve upon accurate auditing methods such as the $f$-DP auditing. Moreover, we address an open problem on how to accurately audit the subsampled Gaussian mechanism without any knowledge of the parameters of the underlying mechanism.
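
The histogram idea can be sketched in Python as a plug-in estimate of the hockey-stick divergence $H_{e^{\varepsilon}}(P\|Q)=\int \max\{p(x)-e^{\varepsilon}q(x),0\}\,dx$ computed from binned samples. The function name, the equal-width binning, and the bin count are assumptions for this example; the estimate carries sampling error, so turning it into a statistically valid lower bound needs the confidence analysis of the paper, which is not reproduced here.

```python
import math
import random


def hockey_stick_from_samples(xs, ys, eps: float, num_bins: int = 20) -> float:
    """Plug-in estimate of H_{e^eps}(P || Q) from samples xs ~ P and ys ~ Q."""
    lo = min(min(xs), min(ys))
    hi = max(max(xs), max(ys))
    width = (hi - lo) / num_bins
    if width == 0.0:
        width = 1.0  # all samples identical: everything lands in bin 0

    def histogram(samples):
        counts = [0] * num_bins
        for s in samples:
            idx = min(int((s - lo) / width), num_bins - 1)
            counts[idx] += 1
        return [c / len(samples) for c in counts]

    p_hat = histogram(xs)
    q_hat = histogram(ys)
    return sum(max(p - math.exp(eps) * q, 0.0) for p, q in zip(p_hat, q_hat))


# Hypothetical audit data: outputs of a unit-variance Gaussian mechanism
# on two neighbouring datasets whose query values differ by 1.
rng = random.Random(0)
samples_p = [rng.gauss(0.0, 1.0) for _ in range(5000)]
samples_q = [rng.gauss(1.0, 1.0) for _ in range(5000)]
delta_estimate = hockey_stick_from_samples(samples_p, samples_q, eps=0.5)
```

Binning is a form of post-processing, so the divergence between the true binned distributions cannot exceed the divergence between the original ones; this is why a histogram yields a lower-bound style quantity.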

CO Jun 17, 2021
Differentially Private Hamiltonian Monte Carlo

Ossi Räisä, Antti Koskela, Antti Honkela

Markov chain Monte Carlo (MCMC) algorithms have long been the main workhorses of Bayesian inference. Among them, Hamiltonian Monte Carlo (HMC) has recently become very popular due to its efficiency resulting from effective use of the gradients of the target distribution. In privacy-preserving machine learning, differential privacy (DP) has become the gold standard in ensuring that the privacy of data subjects is not violated. Existing DP MCMC algorithms either use random-walk proposals, or do not use the Metropolis--Hastings (MH) acceptance test to ensure convergence without decreasing their step size to zero. We present a DP variant of HMC using the MH acceptance test that builds on a recently proposed DP MCMC algorithm called the penalty algorithm, and adds noise to the gradient evaluations of HMC. We prove that the resulting algorithm converges to the correct distribution, and is ergodic. We compare DP-HMC with the existing penalty, DP-SGLD and DP-SGNHT algorithms, and find that DP-HMC performs at least as well as the penalty algorithm, and performs more consistently than DP-SGLD or DP-SGNHT.

CR Jun 1, 2021
Tight Accounting in the Shuffle Model of Differential Privacy

Antti Koskela, Mikko A. Heikkilä, Antti Honkela

The shuffle model of differential privacy is a novel distributed privacy model based on a combination of local privacy mechanisms and a secure shuffler. It has been shown that the additional randomisation provided by the shuffler improves privacy bounds compared to purely local mechanisms. Accounting tight bounds, however, is complicated by the complexity brought by the shuffler. The recently proposed numerical techniques for evaluating $(\varepsilon,δ)$-differential privacy guarantees have been shown to give tighter bounds than commonly used methods for compositions of various complex mechanisms. In this paper, we show how to obtain accurate bounds for adaptive compositions of general $\varepsilon$-LDP shufflers using the analysis by Feldman et al. (2021) and tight bounds for adaptive compositions of shufflers of $k$-randomised response mechanisms, using the analysis by Balle et al. (2019). We show how to speed up the evaluation of the resulting privacy loss distribution from $\mathcal{O}(n^2)$ to $\mathcal{O}(n)$, where $n$ is the number of users, without noticeable change in the resulting $δ(\varepsilon)$-upper bounds. We also demonstrate looseness of the existing bounds and methods found in the literature, improving previous composition results significantly.

CR Feb 24, 2021
Computing Differential Privacy Guarantees for Heterogeneous Compositions Using FFT

Antti Koskela, Antti Honkela

The recently proposed Fast Fourier Transform (FFT)-based accountant for evaluating $(\varepsilon,δ)$-differential privacy guarantees using the privacy loss distribution formalism has been shown to give tighter bounds than commonly used methods such as Rényi accountants when applied to homogeneous compositions, i.e., to compositions of identical mechanisms. In this paper, we extend this approach to heterogeneous compositions. We carry out a full error analysis that allows choosing the parameters of the algorithm such that a desired accuracy is obtained. The analysis also extends previous results by taking into account all the parameters of the algorithm. Using the error analysis, we also give a bound for the computational complexity in terms of the error which is analogous to and slightly tightens the one given by Murtagh and Vadhan (2018). We also show how to speed up the evaluation of tight privacy guarantees using the Plancherel theorem at the cost of increased pre-computation and memory usage.

LG Nov 1, 2020
Differentially Private Bayesian Inference for Generalized Linear Models

Tejas Kulkarni, Joonas Jälkö, Antti Koskela et al.

Generalized linear models (GLMs) such as logistic regression are among the most widely used tools in a data analyst's repertoire and are often used on sensitive datasets. A large body of prior works that investigate GLMs under differential privacy (DP) constraints provide only private point estimates of the regression coefficients, and are not able to quantify parameter uncertainty. In this work, with logistic and Poisson regression as running examples, we introduce a generic noise-aware DP Bayesian inference method for a GLM at hand, given a noisy sum of summary statistics. Quantifying uncertainty allows us to determine which of the regression coefficients are statistically significantly different from zero. We provide a previously unknown tight privacy analysis and experimentally demonstrate that the posteriors obtained from our model, while adhering to strong privacy guarantees, are close to the non-private posteriors.

CR Jul 10, 2020
Differentially private cross-silo federated learning

Mikko A. Heikkilä, Antti Koskela, Kana Shimizu et al.

Strict privacy is of paramount importance in distributed machine learning. Federated learning, with the main idea of communicating only what is needed for learning, has been recently introduced as a general approach for distributed learning to enhance learning and improve security. However, federated learning by itself does not guarantee any privacy for data subjects. To quantify and control how much privacy is compromised in the worst case, we can use differential privacy. In this paper we combine additively homomorphic secure summation protocols with differential privacy in the so-called cross-silo federated learning setting. The goal is to learn complex models like neural networks while guaranteeing strict privacy for the individual data subjects. We demonstrate that our proposed solutions give prediction accuracy that is comparable to the non-distributed setting, and are fast enough to enable learning models with millions of parameters in a reasonable time. To enable learning under strict privacy guarantees that need privacy amplification by subsampling, we present a general algorithm for oblivious distributed subsampling. However, we also argue that when malicious parties are present, a simple approach using distributed Poisson subsampling gives better privacy. Finally, we show that by leveraging random projections we can further scale up our approach to larger models while suffering only a modest performance loss.

ML Jun 12, 2020
Tight Differential Privacy for Discrete-Valued Mechanisms and for the Subsampled Gaussian Mechanism Using FFT

Antti Koskela, Joonas Jälkö, Lukas Prediger et al.

We propose a numerical accountant for evaluating the tight $(\varepsilon,δ)$-privacy loss for algorithms with discrete one dimensional output. The method is based on the privacy loss distribution formalism and it uses the recently introduced fast Fourier transform based accounting technique. We carry out an error analysis of the method in terms of moment bounds of the privacy loss distribution which leads to rigorous lower and upper bounds for the true $(\varepsilon,δ)$-values. As an application, we present a novel approach to accurate privacy accounting of the subsampled Gaussian mechanism. This completes the previously proposed analysis by giving strict lower and upper bounds for the privacy parameters. We demonstrate the performance of the accountant on the binomial mechanism and show that our approach allows decreasing noise variance up to 75 percent at equal privacy compared to existing bounds in the literature. We also illustrate how to compute tight bounds for the exponential mechanism applied to counting queries.
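
A toy version of FFT-based accounting for a discrete mechanism can be sketched in Python with NumPy. The sketch uses binary randomized response, whose privacy loss takes only the two values $\pm\varepsilon_{RR}$, so the $k$-fold composed privacy loss distribution lives exactly on a grid and no discretisation error analysis is needed. It relies on the identity $δ(\varepsilon)=\mathbb{E}_{ω}[(1-e^{\varepsilon-ω})_+]$ for the privacy loss $ω$; the function names are assumptions for this example, and the error bounds and the subsampled Gaussian treatment of the paper are not reproduced.

```python
from math import comb, exp

import numpy as np


def rr_delta_fft(eps_rr: float, k: int, eps: float) -> float:
    """delta(eps) for k-fold composition of binary randomized response."""
    p = exp(eps_rr) / (1.0 + exp(eps_rr))  # probability of a truthful answer
    # One-round privacy loss distribution on an integer grid:
    # index 1 means loss +eps_rr (prob p), index 0 means loss -eps_rr (prob 1-p).
    pmf = np.zeros(k + 1)
    pmf[0], pmf[1] = 1.0 - p, p
    # k-fold self-convolution as a power in the Fourier domain. The composed
    # support has k + 1 points, so length k + 1 avoids wrap-around.
    composed = np.fft.irfft(np.fft.rfft(pmf) ** k, n=k + 1)
    composed = np.clip(composed, 0.0, None)  # remove tiny negative round-off
    losses = eps_rr * (2 * np.arange(k + 1) - k)
    return float(np.sum(composed * np.maximum(1.0 - np.exp(eps - losses), 0.0)))


def rr_delta_exact(eps_rr: float, k: int, eps: float) -> float:
    """Reference value using the binomial form of the composed distribution."""
    p = exp(eps_rr) / (1.0 + exp(eps_rr))
    total = 0.0
    for j in range(k + 1):
        prob = comb(k, j) * p**j * (1.0 - p) ** (k - j)
        loss = eps_rr * (2 * j - k)
        total += prob * max(1.0 - exp(eps - loss), 0.0)
    return total


delta_fft = rr_delta_fft(eps_rr=0.5, k=10, eps=1.0)
delta_ref = rr_delta_exact(eps_rr=0.5, k=10, eps=1.0)
```

For mechanisms whose privacy loss values do not fall on a common grid, the losses have to be rounded to grid points, which is where rigorous lower and upper bounds of the kind described in the abstract become necessary.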

ML Sep 11, 2018
Learning Rate Adaptation for Federated and Differentially Private Learning

Antti Koskela, Antti Honkela

We propose an algorithm for the adaptation of the learning rate for stochastic gradient descent (SGD) that avoids the need for validation set use. The idea for the adaptiveness comes from the technique of extrapolation: to get an estimate for the error against the gradient flow which underlies SGD, we compare the result obtained by one full step and two half-steps. The algorithm is applied in two separate frameworks: federated and differentially private learning. Using examples of deep neural networks we empirically show that the adaptive algorithm is competitive with manually tuned commonly used optimisation methods for differentially private training. We also show that it works robustly in the case of federated learning unlike commonly used optimisation methods.
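
The full-step versus two-half-steps comparison can be sketched in Python on a deterministic toy problem. The sketch uses a textbook step-doubling controller for a first-order method (safety factor 0.9, growth clamped to the range 0.5 to 2, step rejected when the error estimate exceeds the tolerance); these constants, the quadratic objective, and the function names are assumptions for this example and not the update rule of the paper, which works with stochastic and differentially private gradients.

```python
from math import sqrt

# Hypothetical objective f(theta) = 0.5 * (theta_0^2 + 10 * theta_1^2).
CURVATURES = [1.0, 10.0]


def objective(theta):
    return 0.5 * sum(a * t * t for a, t in zip(CURVATURES, theta))


def grad(theta):
    return [a * t for a, t in zip(CURVATURES, theta)]


def gd_step(theta, eta):
    """One explicit Euler step of the gradient flow theta' = -grad f(theta)."""
    return [t - eta * g for t, g in zip(theta, grad(theta))]


def adaptive_step(theta, eta, tol=1e-2):
    """Compare one full step with two half-steps and adapt the learning rate."""
    full = gd_step(theta, eta)
    two_half = gd_step(gd_step(theta, eta / 2.0), eta / 2.0)
    # The difference of the two results estimates the local error.
    err = sqrt(sum((a - b) ** 2 for a, b in zip(full, two_half)))
    # Local error scales like eta^2, hence the square root in the controller.
    factor = 0.9 * sqrt(tol / max(err, 1e-12))
    factor = min(max(factor, 0.5), 2.0)
    accepted = err <= tol
    new_theta = two_half if accepted else theta  # rejected steps keep theta
    return new_theta, eta * factor, err, accepted


def run(num_attempts=300, eta=0.05):
    theta = [1.0, 1.0]
    for _ in range(num_attempts):
        theta, eta, _, _ = adaptive_step(theta, eta)
    return theta, eta


final_theta, final_eta = run()
```

No validation set appears anywhere in the loop: the learning rate is driven only by the discrepancy between the two ways of taking the same step.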