Monika Henzinger

h-index: 44

14 papers

246 citations

Novelty: 59%

AI Score: 47

Ranked #30,920 of 194,257 authors (top 16%); #90 in DS (top 19%)

14 Papers

18.1 LG Nov 9, 2022

Almost Tight Error Bounds on Differentially Private Continual Counting

Monika Henzinger, Jalaj Upadhyay, Sarvagya Upadhyay

The first large-scale deployment of private federated learning uses differentially private counting in the continual release model as a subroutine (Google AI blog titled "Federated Learning with Formal Differential Privacy Guarantees"). In this case, a concrete bound on the error is very relevant to reduce the privacy parameter. The standard mechanism for continual counting is the binary mechanism. We present a novel mechanism and show that its mean squared error is both asymptotically optimal and a factor of 10 smaller than the error of the binary mechanism. We also show that the constants in our analysis are almost tight by giving non-asymptotic lower and upper bounds that differ only in the constants of lower-order terms. Our algorithm is a matrix mechanism for the counting matrix and takes constant time per release. We also use our explicit factorization of the counting matrix to give an upper bound on the excess risk of the private learning algorithm of Denisov et al. (NeurIPS 2022). Our lower bound for any continual counting mechanism is the first tight lower bound on continual counting under approximate differential privacy. It is achieved using a new lower bound on a certain factorization norm, denoted by $γ_F(\cdot)$, in terms of the singular values of the matrix. In particular, we show that for any complex matrix $A \in \mathbb{C}^{m \times n}$, \[ γ_F(A) \geq \frac{1}{\sqrt{m}}\|A\|_1, \] where $\|\cdot\|_1$ denotes the Schatten-1 norm. We believe this technique will be useful in proving lower bounds for a larger class of linear queries. To illustrate the power of this technique, we show the first lower bound on the mean squared error for answering parity queries.
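The abstract above refers to an explicit factorization of the counting matrix without stating the factors. As a minimal sketch, assuming the Toeplitz square-root factorization (an assumption here, not something the abstract spells out), the lower-triangular all-ones matrix can be written as the square of a lower-triangular Toeplitz matrix whose entries are the Taylor coefficients of $(1-x)^{-1/2}$. The helper names are hypothetical, and no noise is added, so this checks only the algebra and says nothing about privacy or error.

```python
from fractions import Fraction

def sqrt_counting_coeffs(n):
    """First n Taylor coefficients of (1 - x)^(-1/2), as exact fractions.

    f(0) = 1 and f(k) = f(k-1) * (2k - 1) / (2k).
    """
    coeffs = [Fraction(1)]
    for k in range(1, n):
        coeffs.append(coeffs[-1] * Fraction(2 * k - 1, 2 * k))
    return coeffs

def toeplitz_lower(coeffs):
    """Lower-triangular Toeplitz matrix with coeffs[i - j] at position (i, j)."""
    n = len(coeffs)
    return [[coeffs[i - j] if i >= j else Fraction(0) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain cubic-time product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 6  # illustrative stream length (assumption)
root = toeplitz_lower(sqrt_counting_coeffs(n))
product = matmul(root, root)
# The counting matrix: entry (i, j) is 1 when j <= i, so row i sums a prefix.
counting = [[Fraction(1) if i >= j else Fraction(0) for j in range(n)]
            for i in range(n)]
```

Exact fractions are used so that the identity `product == counting` can be checked without floating-point tolerance.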

16.0 LG Jul 18, 2023

A Unifying Framework for Differentially Private Sums under Continual Observation

Monika Henzinger, Jalaj Upadhyay, Sarvagya Upadhyay

We study the problem of maintaining a differentially private decaying sum under continual observation. We give a unifying framework and an efficient algorithm for this problem for any sufficiently smooth function. Our algorithm is the first differentially private algorithm that does not have a multiplicative error for polynomially-decaying weights. Our algorithm improves on all prior works on differentially private decaying sums under continual observation and recovers exactly the additive error for the special case of continual counting from Henzinger et al. (SODA 2023) as a corollary. Our algorithm is a variant of the factorization mechanism whose error depends on the $γ_2$ and $γ_F$ norms of the underlying matrix. We give a constructive proof for an almost exact upper bound on the $γ_2$ and $γ_F$ norms and an almost tight lower bound on the $γ_2$ norm for a large class of lower-triangular matrices. This is the first non-trivial lower bound for lower-triangular matrices whose non-zero entries are not all the same. It includes matrices for all continual decaying sums problems, resulting in an upper bound on the additive error of any differentially private decaying sums algorithm under continual observation. We also explore some implications of our result in discrepancy theory and operator algebra. Given the importance of the $γ_2$ norm in computer science and the extensive work in mathematics, we believe our result will have further applications.

3.8 LG Oct 25, 2023

Simple, Scalable and Effective Clustering via One-Dimensional Projections

Moses Charikar, Monika Henzinger, Lunjia Hu et al.

Clustering is a fundamental problem in unsupervised machine learning with many applications in data analysis. Popular clustering algorithms such as Lloyd's algorithm and $k$-means++ can take $Ω(ndk)$ time when clustering $n$ points in a $d$-dimensional space (represented by an $n\times d$ matrix $X$) into $k$ clusters. In applications with moderate to large $k$, the multiplicative $k$ factor can become very expensive. We introduce a simple randomized clustering algorithm that provably runs in expected time $O(\mathrm{nnz}(X) + n\log n)$ for arbitrary $k$. Here $\mathrm{nnz}(X)$ is the total number of non-zero entries in the input dataset $X$, which is upper bounded by $nd$ and can be significantly smaller for sparse datasets. We prove that our algorithm achieves approximation ratio $\widetilde{O}(k^4)$ on any input dataset for the $k$-means objective. We also believe that our theoretical analysis is of independent interest, as we show that the approximation ratio of a $k$-means algorithm is approximately preserved under a class of projections and that $k$-means++ seeding can be implemented in expected $O(n \log n)$ time in one dimension. Finally, we show experimentally that our clustering algorithm gives a new tradeoff between running time and cluster quality compared to previous state-of-the-art methods for these tasks.
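The two ingredients named in the abstract above, a one-dimensional projection followed by $k$-means++ seeding on the projected values, can be sketched as follows. This is a naive illustration under stated assumptions: the Gaussian projection direction and the function names are hypothetical choices, and the seeding loop costs $O(nk)$ time, so it does not reproduce the expected $O(n \log n)$ implementation that the abstract claims.

```python
import random

def project_1d(points, rng):
    """Project each d-dimensional point onto one random Gaussian direction."""
    d = len(points[0])
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return [sum(x * g for x, g in zip(p, direction)) for p in points]

def kmeanspp_seed_1d(values, k, rng):
    """Naive k-means++ (D^2 sampling) seeding on scalars; returns center indices."""
    centers = [rng.randrange(len(values))]
    # Squared distance from every value to its closest chosen center so far.
    dist2 = [(v - values[centers[0]]) ** 2 for v in values]
    while len(centers) < k:
        if sum(dist2) == 0:
            break  # fewer than k distinct values; nothing left to sample
        idx = rng.choices(range(len(values)), weights=dist2, k=1)[0]
        centers.append(idx)
        dist2 = [min(d, (v - values[idx]) ** 2) for d, v in zip(dist2, values)]
    return centers
```

A caller would assign each original point to the center whose projected value is nearest; that assignment step is omitted here.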

8.0 DS Mar 16

Concurrent Composition for Differentially Private Continual Mechanisms

Monika Henzinger, Roodabeh Safavi, Salil Vadhan

Many intended uses of differential privacy involve a continual mechanism that is set up to run continuously over a long period of time, making more statistical releases as either queries come in or the dataset is updated. In this paper, we give the first general treatment of privacy against adaptive adversaries for mechanisms that support dataset updates and a variety of queries, all arbitrarily interleaved. Our treatment also models a very general notion of neighboring that includes both event-level and user-level privacy. We prove several concurrent composition theorems for continual mechanisms, which ensure privacy even when an adversary can interleave queries and dataset updates to the different composed mechanisms. Previous concurrent composition theorems for differential privacy were only for the case when the dataset is static, with no adaptive updates. Moreover, we also give the first interactive and continual generalizations of the "parallel composition theorem" for noninteractive differential privacy. Specifically, we show that the analogue of the noninteractive parallel composition theorem holds if either there are no adaptive dataset updates or each of the composed mechanisms satisfies pure differential privacy, but it fails to hold for composing approximately differentially private mechanisms with dataset updates. We then formalize a set of general conditions on a continual mechanism $M$ that runs multiple continual sub-mechanisms such that the privacy guarantees of $M$ follow directly using the above concurrent composition theorems on the sub-mechanisms, without further privacy loss. This enables us to give a simpler and more modular privacy analysis of a recent continual histogram mechanism of Henzinger, Sricharan, and Steiner. In the case of approximate DP, ours is the first proof showing that its privacy holds against adaptive adversaries.

2.3 DS Jul 7, 2023

Differential Privacy for Clustering Under Continual Observation

Max Dupré la Tour, Monika Henzinger, David Saulpic

We consider the problem of privately clustering a dataset in $\mathbb{R}^d$ that undergoes both insertion and deletion of points. Specifically, we give an $\varepsilon$-differentially private clustering mechanism for the $k$-means objective under continual observation. This is the first approximation algorithm for that problem with an additive error that depends only logarithmically on the number $T$ of updates. The multiplicative error is almost the same as in the non-private setting. To do so, we show how to perform dimension reduction under continual observation and combine it with a differentially private greedy approximation algorithm for $k$-means. We also partially extend our results to the $k$-median problem.

16.4 LG Feb 27, 2024

Data-Efficient Learning via Clustering-Based Sensitivity Sampling: Foundation Models and Beyond

Kyriakos Axiotis, Vincent Cohen-Addad, Monika Henzinger et al.

We study the data selection problem, whose aim is to select a small representative subset of data that can be used to efficiently train a machine learning model. We present a new data selection approach based on $k$-means clustering and sensitivity sampling. Assuming access to an embedding representation of the data with respect to which the model loss is Hölder continuous, our approach provably allows selecting a set of "typical" $k + 1/\varepsilon^2$ elements whose average loss corresponds to the average loss of the whole dataset, up to a multiplicative $(1\pm\varepsilon)$ factor and an additive $\varepsilon λΦ_k$, where $Φ_k$ represents the $k$-means cost for the input embeddings and $λ$ is the Hölder constant. We furthermore demonstrate the performance and scalability of our approach on fine-tuning foundation models and show that it outperforms state-of-the-art methods. We also show how it can be applied to linear regression, leading to a new sampling strategy that surprisingly matches the performance of leverage score sampling, while being conceptually simpler and more scalable.

5.1 DS Apr 6, 2025

Binned Group Algebra Factorization for Differentially Private Continual Counting

Monika Henzinger, Nikita P. Kalinin, Jalaj Upadhyay

We study memory-efficient matrix factorization for differentially private counting under continual observation. While recent work by Henzinger and Upadhyay (2024) introduced a factorization method with reduced error based on group algebra, its practicality in streaming settings remains limited by computational constraints. We present new structural properties of the group algebra factorization, enabling the use of a binning technique from Andersson and Pagh (2024). By grouping similar values in rows, the binning method reduces memory usage and running time to $\tilde O(\sqrt{n})$, where $n$ is the length of the input stream, while maintaining a low error. Our work bridges the gap between theoretical improvements in factorization accuracy and practical efficiency in large-scale private learning systems.

22.6 LG Jun 9, 2025

Krishna Pillutla, Jalaj Upadhyay, Christopher A. Choquette-Choo et al.

This monograph explores the design and analysis of correlated noise mechanisms for differential privacy (DP), focusing on their application to private training of AI and machine learning models via the core primitive of estimation of weighted prefix sums. While typical DP mechanisms inject independent noise into each step of a stochastic gradient descent (SGD) learning algorithm in order to protect the privacy of the training data, a growing body of recent research demonstrates that introducing (anti-)correlations in the noise can significantly improve privacy-utility trade-offs by carefully canceling out, in subsequent steps, some of the noise added in earlier steps. Such correlated noise mechanisms, known variously as matrix mechanisms, factorization mechanisms, and DP-Follow-the-Regularized-Leader (DP-FTRL) when applied to learning algorithms, have also been influential in practice, with industrial deployment at a global scale.

6.6 DS Sep 17, 2025

Normalized Square Root: Sharper Matrix Factorization Bounds for Differentially Private Continual Counting

Monika Henzinger, Nikita P. Kalinin, Jalaj Upadhyay

The factorization norms of the lower-triangular all-ones $n \times n$ matrix, $γ_2(M_{count})$ and $γ_{F}(M_{count})$, play a central role in differential privacy as they are used to give theoretical justification of the accuracy of the only known production-level private training algorithm of deep neural networks by Google. Prior to this work, the best known upper bound on $γ_2(M_{count})$ was $1 + \frac{\log n}π$ by Mathias (Linear Algebra and Applications, 1993), and the best known lower bound was $\frac{1}π(2 + \log(\frac{2n+1}{3})) \approx 0.507 + \frac{\log n}π$ (Matoušek, Nikolov, Talwar, IMRN 2020), where $\log$ denotes the natural logarithm. Recently, Henzinger and Upadhyay (SODA 2025) gave the first explicit factorization that meets the bound of Mathias (1993) and asked whether there exists an explicit factorization that improves on Mathias' bound. We answer this question in the affirmative. Additionally, we improve the lower bound significantly. More specifically, we show that $$ 0.701 + \frac{\log n}π + o(1) \;\leq\; γ_2(M_{count}) \;\leq\; 0.846 + \frac{\log n}π + o(1). $$ That is, we reduce the gap between the upper and lower bound to $0.14 + o(1)$. We also show that our factors achieve a better upper bound for $γ_{F}(M_{count})$ compared to prior work, and we establish an improved lower bound: $$ 0.701 + \frac{\log n}π + o(1) \;\leq\; γ_{F}(M_{count}) \;\leq\; 0.748 + \frac{\log n}π + o(1). $$ That is, the gap between the lower and upper bound provided by our explicit factorization is $0.047 + o(1)$.
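To make the constants in the abstract above concrete, the leading terms of the four bounds on $γ_2(M_{count})$ can be evaluated for a given $n$. This is only arithmetic on the numbers quoted in the abstract; the $o(1)$ terms are dropped (an assumption, so the values are approximations for large $n$), and the function name is hypothetical.

```python
import math

def gamma2_bounds(n):
    """Leading terms of the gamma_2(M_count) bounds quoted above, o(1) terms dropped."""
    base = math.log(n) / math.pi  # natural logarithm, as in the abstract
    return {
        "lower_prior": (2 + math.log((2 * n + 1) / 3)) / math.pi,
        "lower_new": 0.701 + base,
        "upper_new": 0.846 + base,
        "upper_mathias": 1.0 + base,
    }

bounds = gamma2_bounds(10**6)
gap = bounds["upper_new"] - bounds["lower_new"]  # about 0.145 for every n
```

The gap between the new upper and lower leading terms does not depend on $n$, which is why the abstract can state it as a constant plus $o(1)$.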

5.1 DS Jun 17, 2024

Making Old Things New: A Unified Algorithm for Differentially Private Clustering

Max Dupré la Tour, Monika Henzinger, David Saulpic

As a staple of data analysis and unsupervised learning, the problem of private clustering has been widely studied under various privacy models. Centralized differential privacy is the first of them, and the problem has also been studied for the local and the shuffle variations. In each case, the goal is to design an algorithm that privately computes a clustering with the smallest possible error. The study of each variation gave rise to new algorithms: the landscape of private clustering algorithms is therefore quite intricate. In this paper, we show that a 20-year-old algorithm can be slightly modified to work for any of these models. This provides a unified picture: while matching almost all previously known results, it allows us to improve some of them and to extend the approach to a new privacy model, the continual observation setting, where the input is changing over time and the algorithm must output a new solution at each time step.

12.6 DS Jun 28, 2021

Differentially Private Algorithms for Graphs Under Continual Observation

Hendrik Fichtenberger, Monika Henzinger, Lara Ost

Differentially private algorithms protect individuals in data analysis scenarios by ensuring that there is only a weak correlation between the existence of the user in the data and the result of the analysis. Dynamic graph algorithms maintain the solution to a problem (e.g., a matching) on an evolving input, i.e., a graph where nodes or edges are inserted or deleted over time. They output the value of the solution after each update operation, i.e., continuously. We study (event-level and user-level) differentially private algorithms for graph problems under continual observation, i.e., differentially private dynamic graph algorithms. We present event-level private algorithms for partially dynamic counting-based problems such as triangle count that improve the additive error by a polynomial factor (in the length $T$ of the update sequence) on the state of the art, resulting in the first algorithms with additive error polylogarithmic in $T$. We also give $\varepsilon$-differentially private and partially dynamic algorithms for minimum spanning tree, minimum cut, densest subgraph, and maximum matching. The additive error of our improved MST algorithm is $O(W \log^{3/2}T / \varepsilon)$, where $W$ is the maximum weight of any edge, which, as we show, is tight up to a $(\sqrt{\log T} / \varepsilon)$-factor. For the other problems, we present a partially-dynamic algorithm with multiplicative error $(1+β)$ for any constant $β> 0$ and additive error $O(W \log(nW) \log(T) / (\varepsilon β))$. Finally, we show that the additive error for a broad class of dynamic graph algorithms with user-level privacy must be linear in the value of the output solution's range.

2.3 DS Apr 19, 2018

Algorithms and Conditional Lower Bounds for Planning Problems

Krishnendu Chatterjee, Wolfgang Dvořák, Monika Henzinger et al.

We consider planning problems for graphs, Markov decision processes (MDPs), and games on graphs. While graphs represent the most basic planning model, MDPs represent interaction with nature and games on graphs represent interaction with an adversarial environment. We consider two planning problems where there are $k$ different target sets, and the problems are as follows: (a) the coverage problem asks whether there is a plan for each individual target set, and (b) the sequential target reachability problem asks whether the targets can be reached in sequence. For the coverage problem, we present a linear-time algorithm for graphs and a quadratic conditional lower bound for MDPs and games on graphs. For the sequential target problem, we present a linear-time algorithm for graphs, a sub-quadratic algorithm for MDPs, and a quadratic conditional lower bound for games on graphs. Our results with conditional lower bounds establish (i) model-separation results showing that for the coverage problem MDPs and games on graphs are harder than graphs and for the sequential reachability problem games on graphs are harder than MDPs and graphs; (ii) objective-separation results showing that for MDPs the coverage problem is harder than the sequential target problem.
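For the plain graph model, one natural way to decide the coverage problem is a single breadth-first search from the start vertex followed by a check of each target set. The sketch below assumes that "a plan for a target set" means a path from a given start vertex to some vertex of that set; this reading, the adjacency-dict input format, and the function names are assumptions for illustration, and the code is not claimed to be the paper's algorithm.

```python
from collections import deque

def reachable(adj, start):
    """Set of vertices reachable from start in a directed graph (adjacency dict)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def coverage(adj, start, target_sets):
    """True if every target set contains at least one vertex reachable from start."""
    seen = reachable(adj, start)
    return all(any(t in seen for t in targets) for targets in target_sets)
```

The search runs once regardless of $k$, so the cost is linear in the size of the graph plus the total size of the target sets.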

6.0 NE Feb 20, 2018 Code

Memetic Graph Clustering

Sonja Biedermann, Monika Henzinger, Christian Schulz et al.

It is common knowledge that there is no single best strategy for graph clustering, which justifies a plethora of existing approaches. In this paper, we present a general memetic algorithm, VieClus, to tackle the graph clustering problem. This algorithm can be adapted to optimize different objective functions. Key components of our contribution are natural recombine operators that employ ensemble clusterings as well as multi-level techniques. Lastly, we combine these techniques with a scalable communication protocol, producing a system that is able to compute high-quality solutions in a short amount of time. We instantiate our scheme with local search for modularity and show that our algorithm successfully improves or reproduces all entries of the 10th DIMACS implementation challenge under consideration using a small amount of time.
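The modularity objective mentioned in the abstract above can be computed as follows. This is a minimal sketch of the standard modularity formula for an unweighted, undirected graph without self-loops, $Q = \sum_c \left[ e_c/m - (d_c/2m)^2 \right]$, where $e_c$ is the number of edges inside community $c$, $d_c$ is the total degree of its vertices, and $m$ is the number of edges. The input format and function name are assumptions, and this is not VieClus code.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity of a partition; edges is a list of (u, v), community maps vertex -> label."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's vertices
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Two triangles joined by a single bridge edge (illustrative input).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, split)
```

For this input each community has 3 internal edges and total degree 7 out of $m = 7$ edges, so $Q = 2\,(3/7 - 1/4) = 5/14$.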

14.2 DS Jun 19, 2017

Capacity Releasing Diffusion for Speed and Locality

Di Wang, Kimon Fountoulakis, Monika Henzinger et al.

Diffusions and related random walk procedures are of central importance in many areas of machine learning, data analysis, and applied mathematics. Because they spread mass agnostically at each step in an iterative manner, they can sometimes spread mass "too aggressively," thereby failing to find the "right" clusters. We introduce a novel Capacity Releasing Diffusion (CRD) Process, which is both faster and stays more local than the classical spectral diffusion process. As an application, we use our CRD Process to develop an improved local algorithm for graph clustering. Our local graph clustering method can find local clusters in a model of clustering where one begins the CRD Process in a cluster whose vertices are connected better internally than externally by an $O(\log^2 n)$ factor, where $n$ is the number of nodes in the cluster. Thus, our CRD Process is the first local graph clustering algorithm that is not subject to the well-known quadratic Cheeger barrier. Our result requires a certain smoothness condition, which we expect to be an artifact of our analysis. Our empirical evaluation demonstrates improved results, in particular for realistic social graphs where there are moderately good (but not very good) clusters.