Mohammad-Amin Charusaie

LG
4 papers
88 citations
Novelty 53%
AI Score 28

4 Papers

LG · Jul 19, 2022
Sample Efficient Learning of Predictors that Complement Humans

Mohammad-Amin Charusaie, Hussein Mozannar, David Sontag et al. · microsoft-research

One of the goals of learning algorithms is to complement and reduce the burden on human decision makers. The expert deferral setting, wherein an algorithm can either predict on its own or defer the decision to a downstream expert, helps accomplish this goal. A fundamental aspect of this setting is the need to learn complementary predictors that improve on the human's weaknesses rather than learning predictors optimized for average error. In this work, we provide the first theoretical analysis of the benefit of learning complementary predictors in expert deferral. To enable efficiently learning such predictors, we consider a family of consistent surrogate loss functions for expert deferral and analyze their theoretical properties. Finally, we design active learning schemes that require a minimal amount of human expert prediction data in order to learn accurate deferral systems.
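
As a rough illustration of the deferral setting described in this abstract (not the paper's method), the sketch below scores a human-AI team in which each example is answered either by the model or by the expert. The function name `team_accuracy` and the toy data are hypothetical and chosen only for this example.

```python
from typing import Sequence


def team_accuracy(
    labels: Sequence[int],
    model_preds: Sequence[int],
    expert_preds: Sequence[int],
    defer: Sequence[bool],
) -> float:
    """Fraction of examples the team gets right.

    For each example the final answer is the expert's prediction when
    defer is True, and the model's prediction otherwise.
    """
    n = len(labels)
    if n == 0 or not (len(model_preds) == len(expert_preds) == len(defer) == n):
        raise ValueError("all inputs must be non-empty and of equal length")
    correct = 0
    for y, m, h, d in zip(labels, model_preds, expert_preds, defer):
        final = h if d else m  # route the decision
        correct += int(final == y)
    return correct / n


# Toy data (made up): the model and the expert err on different examples.
labels = [0, 1, 1, 0]
model_preds = [0, 0, 1, 1]
expert_preds = [1, 1, 1, 0]
defer = [False, True, False, True]

print(team_accuracy(labels, model_preds, expert_preds, defer))
```

In this toy data the model and the expert make mistakes on different examples, so a deferral rule that routes each example to the stronger party scores higher than either does alone. That is the sense of "complementary" used above.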

LG · Jul 17, 2024
A Unifying Post-Processing Framework for Multi-Objective Learn-to-Defer Problems

Mohammad-Amin Charusaie, Samira Samadi

Learn-to-Defer is a paradigm that enables learning algorithms to work not in isolation but as a team with human experts. In this paradigm, we permit the system to defer a subset of its tasks to the expert. Although there are currently systems that follow this paradigm and are designed to optimize the accuracy of the final human-AI team, the general methodology for developing such systems under a set of constraints (e.g., algorithmic fairness, expert intervention budget, deferral of anomalies, etc.) remains largely unexplored. In this paper, using a $d$-dimensional generalization of the fundamental lemma of Neyman and Pearson (d-GNP), we obtain the Bayes optimal solution for learn-to-defer systems under various constraints. Furthermore, we design a generalizable algorithm to estimate that solution and apply this algorithm to the COMPAS and ACSIncome datasets. Our algorithm shows improvements in terms of constraint violation over a set of baselines.

LG · Jun 9, 2021
Hermite Polynomial Features for Private Data Generation

Margarita Vinaroz, Mohammad-Amin Charusaie, Frederik Harder et al.

Kernel mean embedding is a useful tool to represent and compare probability measures. Despite its usefulness, kernel mean embedding considers infinite-dimensional features, which are challenging to handle in the context of differentially private data generation. A recent work proposes to approximate the kernel mean embedding of the data distribution using finite-dimensional random features, which yields analytically tractable sensitivity. However, the number of required random features is excessively high, often ten thousand to a hundred thousand, which worsens the privacy-accuracy trade-off. To improve the trade-off, we propose to replace random features with Hermite polynomial features. Unlike the random features, the Hermite polynomial features are ordered, where the features at the low orders contain more information on the distribution than those at the high orders. Hence, a relatively low order of Hermite polynomial features can more accurately approximate the mean embedding of the data distribution compared to a significantly higher number of random features. As demonstrated on several tabular and image datasets, Hermite polynomial features seem better suited for private data generation than random Fourier features.

IT · Jan 12, 2020
Compressibility Measures for Affinely Singular Random Vectors

Mohammad-Amin Charusaie, Arash Amini, Stefano Rini

There are several ways to measure the compressibility of a random measure; they include general approaches such as using the rate-distortion curve, as well as more specific notions, such as the Rényi information dimension (RID). The RID parameter indicates the concentration of the measure around lower-dimensional subsets of the space. While the evaluation of such compressibility parameters is well-studied for continuous and discrete measures, the case of discrete-continuous measures is quite subtle. In this paper, we focus on a class of multi-dimensional random measures that have singularities on affine lower-dimensional subsets. This class of distributions naturally arises when considering linear transformations of component-wise independent discrete-continuous random variables. To measure the compressibility of such distributions, we introduce the new notion of dimensional-rate bias (DRB), which is closely related to the entropy and differential entropy in discrete and continuous cases, respectively. Similar to entropy and differential entropy, DRB is useful in evaluating the mutual information between distributions of the aforementioned type. Besides the DRB, we also evaluate the RID of these distributions. We further provide an upper bound for the RID of multi-dimensional random measures that are obtained by Lipschitz functions of component-wise independent discrete-continuous random variables ($\mathbf{X}$). The upper bound is shown to be achievable when the Lipschitz function is $A \mathbf{X}$, where $A$ satisfies $\operatorname{spark}(A_{m\times n}) = m+1$ (e.g., Vandermonde matrices). When considering discrete-domain moving-average processes with non-Gaussian excitation noise, the above results allow us to evaluate the block-average RID and DRB, as well as to determine a relationship between these parameters and other existing compressibility measures.
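
For orientation, the standard scalar definition of the RID (due to Rényi, and not specific to this paper) can be written as follows, where $H$ denotes Shannon entropy and the limit is assumed to exist:

$$
d(X) = \lim_{m \to \infty} \frac{H(\lfloor mX \rfloor)}{\log m}.
$$

For a scalar discrete-continuous mixture with law $\alpha \mu_c + (1-\alpha)\mu_d$, where $\mu_c$ is absolutely continuous and $\mu_d$ is discrete, Rényi's classical result gives $d(X) = \alpha$ provided $H(\lfloor X \rfloor) < \infty$. The symbols $\alpha$, $\mu_c$, and $\mu_d$ are notation introduced here for illustration and do not come from the abstract.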