Neophytos Charalambides

h-index: 4

5 papers

22 citations

Novelty: 25%

AI Score: 33

Ranked #118,233 of 194,257 authors (top 61%); #389 in IT (top 51%)

5 Papers

5.1 | IT | Aug 6, 2023

Gradient Coding with Iterative Block Leverage Score Sampling

Neophytos Charalambides, Mert Pilanci, Alfred Hero

We generalize the leverage score sampling sketch for $\ell_2$-subspace embeddings to accommodate sampling subsets of the transformed data, so that the sketching approach is appropriate for distributed settings. This is then used to derive an approximate coded computing approach for first-order methods, known as gradient coding, to accelerate linear regression in the presence of failures in distributed computational networks, \textit{i.e.}, stragglers. We replicate the data across the distributed network to attain the approximation guarantees through the induced sampling distribution. The main contribution of this work is that it unifies randomized numerical linear algebra with approximate coded computing, while attaining an induced $\ell_2$-subspace embedding through uniform sampling. The transition to uniform sampling is done without applying a random projection, as is required by the subsampled randomized Hadamard transform. Furthermore, by incorporating this technique into coded computing, our scheme becomes an iterative sketching approach for approximately solving linear regression. We also propose weighting when sketching takes place through sampling with replacement, for further compression.
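The idea of sampling row blocks according to their leverage scores can be illustrated with a minimal sketch. This is not the paper's scheme (it omits the data replication, the induced uniform sampling, and the iterative resampling); it only shows plain block leverage score sampling with replacement for a least-squares problem. The function names, the block size, and the toy data are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def block_leverage_scores(A, block_size):
    """Leverage score mass of each consecutive row block of A (hypothetical helper)."""
    n = A.shape[0]
    if n % block_size:
        raise ValueError("number of rows must be divisible by block_size")
    # Thin QR gives an orthonormal basis U for the column space of A.
    U, _ = np.linalg.qr(A)
    row_scores = np.sum(U**2, axis=1)  # row leverage scores; they sum to rank(A)
    return row_scores.reshape(n // block_size, block_size).sum(axis=1)

def block_sampling_sketch(A, b, block_size, n_samples, rng):
    """Sample row blocks with replacement, with probability proportional to block score."""
    scores = block_leverage_scores(A, block_size)
    probs = scores / scores.sum()
    picks = rng.choice(len(probs), size=n_samples, replace=True, p=probs)
    rows_A, rows_b = [], []
    for k in picks:
        block = slice(k * block_size, (k + 1) * block_size)
        # Rescaling so that the sampling matrix S satisfies E[S^T S] = I.
        scale = 1.0 / np.sqrt(n_samples * probs[k])
        rows_A.append(scale * A[block])
        rows_b.append(scale * b[block])
    return np.vstack(rows_A), np.concatenate(rows_b), probs

# Toy, noiseless regression problem (assumed data, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
x_true = np.arange(1.0, 6.0)
b = A @ x_true

SA, Sb, probs = block_sampling_sketch(A, b, block_size=10, n_samples=8, rng=rng)
x_hat = np.linalg.lstsq(SA, Sb, rcond=None)[0]  # solve the smaller sketched problem
```

Because the toy system is noiseless, the sketched solution coincides with `x_true` up to floating-point error; with noisy data the sketched solution would only approximate the full least-squares solution.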

3.3 | IT | Aug 8, 2023

Iterative Sketching for Secure Coded Regression

Neophytos Charalambides, Hessam Mahdavifar, Mert Pilanci et al.

Linear regression is a fundamental and primitive problem in supervised machine learning, with applications ranging from epidemiology to finance. In this work, we propose methods for speeding up distributed linear regression. We do so by leveraging randomized techniques, while also ensuring security and straggler resiliency in asynchronous distributed computing systems. Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, to simultaneously secure the information and reduce the dimension of the regression problem. In our setup, the basis rotation corresponds to an encoded encryption in an approximate gradient coding scheme, and the subsampling corresponds to the responses of the non-straggling servers in the centralized coded computing framework. This results in a distributed, iterative, stochastic approach for matrix compression and steepest descent.

5.3 | DC | May 16

Approximate Distributed Coded Computing: Polynomial Codes and Randomized Sketching

Neophytos Charalambides, Arya Mazumdar

Coded computing is a distributed paradigm that uses coding theory to introduce \textit{redundancy} and overcome bottlenecks in large-scale systems. In the same vein, randomized numerical linear algebra employs probabilistic methods to \textit{compress} and accelerate linear algebraic operations, addressing challenges in high-dimensional data analysis. This article reviews the foundations of both fields and presents distributed schemes that combine techniques from both to speed up optimization and machine learning algorithms, in the presence of slow or non-responsive servers. Along the way, we touch on various related topics and mathematical concepts.

3.3 | IT | Jan 21, 2022

Orthonormal Sketches for Secure Coded Regression

Neophytos Charalambides, Hessam Mahdavifar, Mert Pilanci et al.

In this work, we propose a method for speeding up distributed linear regression while ensuring security. We leverage randomized sketching techniques and improve straggler resilience in asynchronous systems. Specifically, we apply a random orthonormal matrix and then subsample in \textit{blocks}, to simultaneously secure the information and reduce the dimension of the regression problem. In our setup, the transformation corresponds to an encoded encryption in an \textit{approximate} gradient coding scheme, and the subsampling corresponds to the responses of the non-straggling workers in a centralized coded computing network. We focus on the special case of the \textit{Subsampled Randomized Hadamard Transform}, which we generalize to block sampling, and discuss how it can be used to secure the data. We illustrate the performance through numerical experiments.
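The rotate-then-subsample-blocks idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it builds a dense Hadamard matrix rather than using a fast transform, it samples blocks uniformly without replacement, and it makes no attempt to model the security guarantees discussed in the paper. The function names, block size, and toy data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of the n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = np.ones((1, 1))
    base = np.array([[1.0, 1.0], [1.0, -1.0]])
    while H.shape[0] < n:
        H = np.kron(base, H)  # [[H, H], [H, -H]]
    return H

def block_srht(A, block_size, n_blocks_kept, rng):
    """Random signs, normalized Hadamard rotation, then uniform sampling of row blocks."""
    n = A.shape[0]
    if n % block_size:
        raise ValueError("number of rows must be divisible by block_size")
    signs = rng.choice([-1.0, 1.0], size=n)
    # (H / sqrt(n)) @ D is orthonormal, so the rotation preserves the Gram matrix A^T A.
    rotated = (hadamard(n) / np.sqrt(n)) @ (signs[:, None] * A)
    total_blocks = n // block_size
    kept = rng.choice(total_blocks, size=n_blocks_kept, replace=False)
    rows = np.concatenate(
        [np.arange(k * block_size, (k + 1) * block_size) for k in kept]
    )
    # Rescaling so that the sampling matrix S satisfies E[S^T S] = I.
    scale = np.sqrt(total_blocks / n_blocks_kept)
    return scale * rotated[rows], rotated

# Toy data (assumed, for illustration only): 64 rows, blocks of 8 rows, keep 4 of 8 blocks.
rng = np.random.default_rng(1)
A = rng.standard_normal((64, 3))
sketch, rotated = block_srht(A, block_size=8, n_blocks_kept=4, rng=rng)
```

In a coded computing reading of this sketch, each block would sit on one worker, and the blocks that are kept would correspond to the workers that respond in time.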

2.3 | LG | Jul 26, 2020

Dimensionality Reduction for $k$-means Clustering

Neophytos Charalambides

We present a study on how to effectively reduce the dimensions of the $k$-means clustering problem, so that provably accurate approximations are obtained. Four algorithms are presented, two \textit{feature selection} and two \textit{feature extraction} based algorithms, all of which are randomized.