DCSep 29, 2023
Parallel Computation of Multi-Slice Clustering of Third-Order TensorsDina Faneva Andriantsiory, Camille Coti, Joseph Ben Geloun et al.
Machine Learning approaches like clustering methods deal with massive datasets that present an increasing challenge. We devise parallel algorithms to compute the Multi-Slice Clustering (MSC) for 3rd-order tensors. The MSC method is based on spectral analysis of the tensor slices and works independently on each tensor mode. Such features fit well in the parallel paradigm via a distributed memory system. We show that our parallel scheme outperforms sequential computing and allows for the scalability of the MSC method.
DCAug 2, 2016
Parametric, Probabilistic, Timed Resource Discovery SystemCamille Coti
This paper presents a fully distributed resource discovery and reservation system. Verification of such a system is important to ensure the execution of distributed applications on a set of resources in appropriate conditions. A semi-formal model for his system is presented using probabilistic timed automata. This model is timed, parametric and probabilistic, making it a challenge to the parameter synthesis community.
DCDec 14, 2009
QR Factorization of Tall and Skinny Matrices in a Grid Computing EnvironmentEmmanuel Agullo, Camille Coti, Jack Dongarra et al.
Previous studies have reported that common dense linear algebra operations do not achieve speed up by using multiple geographical sites of a computational grid. Because such operations are the building blocks of most scientific applications, conventional supercomputers are still strongly predominant in high-performance computing and the use of grids for speeding up large-scale scientific problems is limited to applications exhibiting parallelism at a higher level. We have identified two performance bottlenecks in the distributed memory algorithms implemented in ScaLAPACK, a state-of-the-art dense linear algebra library. First, because ScaLAPACK assumes a homogeneous communication network, the implementations of ScaLAPACK algorithms lack locality in their communication pattern. Second, the number of messages sent in the ScaLAPACK algorithms is significantly greater than other algorithms that trade flops for communication. In this paper, we present a new approach for computing a QR factorization -- one of the main dense linear algebra kernels -- of tall and skinny matrices in a grid computing environment that overcomes these two bottlenecks. Our contribution is to articulate a recently proposed algorithm (Communication Avoiding QR) with a topology-aware middleware (QCG-OMPI) in order to confine intensive communications (ScaLAPACK calls) within the different geographical sites. An experimental study conducted on the Grid'5000 platform shows that the resulting performance increases linearly with the number of geographical sites on large-scale problems (and is in particular consistently higher than ScaLAPACK's).