Antonio J. Peña

h-index22

3papers

58citations

Novelty48%

AI Score34

Ranked #111,997 of 194,257 authors (top 58%)#2,732 in CR (top 40%)

3 Papers

4.3QUANT-PHAug 6, 2025

Dynamic Solutions for Hybrid Quantum-HPC Resource Allocation

Roberto Rocco, Simone Rizzo, Matteo Barbieri et al.

The integration of quantum computers within classical High-Performance Computing (HPC) infrastructures is receiving increasing attention, with the former expected to serve as accelerators for specific computational tasks. However, combining HPC and quantum computers presents significant technical challenges, including resource allocation. This paper presents a novel malleability-based approach, alongside a workflow-based strategy, to optimize resource utilization in hybrid HPC-quantum workloads. With both these approaches, we can release classical resources when computations are offloaded to the quantum computer and reallocate them once quantum processing is complete. Our experiments with a hybrid HPC-quantum use case show the benefits of dynamic allocation, highlighting the potential of those solutions.

3.3DCMar 30, 2021

cuConv: A CUDA Implementation of Convolution for CNN Inference

Marc Jordà, Pedro Valero-Lara, Antonio J. Peña

Convolutions are the core operation of deep learning applications based on Convolutional Neural Networks (CNNs). Current GPU architectures are highly efficient for training and deploying deep CNNs, and hence, these are largely used in production for this purpose. State-of-the-art implementations, however, present a lack of efficiency for some commonly used network configurations. In this paper we propose a GPU-based implementation of the convolution operation for CNN inference that favors coalesced accesses, without requiring prior data transformations. Our experiments demonstrate that our proposal yields notable performance improvements in a range of common CNN forward propagation convolution configurations, with speedups of up to 2.29x with respect to the best implementation of convolution in cuDNN, hence covering a relevant region in currently existing approaches.

16.0CRMar 30, 2021

Enabling Homomorphically Encrypted Inference for Large DNN Models

Guillermo Lloret-Talavera, Marc Jorda, Harald Servat et al.

The proliferation of machine learning services in the last few years has raised data privacy concerns. Homomorphic encryption (HE) enables inference using encrypted data but it incurs 100x-10,000x memory and runtime overheads. Secure deep neural network (DNN) inference using HE is currently limited by computing and memory resources, with frameworks requiring hundreds of gigabytes of DRAM to evaluate small models. To overcome these limitations, in this paper we explore the feasibility of leveraging hybrid memory systems comprised of DRAM and persistent memory. In particular, we explore the recently-released Intel Optane PMem technology and the Intel HE-Transformer nGraph to run large neural networks such as MobileNetV2 (in its largest variant) and ResNet-50 for the first time in the literature. We present an in-depth analysis of the efficiency of the executions with different hardware and software configurations. Our results conclude that DNN inference using HE incurs on friendly access patterns for this memory configuration, yielding efficient executions.