23.1LGMay 19
Fast Tensorization of Neural Networks via Slice-wise Feature DistillationSafa Hamreras, Sukhbinder Singh, Román Orús
We propose a scalable tensorization framework for neural network compression based on slice-wise feature distillation. Unlike conventional tensor decomposition methods that rely on costly global finetuning, our approach decomposes the network into slices consisting of either individual layers or blocks (e.g., convolutional layers or MLPs), or small groups of consecutive layers, and tensorizes each slice independently to reproduce the intermediate representations of the original pretrained model. This modular strategy improves accuracy recovery, reduces data requirements, and enables efficient parallel optimization. Experiments on ResNet-34 show significant gains over conventional global tensorization, achieving near-lossless compression at moderate compression rates with faster optimization. Results on GPT-2 XL further demonstrate the scalability of the method and its applicability to large-scale models, particularly in distributed settings.
QUANT-PHMar 22, 2024
Image Classification with Rotation-Invariant Variational Quantum CircuitsPaul San Sebastian, Mikel Cañizo, Román Orús
Variational quantum algorithms are gaining attention as an early application of Noisy Intermediate-Scale Quantum (NISQ) devices. One of the main problems of variational methods lies in the phenomenon of Barren Plateaus, present in the optimization of variational parameters. Adding geometric inductive bias to the quantum models has been proposed as a potential solution to mitigate this problem, leading to a new field called Geometric Quantum Machine Learning. In this work, an equivariant architecture for variational quantum classifiers is introduced to create a label-invariant model for image classification with $C_4$ rotational label symmetry. The equivariant circuit is benchmarked against two different architectures, and it is experimentally observed that the geometric approach boosts the model's performance. Finally, a classical equivariant convolution operation is proposed to extend the quantum model for the processing of larger images, employing the resources available in NISQ devices.
AIDec 18, 2025
Scaling Laws for Energy Efficiency of Local LLMsAnder Alvarez, Alessandro Genuardi, Nilotpal Sinha et al.
Deploying local large language models and vision-language models on edge devices requires balancing accuracy with constrained computational and energy budgets. Although graphics processors dominate modern artificial-intelligence deployment, most consumer hardware--including laptops, desktops, industrial controllers, and embedded systems--relies on central processing units. Despite this, the computational laws governing central-processing-unit-only inference for local language and vision-language workloads remain largely unexplored. We systematically benchmark large language and vision-language models on two representative central-processing-unit tiers widely used for local inference: a MacBook Pro M2, reflecting mainstream laptop-class deployment, and a Raspberry Pi 5, representing constrained, low-power embedded settings. Using a unified methodology based on continuous sampling of processor and memory usage together with area-under-curve integration, we characterize how computational load scales with input text length for language models and with image resolution for vision-language models. We uncover two empirical scaling laws: (1) computational cost for language-model inference scales approximately linearly with token length; and (2) vision-language models exhibit a preprocessing-driven "resolution knee", where compute remains constant above an internal resolution clamp and decreases sharply below it. Beyond these laws, we show that quantum-inspired compression reduces processor and memory usage by up to 71.9% and energy consumption by up to 62%, while preserving or improving semantic accuracy. These results provide a systematic quantification of multimodal central-processing-unit-only scaling for local language and vision-language workloads, and they identify model compression and input-resolution preprocessing as effective, low-cost levers for sustainable edge inference.
LGMay 23, 2025
A tensor network approach for chaotic time series predictionRodrigo Martínez-Peña, Román Orús
Making accurate predictions of chaotic time series is a complex challenge. Reservoir computing, a neuromorphic-inspired approach, has emerged as a powerful tool for this task. It exploits the memory and nonlinearity of dynamical systems without requiring extensive parameter tuning. However, selecting and optimizing reservoir architectures remains an open problem. Next-generation reservoir computing simplifies this problem by employing nonlinear vector autoregression based on truncated Volterra series, thereby reducing hyperparameter complexity. Nevertheless, the latter suffers from exponential parameter growth in terms of the maximum monomial degree. Tensor networks offer a promising solution to this issue by decomposing multidimensional arrays into low-dimensional structures, thus mitigating the curse of dimensionality. This paper explores the application of a previously proposed tensor network model for predicting chaotic time series, demonstrating its advantages in terms of accuracy and computational efficiency compared to conventional echo state networks. Using a state-of-the-art tensor network approach enables us to bridge the gap between the tensor network and reservoir computing communities, fostering advances in both fields.
LGNov 26, 2025
Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixingRoman Rausch, David Jansen, Sukhbinder Singh et al.
Large Language Models (LLMs) are very demanding in terms of their computational resources. Low-rank decompositions of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach for LLM compression, but presents several practical hurdles, e.g. selecting appropriate layer-wise ranks and getting rid of its parameter redundancy. In this work, we present two physics-inspired improvements to SVD LLM compression: (1) \textbf{FermiGrad}, a gradient-descent algorithm that determines globally optimal layer-wise ranks by relaxing the discrete singular-value truncation into a continuous optimization using the Fermi function; (2) \textbf{PivGa}, an additional \textit{lossless} compression of the low-rank factors that exploits the intrinsic gauge freedom in their parametrization.
NIAug 31, 2025
Quantum-based QoE Optimization in Advanced Cellular Networks: Integration and Cloud Gaming Use CaseFatma Chaouech, Javier Villegas, António Pereira et al.
This work explores the integration of Quantum Machine Learning (QML) and Quantum-Inspired (QI) techniques for optimizing end-to-end (E2E) network services in telecommunication systems, particularly focusing on 5G networks and beyond. The application of QML and QI algorithms is investigated, comparing their performance with classical Machine Learning (ML) approaches. The present study employs a hybrid framework combining quantum and classical computing leveraging the strengths of QML and QI, without the penalty of quantum hardware availability. This is particularized for the optimization of the Quality of Experience (QoE) over cellular networks. The framework comprises an estimator for obtaining the expected QoE based on user metrics, service settings, and cell configuration, and an optimizer that uses the estimation to choose the best cell and service configuration. Although the approach is applicable to any QoE-based network management, its implementation is particularized for the optimization of network configurations for Cloud Gaming services. Then, it is evaluated via performance metrics such as accuracy and model loading and inference times for the estimator, and time to solution and solution score for the optimizer. The results indicate that QML models achieve similar or superior accuracy to classical ML models for estimation, while decreasing inference and loading times. Furthermore, potential for better performance is observed for higher-dimensional data, highlighting promising results for higher complexity problems. Thus, the results demonstrate the promising potential of QML in advancing network optimization, although challenges related to data availability and integration complexities between quantum and classical ML are identified as future research lines.
LGAug 12, 2025
Blockchain Network Analysis using Quantum Inspired Graph Neural Networks & Ensemble ModelsLuigi D'Amico, Daniel De Rosso, Ninad Dixit et al.
In the rapidly evolving domain of financial technology, the detection of illicit transactions within blockchain networks remains a critical challenge, necessitating robust and innovative solutions. This work proposes a novel approach by combining Quantum Inspired Graph Neural Networks (QI-GNN) with flexibility of choice of an Ensemble Model using QBoost or a classic model such as Random Forrest Classifier. This system is tailored specifically for blockchain network analysis in anti-money laundering (AML) efforts. Our methodology to design this system incorporates a novel component, a Canonical Polyadic (CP) decomposition layer within the graph neural network framework, enhancing its capability to process and analyze complex data structures efficiently. Our technical approach has undergone rigorous evaluation against classical machine learning implementations, achieving an F2 score of 74.8% in detecting fraudulent transactions. These results highlight the potential of quantum-inspired techniques, supplemented by the structural advancements of the CP layer, to not only match but potentially exceed traditional methods in complex network analysis for financial security. The findings advocate for a broader adoption and further exploration of quantum-inspired algorithms within the financial sector to effectively combat fraud.
CVJul 10, 2025
Lightweight Cloud Masking Models for On-Board Inference in Hyperspectral ImagingMazen Ali, António Pereira, Fabio Gentile et al.
Cloud and cloud shadow masking is a crucial preprocessing step in hyperspectral satellite imaging, enabling the extraction of high-quality, analysis-ready data. This study evaluates various machine learning approaches, including gradient boosting methods such as XGBoost and LightGBM as well as convolutional neural networks (CNNs). All boosting and CNN models achieved accuracies exceeding 93%. Among the investigated models, the CNN with feature reduction emerged as the most efficient, offering a balance of high accuracy, low storage requirements, and rapid inference times on both CPUs and GPUs. Variations of this version, with only up to 597 trainable parameters, demonstrated the best trade-off in terms of deployment feasibility, accuracy, and computational efficiency. These results demonstrate the potential of lightweight artificial intelligence (AI) models for real-time hyperspectral image processing, supporting the development of on-board satellite AI systems for space-based applications.
LGDec 27, 2024
Ultralight Signal Classification Model for Automatic Modulation RecognitionAlessandro Daniele Genuardi Oquendo, Agustín Matías Galante Cerviño, Nilotpal Kanti Sinha et al.
The growing complexity of radar signals demands responsive and accurate detection systems that can operate efficiently on resource-constrained edge devices. Existing models, while effective, often rely on substantial computational resources and large datasets, making them impractical for edge deployment. In this work, we propose an ultralight hybrid neural network optimized for edge applications, delivering robust performance across unfavorable signal-to-noise ratios (mean accuracy of 96.3% at 0 dB) using less than 100 samples per class, and significantly reducing computational overhead.