Filippo Utro

9.4LGApr 22, 2025

Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.

At the core of the Transformer, the softmax normalizes the attention matrix to be right stochastic. Previous research has shown that this often de-stabilizes training and that enforcing the attention matrix to be doubly stochastic (through Sinkhorn's algorithm) consistently improves performance across different tasks, domains and Transformer flavors. However, Sinkhorn's algorithm is iterative, approximative, non-parametric and thus inflexible w.r.t. the obtained doubly stochastic matrix (DSM). Recently, it has been proven that DSMs can be obtained with a parametric quantum circuit, yielding a novel quantum inductive bias for DSMs with no known classical analogue. Motivated by this, we demonstrate the feasibility of a hybrid classical-quantum doubly stochastic Transformer (QDSFormer) that replaces the softmax in the self-attention layer with a variational quantum circuit. We study the expressive power of the circuit and find that it yields more diverse DSMs that better preserve information than classical operators. Across multiple small-scale object recognition tasks, we find that our QDSFormer consistently surpasses both a standard ViT and other doubly stochastic Transformers. Beyond the Sinkformer, this comparison includes a novel quantum-inspired doubly stochastic Transformer (based on QR decomposition) that can be of independent interest. Our QDSFormer also shows improved training stability and lower performance variation suggesting that it may mitigate the notoriously unstable training of ViTs on small-scale data.

4.1LGJun 2, 2025

Quantum Ensembling Methods for Healthcare and Life Science

Kahn Rhrissorrakrai, Kathleen E. Hamilton, Prerana Bangalore Parthsarathy et al.

Learning on small data is a challenge frequently encountered in many real-world applications. In this work we study how effective quantum ensemble models are when trained on small data problems in healthcare and life sciences. We constructed multiple types of quantum ensembles for binary classification using up to 26 qubits in simulation and 56 qubits on quantum hardware. Our ensemble designs use minimal trainable parameters but require long-range connections between qubits. We tested these quantum ensembles on synthetic datasets and gene expression data from renal cell carcinoma patients with the task of predicting patient response to immunotherapy. From the performance observed in simulation and initial hardware experiments, we demonstrate how quantum embedding structure affects performance and discuss how to extract informative features and build models that can learn and generalize effectively. We present these exploratory results in order to assist other researchers in the design of effective learning on small data using ensembles. Incorporating quantum computing in these data constrained problems offers hope for a wide range of studies in healthcare and life sciences where biological samples are relatively scarce given the feature space to be explored.

Filippo Utro

2 Papers