8.8QUANT-PHApr 4
GPU-Accelerated Quantum Simulation: Empirical Backend Selection, Gate Fusion, and Adaptive PrecisionPoornima Kumaresan, Pavithra Muruganantham, Lakshmi Rajendran et al.
Classical simulation of quantum circuits remains indispensable for algorithm development, hardware validation, and error analysis in the noisy intermediate-scale quantum (NISQ) era. However, state-vector simulation faces exponential memory scaling, with an n-qubit system requiring O(2^n) complex amplitudes, and existing simulators often lack the flexibility to exploit heterogeneous computing resources at runtime. This paper presents a GPU-accelerated quantum circuit simulation framework that introduces three contributions: (1) an empirical backend selection algorithm that benchmarks CuPy, PyTorch-CUDA, and NumPy-CPU backends at runtime and selects the optimal execution path based on measured throughput; (2) a directed acyclic graph (DAG) based gate fusion engine that reduces circuit depth through automated identification of fusible gate sequences, coupled with adaptive precision switching between complex64 and complex128 representations; and (3) a memory-aware fallback mechanism that monitors GPU memory consumption and gracefully degrades to CPU execution when resources are exhausted. The framework integrates with Qiskit, Cirq, PennyLane, and Amazon Braket through a unified adapter layer. Benchmarks on an NVIDIA A100-SXM4 (40 GiB) GPU demonstrate speedups of 64x to 146x over NumPy CPU execution for state-vector simulation of circuits with 20 to 28 qubits, with speedups exceeding 5x from 16 qubits onward. Hardware validation on an IBM quantum processing unit (QPU) confirms Bell state fidelity of 0.939, a five-qubit Greenberger-Horne-Zeilinger (GHZ) state fidelity of 0.853, and circuit depth reduction from 42 to 14 gates through the fusion pipeline. The system is designed for portability across NVIDIA consumer and data-center GPUs, requiring no vendor-specific compilation steps.
45.4ETApr 6
Eliminating Vendor Lock-In in Quantum Machine Learning via Framework-Agnostic Neural NetworksPoornima Kumaresan, Shwetha Singaravelu, Lakshmi Rajendran et al.
Quantum machine learning (QML) stands at the intersection of quantum computing and artificial intelligence, offering the potential to solve problems that remain intractable for classical methods. However, the current landscape of QML software frameworks suffers from severe fragmentation: models developed in TensorFlow Quantum cannot execute on PennyLane backends, circuits authored in Qiskit Machine Learning cannot be deployed to Amazon Braket hardware, and researchers who invest in one ecosystem face prohibitive switching costs when migrating to another. This vendor lock-in impedes reproducibility, limits hardware access, and slows the pace of scientific discovery. In this paper, we present a framework-agnostic quantum neural network (QNN) architecture that abstracts away vendor-specific interfaces through a unified computational graph, a hardware abstraction layer (HAL), and a multi-framework export pipeline. The core architecture supports simultaneous integration with TensorFlow, PyTorch, and JAX as classical co-processors, while the HAL provides transparent access to IBM Quantum, Amazon Braket, Azure Quantum, IonQ, and Rigetti backends through a single application programming interface (API). We introduce three pluggable data encoding strategies (amplitude, angle, and instantaneous quantum polynomial encoding) that are compatible with all supported backends. An export module leveraging Open Neural Network Exchange (ONNX) metadata enables lossless circuit translation across Qiskit, Cirq, PennyLane, and Braket representations. We benchmark our framework on the Iris, Wine, and MNIST-4 classification tasks, demonstrating training time parity (within 8\% overhead) compared to native framework implementations, while achieving identical classification accuracy.
10.0ETApr 5
Adaptive Tensor Network Simulation via Entropy-Feedback PID Control and GPU-Accelerated SVDHarshni Kumaresan, Gayathri Muruganantham, Lakshmi Rajendran et al.
Tensor network methods, particularly those based on Matrix Product States (MPS), provide a powerful framework for simulating quantum many-body systems. A persistent computational challenge in these methods is the selection of the bond dimension chi, which controls the trade-off between accuracy and computational cost. Fixed bond dimension strategies either waste resources in low-entanglement regions or lose fidelity in high-entanglement regions. This work introduces an adaptive bond dimension management framework that uses von Neumann entropy feedback coupled with a Proportional-Integral-Derivative (PID) controller to dynamically adjust chi at each bond during simulation. An Exponential Moving Average (EMA) filter stabilizes entropy measurements against transient fluctuations, and a predictive scheduling module anticipates future bond dimension requirements from entropy trends. The per-bond granularity of the allocation ensures that computational resources concentrate where entanglement is largest. The framework integrates GPU-accelerated Singular Value Decomposition (SVD) via CuPy and the cuSOLVER backend, achieving individual SVD speedups of 4.1x at chi=256 and 7.1x at chi=2048 relative to CPU-based NumPy for isolated matrix factorisations (measured on an NVIDIA A100-SXM4-40GB GPU with CuPy 13.4.1 and CUDA 12.8). At the system level, benchmarks on the spin-1/2 antiferromagnetic Heisenberg chain demonstrate a 2.7x reduction in total DMRG wall time compared to fixed-chi simulations, with energy accuracy within 0.1% of the Bethe ansatz solution. Integration with the Density Matrix Renormalization Group (DMRG) algorithm yields ground-state energies per site converging to E/N = -0.4432 for the isotropic Heisenberg model at chi = 128. Validation against Amazon Web Services (AWS) Braket SV1 statevector simulator confirms agreement within 2-5% for small systems.