CL AI Oct 12, 2025

Quantum NLP models on Natural Language Inference

Ling Sun, Peter Sullivan, Michael Martin, Yun Zhou

arXiv:2510.15972v12.7

Originality: Incremental advance

AI Analysis

This work addresses the challenge of low-resource, structure-sensitive NLP tasks by demonstrating the potential of quantum models, though it is incremental as it builds on existing QNLP frameworks like lambeq and DisCoCat.

The paper tackles the problem of applying quantum natural language processing (QNLP) models to Natural Language Inference (NLI) in few-shot settings, achieving performance comparable to classical baselines with dramatically fewer parameters and up to five orders of magnitude higher per-parameter learning efficiency.

Quantum natural language processing (QNLP) offers a novel approach to semantic modeling by embedding compositional structure directly into quantum circuits. This paper investigates the application of QNLP models to the task of Natural Language Inference (NLI), comparing quantum, hybrid, and classical transformer-based models under a constrained few-shot setting. Using the lambeq library and the DisCoCat framework, we construct parameterized quantum circuits for sentence pairs and train them for both semantic relatedness and inference classification. To assess efficiency, we introduce a novel information-theoretic metric, Information Gain per Parameter (IGPP), which quantifies learning dynamics independent of model size. Our results demonstrate that quantum models achieve performance comparable to classical baselines while operating with dramatically fewer parameters. The quantum models outperform randomly initialized transformers in inference and achieve lower test error on relatedness tasks. Moreover, quantum models exhibit significantly higher per-parameter learning efficiency (up to five orders of magnitude more than classical counterparts), highlighting the promise of QNLP in low-resource, structure-sensitive settings. To address circuit-level isolation and promote parameter sharing, we also propose a novel cluster-based architecture that improves generalization by tying gate parameters to learned word clusters rather than individual tokens.
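The parameter-tying idea in the last sentence of the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the cluster assignments are hand-written here rather than learned, and the names `build_tied_parameters`, `gate_angles`, `word_to_cluster`, and `params_per_word` are hypothetical. It shows only how keying gate angles by cluster instead of by token lets words share parameters and shrinks the parameter count.

```python
import math
import random


def build_tied_parameters(word_to_cluster, params_per_word, seed=0):
    """Allocate one vector of rotation angles per cluster, not per token."""
    rng = random.Random(seed)
    cluster_ids = sorted(set(word_to_cluster.values()))
    return {
        cluster: [rng.uniform(0.0, 2.0 * math.pi) for _ in range(params_per_word)]
        for cluster in cluster_ids
    }


def gate_angles(word, word_to_cluster, cluster_params):
    """Look up a word's gate angles through its cluster id."""
    return cluster_params[word_to_cluster[word]]


# Assumption: hand-assigned clusters stand in for learned word clusters.
word_to_cluster = {"dog": 0, "cat": 0, "runs": 1, "sleeps": 1, "quickly": 2}
params_per_word = 3

cluster_params = build_tied_parameters(word_to_cluster, params_per_word)

# Parameter counts with and without tying.
per_token_count = len(word_to_cluster) * params_per_word
tied_count = sum(len(angles) for angles in cluster_params.values())

# Words in the same cluster resolve to the very same parameter list,
# so an update driven by one word's circuit also affects the other's.
shared = gate_angles("dog", word_to_cluster, cluster_params) is gate_angles(
    "cat", word_to_cluster, cluster_params
)
```

In this toy vocabulary, tying reduces the count from one angle vector per word to one per cluster, and a word seen only at test time would still receive trained angles as long as its cluster appeared during training.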
