Zujie Xie

AIJul 29, 2025

LUMIR: an LLM-Driven Unified Agent Framework for Multi-task Infrared Spectroscopy Reasoning

Zujie Xie, Zixuan Chen, Jiheng Liang et al.

Infrared spectroscopy enables rapid, non destructive analysis of chemical and material properties, yet high dimensional signals and overlapping bands hinder conventional chemometric methods. Large language models (LLMs), with strong generalization and reasoning capabilities, offer new opportunities for automated spectral interpretation, but their potential in this domain remains largely untapped. This study introduces LUMIR (LLM-driven Unified agent framework for Multi-task Infrared spectroscopy Reasoning), an agent based framework designed to achieve accurate infrared spectral analysis under low data conditions. LUMIR integrates a structured literature knowledge base, automated preprocessing, feature extraction, and predictive modeling into a unified pipeline. By mining peer reviewed spectroscopy studies, it identifies validated preprocessing and feature derivation strategies, transforms spectra into low dimensional representations, and applies few-shot prompts for classification, regression, and anomaly detection. The framework was validated on diverse datasets, including the publicly available Milk near-infrared dataset, Chinese medicinal herbs, Citri Reticulatae Pericarpium(CRP) with different storage durations, an industrial wastewater COD dataset, and two additional public benchmarks, Tecator and Corn. Across these tasks, LUMIR achieved performance comparable to or surpassing established machine learning and deep learning models, particularly in resource limited settings. This work demonstrates that combining structured literature guidance with few-shot learning enables robust, scalable, and automated spectral interpretation. LUMIR establishes a new paradigm for applying LLMs to infrared spectroscopy, offering high accuracy with minimal labeled data and broad applicability across scientific and industrial domains.

LGJun 21, 2025

Towards a Unified Textual Graph Framework for Spectral Reasoning via Physical and Chemical Information Fusion

Jiheng Liang, Ziru Yu, Zujie Xie et al.

Motivated by the limitations of current spectral analysis methods-such as reliance on single-modality data, limited generalizability, and poor interpretability-we propose a novel multi-modal spectral analysis framework that integrates prior knowledge graphs with Large Language Models. Our method explicitly bridges physical spectral measurements and chemical structural semantics by representing them in a unified Textual Graph format, enabling flexible, interpretable, and generalizable spectral understanding. Raw spectra are first transformed into TAGs, where nodes and edges are enriched with textual attributes describing both spectral properties and chemical context. These are then merged with relevant prior knowledge-including functional groups and molecular graphs-to form a Task Graph that incorporates "Prompt Nodes" supporting LLM-based contextual reasoning. A Graph Neural Network further processes this structure to complete downstream tasks. This unified design enables seamless multi-modal integration and automated feature decoding with minimal manual annotation. Our framework achieves consistently high performance across multiple spectral analysis tasks, including node-level, edge-level, and graph-level classification. It demonstrates robust generalization in both zero-shot and few-shot settings, highlighting its effectiveness in learning from limited data and supporting in-context reasoning. This work establishes a scalable and interpretable foundation for LLM-driven spectral analysis, unifying physical and chemical modalities for scientific applications.

Zujie Xie

2 Papers