Hongyi Ling

AI
h-index12
12papers
352citations
Novelty48%
AI Score48

12 Papers

LGJul 17, 2023
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems

Xuan Zhang, Limei Wang, Jacob Helwig et al. · cambridge, mit

Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences. Today, AI has started to advance natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from the subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may trigger more community interests and efforts to further advance AI4Science.

99.8LGMar 16Code
Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning

Shubham Parashar, Shurui Gui, Xiner Li et al.

We aim to improve the reasoning capabilities of language models via reinforcement learning (RL). Recent RL post-trained models like DeepSeek-R1 have demonstrated reasoning abilities on mathematical and coding tasks. However, prior studies suggest that using RL alone to improve reasoning on inherently difficult tasks is less effective. Here, we draw inspiration from curriculum learning and propose to schedule tasks from easy to hard (E2H), allowing LLMs to build reasoning skills gradually. Our method is termed E2H Reasoner. Empirically, we observe that, although easy tasks are important initially, fading them out through appropriate scheduling is essential in preventing overfitting. Theoretically, we establish convergence guarantees for E2H Reasoner within an approximate policy iteration framework. We derive finite-sample complexity bounds and show that when tasks are appropriately decomposed and conditioned, learning through curriculum stages requires fewer total samples than direct learning. Experiments across multiple domains show that E2H Reasoner significantly improves the reasoning ability of small LLMs (1.5B to 3B), which otherwise struggle when trained with vanilla RL alone, highlighting the effectiveness of our method. Our code can be found on https://github.com/divelab/E2H-Reasoning.

LGJun 11, 2023
Graph Mixup with Soft Alignments

Hongyi Ling, Zhimeng Jiang, Meng Liu et al.

We study graph data augmentation by mixup, which has been used successfully on images. A key operation of mixup is to compute a convex combination of a pair of inputs. This operation is straightforward for grid-like data, such as images, but challenging for graph data. The key difficulty lies in the fact that different graphs typically have different numbers of nodes, and thus there lacks a node-level correspondence between graphs. In this work, we propose S-Mixup, a simple yet effective mixup method for graph classification by soft alignments. Specifically, given a pair of graphs, we explicitly obtain node-level correspondence via computing a soft assignment matrix to match the nodes between two graphs. Based on the soft assignments, we transform the adjacency and node feature matrices of one graph, so that the transformed graph is aligned with the other graph. In this way, any pair of graphs can be mixed directly to generate an augmented graph. We conduct systematic experiments to show that S-Mixup can improve the performance and generalization of graph neural networks (GNNs) on various graph classification tasks. In addition, we show that S-Mixup can increase the robustness of GNNs against noisy labels.

QUANT-PHJun 15, 2022
Lattice Convolutional Networks for Learning Ground States of Quantum Many-Body Systems

Cong Fu, Xuan Zhang, Huixin Zhang et al.

Deep learning methods have been shown to be effective in representing ground-state wave functions of quantum many-body systems. Existing methods use convolutional neural networks (CNNs) for square lattices due to their image-like structures. For non-square lattices, existing method uses graph neural network (GNN) in which structure information is not precisely captured, thereby requiring additional hand-crafted sublattice encoding. In this work, we propose lattice convolutions in which a set of proposed operations are used to convert non-square lattices into grid-like augmented lattices on which regular convolution can be applied. Based on the proposed lattice convolutions, we design lattice convolutional networks (LCN) that use self-gating and attention mechanisms. Experimental results show that our method achieves performance on par or better than existing methods on spin 1/2 $J_1$-$J_2$ Heisenberg model over the square, honeycomb, triangular, and kagome lattices while without using hand-crafted encoding.

71.0PLMar 20
Sound State Encodings in Translational Separation Logic Verifiers (Extended Version)

Hongyi Ling, Thibault Dardinier, Ellen Arlt et al.

Automated program verifiers are often organized into a front-end, which encodes an input program into an intermediate verification language (IVL), and a back-end, which proves that the IVL program is correct. Soundness of such translational verifiers requires that the back-end verification is sound and that correctness of the IVL program implies correctness of the input program. Existing formalizations for translational verifiers based on separation logic target the former, but support the latter only under the strong assumption that there exists a separation logic for the input program with the same state model as the IVL. This assumption is unrealistic in practice, especially since the state model also defines the supported separation logic resources. We present the first formal framework for proving the soundness of translational separation logic verifiers with non-trivial state encodings. To be applicable to various front-ends and IVLs, our framework only assumes the existence of a homomorphic encoding relation between the front-end and IVL state models. At the core of our framework is a novel condition, backward satisfiability, which is crucial to guarantee the soundness of the front-end translation. We formalize our framework for front-end verifiers based on concurrent separation logic and separation logic IVLs, such as Raven, VeriFast, and Viper. We demonstrate its expressiveness by proving soundness for three common state encodings. Our framework and all proofs are formalized in Isabelle/HOL.

AIFeb 18, 2025
Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights

Shubham Parashar, Blake Olson, Sambhav Khurana et al.

We examine the reasoning and planning capabilities of large language models (LLMs) in solving complex tasks. Recent advances in inference-time techniques demonstrate the potential to enhance LLM reasoning without additional training by exploring intermediate steps during inference. Notably, OpenAI's o1 model shows promising performance through its novel use of multi-step reasoning and verification. Here, we explore how scaling inference-time techniques can improve reasoning and planning, focusing on understanding the tradeoff between computational cost and performance. To this end, we construct a comprehensive benchmark, known as Sys2Bench, and perform extensive experiments evaluating existing inference-time techniques on eleven diverse tasks across five categories, including arithmetic reasoning, logical reasoning, common sense reasoning, algorithmic reasoning, and planning. Our findings indicate that simply scaling inference-time computation has limitations, as no single inference-time technique consistently performs well across all reasoning and planning tasks.

LGFeb 28, 2025
Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation

Keqiang Yan, Xiner Li, Hongyi Ling et al.

We consider the problem of crystal materials generation using language models (LMs). A key step is to convert 3D crystal structures into 1D sequences to be processed by LMs. Prior studies used the crystallographic information framework (CIF) file stream, which fails to ensure SE(3) and periodic invariance and may not lead to unique sequence representations for a given crystal structure. Here, we propose a novel method, known as Mat2Seq, to tackle this challenge. Mat2Seq converts 3D crystal structures into 1D sequences and ensures that different mathematical descriptions of the same crystal are represented in a single unique sequence, thereby provably achieving SE(3) and periodic invariance. Experimental results show that, with language models, Mat2Seq achieves promising performance in crystal structure generation as compared with prior methods.

AIFeb 26, 2025
Complex LLM Planning via Automated Heuristics Discovery

Hongyi Ling, Shubham Parashar, Sambhav Khurana et al.

We consider enhancing large language models (LLMs) for complex planning tasks. While existing methods allow LLMs to explore intermediate steps to make plans, they either depend on unreliable self-verification or external verifiers to evaluate these steps, which demand significant data and computations. Here, we propose automated heuristics discovery (AutoHD), a novel approach that enables LLMs to explicitly generate heuristic functions to guide inference-time search, allowing accurate evaluation of intermediate states. These heuristic functions are further refined through a heuristic evolution process, improving their robustness and effectiveness. Our proposed method requires no additional model training or fine-tuning, and the explicit definition of heuristic functions generated by the LLMs provides interpretability and insights into the reasoning process. Extensive experiments across diverse benchmarks demonstrate significant gains over multiple baselines, including nearly twice the accuracy on some datasets, establishing our approach as a reliable and interpretable solution for complex planning tasks.

AIJun 5, 2025
Toward Greater Autonomy in Materials Discovery Agents: Unifying Planning, Physics, and Scientists

Lianhao Zhou, Hongyi Ling, Keqiang Yan et al.

We aim at designing language agents with greater autonomy for crystal materials discovery. While most of existing studies restrict the agents to perform specific tasks within predefined workflows, we aim to automate workflow planning given high-level goals and scientist intuition. To this end, we propose Materials Agent unifying Planning, Physics, and Scientists, known as MAPPS. MAPPS consists of a Workflow Planner, a Tool Code Generator, and a Scientific Mediator. The Workflow Planner uses large language models (LLMs) to generate structured and multi-step workflows. The Tool Code Generator synthesizes executable Python code for various tasks, including invoking a force field foundation model that encodes physics. The Scientific Mediator coordinates communications, facilitates scientist feedback, and ensures robustness through error reflection and recovery. By unifying planning, physics, and scientists, MAPPS enables flexible and reliable materials discovery with greater autonomy, achieving a five-fold improvement in stability, uniqueness, and novelty rates compared with prior generative models when evaluated on the MP-20 data. We provide extensive experiments across diverse tasks to show that MAPPS is a promising framework for autonomous materials discovery.

AIOct 10, 2025
Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics

Lianhao Zhou, Hongyi Ling, Cong Fu et al.

Computing has long served as a cornerstone of scientific discovery. Recently, a paradigm shift has emerged with the rise of large language models (LLMs), introducing autonomous systems, referred to as agents, that accelerate discovery across varying levels of autonomy. These language agents provide a flexible and versatile framework that orchestrates interactions with human scientists, natural language, computer language and code, and physics. This paper presents our view and vision of LLM-based scientific agents and their growing role in transforming the scientific discovery lifecycle, from hypothesis discovery, experimental design and execution, to result analysis and refinement. We critically examine current methodologies, emphasizing key innovations, practical achievements, and outstanding limitations. Additionally, we identify open research challenges and outline promising directions for building more robust, generalizable, and adaptive scientific agents. Our analysis highlights the transformative potential of autonomous agents to accelerate scientific discovery across diverse domains.

ROOct 31, 2020
Pose Estimation of Specular and Symmetrical Objects

Jiaming Hu, Hongyi Ling, Priyam Parashar et al.

In the robotic industry, specular and textureless metallic components are ubiquitous. The 6D pose estimation of such objects with only a monocular RGB camera is difficult because of the absence of rich texture features. Furthermore, the appearance of specularity heavily depends on the camera viewpoint and environmental light conditions making traditional methods, like template matching, fail. In the last 30 years, pose estimation of the specular object has been a consistent challenge, and most related works require massive knowledge modeling effort for light setups, environment, or the object surface. On the other hand, recent works exhibit the feasibility of 6D pose estimation on a monocular camera with convolutional neural networks(CNNs) however they mostly use opaque objects for evaluation. This paper provides a data-driven solution to estimate the 6D pose of specular objects for grasping them, proposes a cost function for handling symmetry, and demonstrates experimental results showing the system's feasibility.

CRFeb 17, 2020
An Efficient Permissioned Blockchain with Provable Reputation Mechanism

Hongyin Chen, Zhaohua Chen, Yukun Cheng et al.

The design of permissioned blockchains places an access control requirement for members to read, access, and write information over the blockchains. In this paper, we study a hierarchical scenario to include three types of participants: providers, collectors, and governors. To be specific, providers forward transactions, collected from terminals, to collectors; collectors upload received transactions to governors after verifying and labeling them; and governors validate a part of received labeled transactions, pack valid ones into a block, and append a new block on the ledger. Collectors in the hierarchical model play a crucial role in the design: they have connections with both providers and governors, and are responsible for collecting, verifying, and uploading transactions. However, collectors are rational and some of them may behave maliciously (not necessarily for their own benefits). In this paper, we introduce a reputation protocol as a measure of the reliability of collectors in the permissioned blockchain environment. Its objective is to encourage collectors to behave truthfully and, in addition, to reduce the verification cost. The verification cost on provider $p$ is defined as the total number of invalid transactions provided by $p$ and checked by governors. Through theoretical analysis, our protocol with the reputation mechanism has a significant improvement in efficiency. Specifically, the verification loss that governors suffer is proved to be asymptotically $O(\sqrt{T_{total}})$ ($T_{total}$, representing the number of transactions verified by governors and provided by $p$), as long as there exists at least one collector who behaves well. At last, two typical cases where our model can be well applied are also demonstrated.