Yang Duan

h-index: 15
2 papers

LG · Oct 4, 2023
Towards out-of-distribution generalizable predictions of chemical kinetics properties

Zihao Wang, Yongqiang Chen, Yang Duan et al.

Machine Learning (ML) techniques have found applications in estimating chemical kinetic properties. With drug molecules accumulating through "AI4drug discovery", the next imperative is AI-driven design of high-throughput chemical synthesis processes, which requires estimating the properties of unseen reactions involving unexplored molecules. To this end, existing ML approaches for kinetic property prediction must be Out-Of-Distribution (OOD) generalizable. In this paper, we categorize OOD kinetic property prediction into three levels (structure, condition, and mechanism), revealing unique aspects of such problems. Under this framework, we create comprehensive datasets to benchmark (1) state-of-the-art ML approaches for reaction prediction in the OOD setting and (2) state-of-the-art graph OOD methods on kinetic property prediction problems. Our results demonstrate the challenges and opportunities in OOD kinetic property prediction. Our datasets and benchmarks can further support research in this direction.

AI · Mar 3, 2024
Extending Complex Logical Queries on Uncertain Knowledge Graphs

Weizhi Fei, Zihao Wang, Hang Yin et al. · tsinghua

The study of machine learning-based logical query answering enables reasoning with large-scale and incomplete knowledge graphs. This paper advances this area of research by addressing the uncertainty inherent in knowledge. While the uncertain nature of knowledge is widely recognized in the real world, it does not align seamlessly with the first-order logic that underpins existing studies. To bridge this gap, we explore soft queries on uncertain knowledge, inspired by the framework of soft constraint programming. We propose a neural symbolic approach that incorporates both forward inference and backward calibration to answer soft queries on large-scale, incomplete, and uncertain knowledge graphs. Theoretical discussions demonstrate that our method avoids catastrophic cascading errors in forward inference while maintaining the same complexity as state-of-the-art symbolic methods for complex logical queries. Empirical results validate the superior performance of our backward calibration compared to extended query embedding methods and neural symbolic approaches.