Alex Kalinowski

AI
4papers
23citations
Novelty29%
AI Score21

4 Papers

AISep 20, 2023
Knowledge Graph Question Answering for Materials Science (KGQA4MAT): Developing Natural Language Interface for Metal-Organic Frameworks Knowledge Graph (MOF-KG) Using LLM

Yuan An, Jane Greenberg, Alex Kalinowski et al.

We present a comprehensive benchmark dataset for Knowledge Graph Question Answering in Materials Science (KGQA4MAT), with a focus on metal-organic frameworks (MOFs). A knowledge graph for metal-organic frameworks (MOF-KG) has been constructed by integrating structured databases and knowledge extracted from the literature. To enhance MOF-KG accessibility for domain experts, we aim to develop a natural language interface for querying the knowledge graph. We have developed a benchmark comprised of 161 complex questions involving comparison, aggregation, and complicated graph structures. Each question is rephrased in three additional variations, resulting in 644 questions and 161 KG queries. To evaluate the benchmark, we have developed a systematic approach for utilizing the LLM, ChatGPT, to translate natural language questions into formal KG queries. We also apply the approach to the well-known QALD-9 dataset, demonstrating ChatGPT's potential in addressing KGQA issues for different platforms and query languages. The benchmark and the proposed approach aim to stimulate further research and development of user-friendly and efficient interfaces for querying domain-specific materials science knowledge graphs, thereby accelerating the discovery of novel materials.

AIJul 10, 2022
Building Open Knowledge Graph for Metal-Organic Frameworks (MOF-KG): Challenges and Case Studies

Yuan An, Jane Greenberg, Xintong Zhao et al.

Metal-Organic Frameworks (MOFs) are a class of modular, porous crystalline materials that have great potential to revolutionize applications such as gas storage, molecular separations, chemical sensing, catalysis, and drug delivery. The Cambridge Structural Database (CSD) reports 10,636 synthesized MOF crystals which in addition contains ca. 114,373 MOF-like structures. The sheer number of synthesized (plus potentially synthesizable) MOF structures requires researchers pursue computational techniques to screen and isolate MOF candidates. In this demo paper, we describe our effort on leveraging knowledge graph methods to facilitate MOF prediction, discovery, and synthesis. We present challenges and case studies about (1) construction of a MOF knowledge graph (MOF-KG) from structured and unstructured sources and (2) leveraging the MOF-KG for discovery of new or missing knowledge.

AIJul 22, 2022
Exploring Wasserstein Distance across Concept Embeddings for Ontology Matching

Yuan An, Alex Kalinowski, Jane Greenberg

Measuring the distance between ontological elements is fundamental for ontology matching. String-based distance metrics are notorious for shallow syntactic matching. In this exploratory study, we investigate Wasserstein distance targeting continuous space that can incorporate various types of information. We use a pre-trained word embeddings system to embed ontology element labels. We examine the effectiveness of Wasserstein distance for measuring similarity between ontologies, and discovering and refining matchings between individual elements. Our experiments with the OAEI conference track and MSE benchmarks achieved competitive results compared to the leading systems.

AIAug 20, 2024
Is the Lecture Engaging for Learning? Lecture Voice Sentiment Analysis for Knowledge Graph-Supported Intelligent Lecturing Assistant (ILA) System

Yuan An, Samarth Kolanupaka, Jacob An et al.

This paper introduces an intelligent lecturing assistant (ILA) system that utilizes a knowledge graph to represent course content and optimal pedagogical strategies. The system is designed to support instructors in enhancing student learning through real-time analysis of voice, content, and teaching methods. As an initial investigation, we present a case study on lecture voice sentiment analysis, in which we developed a training set comprising over 3,000 one-minute lecture voice clips. Each clip was manually labeled as either engaging or non-engaging. Utilizing this dataset, we constructed and evaluated several classification models based on a variety of features extracted from the voice clips. The results demonstrate promising performance, achieving an F1-score of 90% for boring lectures on an independent set of over 800 test voice clips. This case study lays the groundwork for the development of a more sophisticated model that will integrate content analysis and pedagogical practices. Our ultimate goal is to aid instructors in teaching more engagingly and effectively by leveraging modern artificial intelligence techniques.