Yunqing Li

AI
h-index32
6papers
99citations
Novelty52%
AI Score38

6 Papers

MMMar 7, 2022
A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification

Qing Wang, Jun Du, Siyuan Zheng et al. · gatech, nvidia

In this paper, we propose two techniques, namely joint modeling and data augmentation, to improve system performances for audio-visual scene classification (AVSC). We employ pre-trained networks trained only on image data sets to extract video embedding; whereas for audio embedding models, we decide to train them from scratch. We explore different neural network architectures for joint modeling to effectively combine the video and audio modalities. Moreover, data augmentation strategies are investigated to increase audio-visual training set size. For the video modality the effectiveness of several operations in RandAugment is verified. An audio-video joint mixup scheme is proposed to further improve AVSC performances. Evaluated on the development set of TAU Urban Audio Visual Scenes 2021, our final system can achieve the best accuracy of 94.2% among all single AVSC systems submitted to DCASE 2021 Task 1b.

AIApr 9, 2024
Building A Knowledge Graph to Enrich ChatGPT Responses in Manufacturing Service Discovery

Yunqing Li, Binil Starly

Sourcing and identification of new manufacturing partners is crucial for manufacturing system integrators to enhance agility and reduce risk through supply chain diversification in the global economy. The advent of advanced large language models has captured significant interest, due to their ability to generate comprehensive and articulate responses across a wide range of knowledge domains. However, the system often falls short in accuracy and completeness when responding to domain-specific inquiries, particularly in areas like manufacturing service discovery. This research explores the potential of leveraging Knowledge Graphs in conjunction with ChatGPT to streamline the process for prospective clients in identifying small manufacturing enterprises. In this study, we propose a method that integrates bottom-up ontology with advanced machine learning models to develop a Manufacturing Service Knowledge Graph from an array of structured and unstructured data sources, including the digital footprints of small-scale manufacturers throughout North America. The Knowledge Graph and the learned graph embedding vectors are leveraged to tackle intricate queries within the digital supply chain network, responding with enhanced reliability and greater interpretability. The approach highlighted is scalable to millions of entities that can be distributed to form a global Manufacturing Service Knowledge Network Graph that can potentially interconnect multiple types of Knowledge Graphs that span industry sectors, geopolitical boundaries, and business domains. The dataset developed for this study, now publicly accessible, encompasses more than 13,000 manufacturers' weblinks, manufacturing services, certifications, and location entity types.

LGMar 25, 2024
Manufacturing Service Capability Prediction with Graph Neural Networks

Yunqing Li, Xiaorui Liu, Binil Starly

In the current landscape, the predominant methods for identifying manufacturing capabilities from manufacturers rely heavily on keyword matching and semantic matching. However, these methods often fall short by either overlooking valuable hidden information or misinterpreting critical data. Consequently, such approaches result in an incomplete identification of manufacturers' capabilities. This underscores the pressing need for data-driven solutions to enhance the accuracy and completeness of manufacturing capability identification. To address the need, this study proposes a Graph Neural Network-based method for manufacturing service capability identification over a knowledge graph. To enhance the identification performance, this work introduces a novel approach that involves aggregating information from the graph nodes' neighborhoods as well as oversampling the graph data, which can be effectively applied across a wide range of practical scenarios. Evaluations conducted on a Manufacturing Service Knowledge Graph and subsequent ablation studies demonstrate the efficacy and robustness of the proposed approach. This study not only contributes a innovative method for inferring manufacturing service capabilities but also significantly augments the quality of Manufacturing Service Knowledge Graphs.

CLDec 12, 2024
OG-RAG: Ontology-Grounded Retrieval-Augmented Generation For Large Language Models

Kartik Sharma, Peeyush Kumar, Yunqing Li

This paper presents OG-RAG, an Ontology-Grounded Retrieval Augmented Generation method designed to enhance LLM-generated responses by anchoring retrieval processes in domain-specific ontologies. While LLMs are widely used for tasks like question answering and search, they struggle to adapt to specialized knowledge, such as industrial workflows or knowledge work, without expensive fine-tuning or sub-optimal retrieval methods. Existing retrieval-augmented models, such as RAG, offer improvements but fail to account for structured domain knowledge, leading to suboptimal context generation. Ontologies, which conceptually organize domain knowledge by defining entities and their interrelationships, offer a structured representation to address this gap. OG-RAG constructs a hypergraph representation of domain documents, where each hyperedge encapsulates clusters of factual knowledge grounded using domain-specific ontology. An optimization algorithm then retrieves the minimal set of hyperedges that constructs a precise, conceptually grounded context for the LLM. This method enables efficient retrieval while preserving the complex relationships between entities. OG-RAG applies to domains where fact-based reasoning is essential, particularly in tasks that require workflows or decision-making steps to follow predefined rules and procedures. These include industrial workflows in healthcare, legal, and agricultural sectors, as well as knowledge-driven tasks such as news journalism, investigative research, consulting and more. Our evaluations demonstrate that OG-RAG increases the recall of accurate facts by 55% and improves response correctness by 40% across four different LLMs. Additionally, OG-RAG enables 30% faster attribution of responses to context and boosts fact-based reasoning accuracy by 27% compared to baseline methods.

AISep 22, 2025
Towards General Computer Control with Hierarchical Agents and Multi-Level Action Spaces

Zihan Dong, Xinyu Fan, Zixiang Tang et al.

Controlling desktop applications via software remains a fundamental yet under-served problem. Existing multi-modal large language models (MLLMs) ingest screenshots and task instructions to generate keystrokes and mouse events, but they suffer from prohibitive inference latency, poor sample efficiency on long-horizon sparse-reward tasks, and infeasible on-device deployment. We introduce a lightweight hierarchical reinforcement learning framework, ComputerAgent, that formulates OS control as a two-level option process (manager and subpolicy), employs a triple-modal state encoder (screenshot, task ID, numeric state) to handle visual and contextual diversity, integrates meta-actions with an early-stop mechanism to reduce wasted interactions, and uses a compact vision backbone plus small policy networks for on-device inference (15M parameters). On a suite of 135 real-world desktop tasks, ComputerAgent attains 92.1% success on simple tasks (<8 steps) and 58.8% on hard tasks (>=8 steps), matching or exceeding 200B-parameter MLLM baselines on simple scenarios while reducing model size by over four orders of magnitude and halving inference time. These results demonstrate that hierarchical RL offers a practical, scalable alternative to monolithic MLLM-based automation for computer control.

LGAug 11, 2025
C-MAG: Cascade Multimodal Attributed Graphs for Supply Chain Link Prediction

Yunqing Li, Zixiang Tang, Jiaying Zhuang et al.

Workshop version accepted at KDD 2025 (AI4SupplyChain). Connecting an ever-expanding catalogue of products with suitable manufacturers and suppliers is critical for resilient, efficient global supply chains, yet traditional methods struggle to capture complex capabilities, certifications, geographic constraints, and rich multimodal data of real-world manufacturer profiles. To address these gaps, we introduce PMGraph, a public benchmark of bipartite and heterogeneous multimodal supply-chain graphs linking 8,888 manufacturers, over 70k products, more than 110k manufacturer-product edges, and over 29k product images. Building on this benchmark, we propose the Cascade Multimodal Attributed Graph C-MAG, a two-stage architecture that first aligns and aggregates textual and visual attributes into intermediate group embeddings, then propagates them through a manufacturer-product hetero-graph via multiscale message passing to enhance link prediction accuracy. C-MAG also provides practical guidelines for modality-aware fusion, preserving predictive performance in noisy, real-world settings.