CLJun 3
Stepwise Reasoning Enhancement for LLMs via External Subgraph GenerationXin Zhang, Yang Cao, Baoxing Wu et al.
Large language models have shown strong performance in natural language generation and downstream reasoning tasks, but they still struggle with logical consistency, factual grounding, and interpretability in complex multi-step reasoning. To address these limitations, this paper proposes SGR, a stepwise reasoning enhancement framework that integrates large language models with external knowledge graphs through query-relevant subgraph generation. Given an input question, SGR first extracts key entities, relations, and constraints to construct a structured schema, then retrieves compact subgraphs from a knowledge graph using schema-guided querying. The generated subgraphs provide explicit relational evidence that guides the language model through step-by-step reasoning. In addition, SGR combines direct Cypher-based reasoning with collaborative reasoning integration, allowing candidate answers from multiple reasoning paths to be validated and aggregated according to both model confidence and graph consistency. Experiments on benchmark datasets including CWQ, WebQSP, GrailQA, and KQA Pro demonstrate that SGR improves reasoning accuracy and Hits@1 performance over standard prompting and several knowledge-enhanced baselines. Ablation studies further show that schema guidance and Neo4j-based retrieval are both crucial to the effectiveness of the framework. These results indicate that dynamically generated external subgraphs can improve the accuracy, robustness, and interpretability of LLM-based reasoning.
CVAug 16, 2024Code
SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image SegmentationXinyu Xiong, Zihuang Wu, Shuangyi Tan et al.
Image segmentation plays an important role in vision understanding. Recently, the emerging vision foundation models continuously achieved superior performance on various tasks. Following such success, in this paper, we prove that the Segment Anything Model 2 (SAM2) can be a strong encoder for U-shaped segmentation models. We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation. Specifically, SAM2-UNet adopts the Hiera backbone of SAM2 as the encoder, while the decoder uses the classic U-shaped design. Additionally, adapters are inserted into the encoder to allow parameter-efficient fine-tuning. Preliminary experiments on various downstream tasks, such as camouflaged object detection, salient object detection, marine animal segmentation, mirror detection, and polyp segmentation, demonstrate that our SAM2-UNet can simply beat existing specialized state-of-the-art methods without bells and whistles. Project page: \url{https://github.com/WZH0120/SAM2-UNet}.
SYMay 12
Renewable-Colocated Green Hydrogen Production: Optimal Scheduling and ProfitabilitySiying Li, Lang Tong, Timothy Mount et al.
We study the optimal green hydrogen production and energy market participation of a renewable-colocated hydrogen producer (RCHP) that utilizes onsite renewable generation for both hydrogen production and grid services. Under deterministic and stochastic profit-maximization frameworks, we analyze RCHP's multiple market participation models and derive closed-form optimal scheduling policies that dynamically allocate renewable energy to hydrogen production and electricity export to the wholesale market. Analytical characterizations of the RCHP's operating profit and the optimal sizing of renewable and electrolyzer capacities are obtained. We use real-time renewable generation and electricity price data from three independent system operators to evaluate the impacts of market prices and environmental policies on RCHP's profitability.
CLMay 15
SGR: A Stepwise Reasoning Framework for LLMs with External Subgraph GenerationXin Zhang, Yang Cao, Baoxing Wu et al.
Large Language Models (LLMs) have demonstrated strong capabilities across diverse NLP applications, such as translation, text generation, and question answering. Nevertheless, they remain limited in complex settings that demand deep reasoning and logical inference. Since these models are trained on large-scale text corpora, their generation process may still introduce irrelevant, noisy, or factually inconsistent content. To mitigate this problem, we introduce SGR, a stepwise framework that enhances LLM reasoning through external subgraph generation. SGR builds query-specific subgraphs from external knowledge bases and uses their semantic structure to support multi-step inference. By grounding intermediate reasoning steps in structured external knowledge, the framework helps the model concentrate on relevant entities, relations, and supporting evidence. In particular, SGR first constructs a subgraph tailored to the input question. It then guides the model to reason progressively over the generated structure and combines multiple reasoning trajectories to obtain the final prediction. Experimental results across several benchmark datasets show that SGR achieves consistent improvements over competitive baselines, highlighting its value for improving both reasoning accuracy and factual reliability.
CVMay 12
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify ArchitectureHaiwen Diao, Penghao Wu, Hanming Deng et al.
Recent large vision-language models (VLMs) remain fundamentally constrained by a persistent dichotomy: understanding and generation are treated as distinct problems, leading to fragmented architectures, cascaded pipelines, and misaligned representation spaces. We argue that this divide is not merely an engineering artifact, but a structural limitation that hinders the emergence of native multimodal intelligence. Hence, we introduce SenseNova-U1, a native unified multimodal paradigm built upon NEO-unify, in which understanding and generation evolve as synergistic views of a single underlying process. We launch two native unified variants, SenseNova-U1-8B-MoT and SenseNova-U1-A3B-MoT, built on dense (8B) and mixture-of-experts (30B-A3B) understanding baselines, respectively. Designed from first principles, they rival top-tier understanding-only VLMs across text understanding, vision-language perception, knowledge reasoning, agentic decision-making, and spatial intelligence. Meanwhile, they deliver strong semantic consistency and visual fidelity, excelling in conventional or knowledge-intensive any-to-image (X2I) synthesis, complex text-rich infographic generation, and interleaved vision-language generation, with or without think patterns. Beyond performance, we show detailed model design, data preprocessing, pre-/post-training, and inference strategies to support community research. Last but not least, preliminary evidence demonstrates that our models extend beyond perception and generation, performing strongly in vision-language-action (VLA) and world model (WM) scenarios. This points toward a broader roadmap where models do not translate between modalities, but think and act across them in a native manner. Multimodal AI is no longer about connecting separate systems, but about building a unified one and trusting the necessary capabilities to emerge from within.
CLDec 29, 2025
A Stepwise-Enhanced Reasoning Framework for Large Language Models Based on External Subgraph GenerationXin Zhang, Yang Cao, Baoxing Wu et al.
Large Language Models (LLMs) have achieved strong performance across a wide range of natural language processing tasks in recent years, including machine translation, text generation, and question answering. As their applications extend to increasingly complex scenarios, however, LLMs continue to face challenges in tasks that require deep reasoning and logical inference. In particular, models trained on large scale textual corpora may incorporate noisy or irrelevant information during generation, which can lead to incorrect predictions or outputs that are inconsistent with factual knowledge. To address this limitation, we propose a stepwise reasoning enhancement framework for LLMs based on external subgraph generation, termed SGR. The proposed framework dynamically constructs query relevant subgraphs from external knowledge bases and leverages their semantic structure to guide the reasoning process. By performing reasoning in a step by step manner over structured subgraphs, SGR reduces the influence of noisy information and improves reasoning accuracy. Specifically, the framework first generates an external subgraph tailored to the input query, then guides the model to conduct multi step reasoning grounded in the subgraph, and finally integrates multiple reasoning paths to produce the final answer. Experimental results on multiple benchmark datasets demonstrate that SGR consistently outperforms strong baselines, indicating its effectiveness in enhancing the reasoning capabilities of LLMs.
CLFeb 17, 2024
Knowledge Graph Assisted Automatic Sports News WritingYang Cao, Xinyi Chen, Xin Zhang et al.
In this paper, we present a novel method for automatically generating sports news, which employs a unique algorithm that extracts pivotal moments from live text broadcasts and uses them to create an initial draft of the news. This draft is further refined by incorporating key details and background information from a specially designed sports knowledge graph. This graph contains 5,893 entities, which are classified into three distinct conceptual categories, interconnected through four relationship types, and characterized by 27 unique attributes. In addition, we create a multi-stage learning model by combining convolutional neural networks and a transformer encoder. This model expresses entity-task interactions using convolutional neural networks and enriches entity representations in the query set with the transformer encoder. It also includes a processor to compute matching scores for incomplete triples, addressing few-shot knowledge graph completion problem. The efficiency of this approach has been confirmed through both subjective and objective evaluations of 50 selected test cases, demonstrating its capability in revolutionizing the creation of sports news.
OCJul 4, 2025
Energy Management for Renewable-Colocated Artificial Intelligence Data CentersSiying Li, Lang Tong, Timothy D. Mount
We develop an energy management system (EMS) for artificial intelligence (AI) data centers with colocated renewable generation. Under a cost-minimizing framework, the EMS of renewable-colocated data center (RCDC) co-optimizes AI workload scheduling, on-site renewable utilization, and electricity market participation. Within both wholesale and retail market participation models, the economic benefit of the RCDC operation is maximized. Empirical evaluations using real-world traces of electricity prices, data center power consumption, and renewable generation demonstrate significant electricity cost reduction from renewable and AI data center colocations.