LG · Apr 2, 2022
Forestry digital twin with machine learning in Landsat 7 data
Xuetao Jiang, Meiyu Jiang, YuChun Gou et al.
Modeling forests using historical data allows for more accurate evolution analysis, thus providing an important basis for other studies. As a recognized and effective tool, remote sensing plays an important role in forestry analysis. We can use it to derive information about the forest, including tree type, coverage and canopy density. There are many forest time series modeling studies using statistical values, but few using remote sensing images. The image prediction digital twin is an implementation of the digital twin that aims to predict future images based on historical data. In this paper, we propose an LSTM-based digital twin approach for forest modeling, using Landsat 7 remote sensing images spanning 20 years. The experimental results show that the prediction twin method in this paper can effectively predict future images of the study area.
LG · Jan 14, 2023
Day-Ahead PV Power Forecasting Based on MSTL-TFT
Xuetao Jiang, Meiyu Jiang, Qingguo Zhou
In recent years, renewable energy resources have accounted for an increasing share of electrical energy. Among them, photovoltaic (PV) power generation has received broad attention due to its economic and environmental benefits. Accurate PV generation forecasts can reduce power dispatch from the grid, thus increasing the supplier's profit in the day-ahead electricity market. The power system of a PV site is affected by solar radiation, PV plant properties and meteorological factors, resulting in uncertainty in its power output. This study used multiple seasonal-trend decomposition using LOESS (MSTL) and the temporal fusion transformer (TFT) to perform day-ahead PV prediction on the Desert Knowledge Australia Solar Centre (DKASC) dataset. We compare the decomposition algorithms (VMD, EEMD and VMD-EEMD) and prediction models (BP, LSTM, XGBoost, etc.) that are currently common in PV prediction. The results show that the MSTL-TFT method is more accurate than the aforementioned methods and noticeably improves on other recent day-ahead PV predictions on the DKASC dataset.
LG · Feb 2, 2023
A novel automatic wind power prediction framework based on multi-time scale and temporal attention mechanisms
Meiyu Jiang, Jun Shen, Xuetao Jiang et al.
Wind energy is a widely distributed, renewable, and environmentally friendly energy source that plays a crucial role in mitigating global warming and addressing energy shortages. Nevertheless, wind power generation is characterized by volatility, intermittence, and randomness, which hinder its ability to serve as a reliable power source for the grid. Accurate wind power forecasting is crucial for developing a new power system that heavily relies on renewable energy sources. However, traditional wind power forecasting systems primarily focus on ultra-short-term or short-term forecasts, limiting their ability to address the diverse adjustment requirements of the power system simultaneously. To overcome these challenges, we propose an automatic framework capable of forecasting wind power across multiple time scales. The framework is based on the tree-structured Parzen estimator (TPE) and the temporal fusion transformer (TFT) and can provide ultra-short-term, short-term and medium-term wind power forecasts. Our approach employs the TFT for wind power forecasting and categorizes features based on their properties. Additionally, we introduce a generic algorithm to simultaneously fine-tune the hyperparameters of the decomposition method and model. We evaluate the performance of our framework by conducting ablation experiments using three commonly used decomposition algorithms and six state-of-the-art models for multi-time-scale forecasting. The experimental results demonstrate that our proposed method considerably improves prediction accuracy on the public dataset Engie https://opendata-renewables.engie.com. Compared to the second-best state-of-the-art model, our approach exhibits a reduction of 31.75% and 28.74% in normalized mean absolute error (nMAE) for 24-hour forecasting, and 20.79% and 16.93% in nMAE for 48-hour forecasting, respectively.
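The nMAE figures quoted above can be read with the help of a small sketch. The abstract does not say which normalizing constant is used, so dividing by rated capacity is an assumption here, and the function names `nmae` and `relative_reduction` are hypothetical helpers, not part of the paper.

```python
from typing import Sequence


def nmae(actual: Sequence[float], predicted: Sequence[float], capacity: float) -> float:
    """Mean absolute error divided by a normalizing constant.

    Assumption: the constant is the rated capacity of the wind farm;
    other definitions normalize by the mean or range of the observations.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / capacity


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    observed = [10.0, 20.0, 30.0]
    forecast = [12.0, 18.0, 33.0]
    print(nmae(observed, forecast, capacity=100.0))
    print(relative_reduction(baseline=0.10, improved=0.08))
```

A reported "reduction of 31.75% in nMAE" is a relative figure of the second kind: it compares two nMAE values rather than stating either one.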
AI · Dec 23, 2025
Bohrium + SciMaster: Building the Infrastructure and Ecosystem for Agentic Science at Scale
Linfeng Zhang, Siheng Chen, Yuzhu Cai et al.
AI agents are emerging as a practical way to run multi-step scientific workflows that interleave reasoning with tool use and verification, pointing to a shift from isolated AI-assisted steps toward agentic science at scale. This shift is increasingly feasible, as scientific tools and models can be invoked through stable interfaces and verified with recorded execution traces, and increasingly necessary, as AI accelerates scientific output and stresses the peer-review and publication pipeline, raising the bar for traceability and credible evaluation. However, scaling agentic science remains difficult: workflows are hard to observe and reproduce; many tools and laboratory systems are not agent-ready; execution is hard to trace and govern; and prototype AI Scientist systems are often bespoke, limiting reuse and systematic improvement from real workflow signals. We argue that scaling agentic science requires an infrastructure-and-ecosystem approach, instantiated in Bohrium+SciMaster. Bohrium acts as a managed, traceable hub for AI4S assets (akin to a HuggingFace of AI for Science) that turns diverse scientific data, software, compute, and laboratory systems into agent-ready capabilities. SciMaster orchestrates these capabilities into long-horizon scientific workflows, on which scientific agents can be composed and executed. Between infrastructure and orchestration, a scientific intelligence substrate organizes reusable models, knowledge, and components into executable building blocks for workflow reasoning and action, enabling composition, auditability, and improvement through use. We demonstrate this stack with eleven representative master agents in real workflows, achieving orders-of-magnitude reductions in end-to-end scientific cycle time and generating execution-grounded signals from real workloads at multi-million scale.
AI · Feb 3
KANFIS: A Neuro-Symbolic Framework for Interpretable and Uncertainty-Aware Learning
Binbin Yong, Haoran Pei, Jun Shen et al.
The Adaptive Neuro-Fuzzy Inference System (ANFIS) was designed to combine the learning capabilities of neural networks with the reasoning transparency of fuzzy logic. However, conventional ANFIS architectures suffer from structural complexity, where the product-based inference mechanism causes an exponential explosion of rules in high-dimensional spaces. We herein propose the Kolmogorov-Arnold Neuro-Fuzzy Inference System (KANFIS), a compact neuro-symbolic architecture that unifies fuzzy reasoning with additive function decomposition. KANFIS employs an additive aggregation mechanism, under which both model parameters and rule complexity scale linearly with input dimensionality rather than exponentially. Furthermore, KANFIS is compatible with both Type-1 (T1) and Interval Type-2 (IT2) fuzzy logic systems, enabling explicit modeling of uncertainty and ambiguity in fuzzy representations. By using sparse masking mechanisms, KANFIS generates compact and structured rule sets, resulting in an intrinsically interpretable model with clear rule semantics and transparent inference processes. Empirical results demonstrate that KANFIS achieves competitive performance against representative neural and neuro-fuzzy baselines.
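The exponential-versus-linear contrast in this abstract can be illustrated with a little arithmetic. In a grid-partitioned, product-based ANFIS with m membership functions on each of d inputs, every combination of membership functions forms a rule, giving m^d rules. The additive count of m times d used below is an assumption about how linear scaling could arise (one univariate rule per membership function per input); the paper's exact count may differ.

```python
def product_rule_count(num_inputs: int, mfs_per_input: int) -> int:
    """Rules in a grid-partitioned, product-based fuzzy system: m ** d."""
    return mfs_per_input ** num_inputs


def additive_rule_count(num_inputs: int, mfs_per_input: int) -> int:
    """Assumed rule count under additive aggregation: m * d."""
    return mfs_per_input * num_inputs


if __name__ == "__main__":
    m = 3  # membership functions per input (illustrative choice)
    for d in (2, 5, 10, 20):
        print(d, product_rule_count(d, m), additive_rule_count(d, m))
```

With three membership functions per input, ten inputs already give 59049 product rules against 30 additive ones under this assumption.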
CL · Feb 27
Full-Stack Domain Enhancement for Combustion LLMs: Construction and Optimization
Quanjia Xiao, Weimin Ouyang, Zonglin Yang et al.
Adapting large language models (LLMs) to professional fields and enhancing their domain capabilities shows significant application potential. Nevertheless, for complex physical systems such as combustion science, general-purpose LLMs often generate severe hallucinations due to insufficient domain knowledge and the inability to adhere to physical conservation laws. To address this issue, we propose the first full-stack domain-enhanced LLM workflow tailored for the field of combustion science, which integrates automated domain corpus construction, incremental pre-training, instruction fine-tuning, and verifiable reward-based reinforcement learning. This workflow ensures that the model truly internalizes physical laws rather than merely learning textual statistical patterns. We also release FlameBench, a standardized evaluation benchmark specifically designed for complex reasoning tasks in combustion science. Experimental results demonstrate that the model developed in this work significantly outperforms state-of-the-art general-purpose closed-source models and traditional retrieval-augmented generation methods on combustion science reasoning tasks. This work lays a solid technical and resource foundation for the subsequent development of domain-specific scientific research agents with reliable scientific reasoning capabilities.
AI · Feb 5
Traceable Cross-Source RAG for Chinese Tibetan Medicine Question Answering
Fengxian Chen, Zhilong Tao, Jiaxuan Li et al.
Retrieval-augmented generation (RAG) promises grounded question answering, yet domain settings with multiple heterogeneous knowledge bases (KBs) remain challenging. In Chinese Tibetan medicine, encyclopedia entries are often dense and easy to match, which can dominate retrieval even when classics or clinical papers provide more authoritative evidence. We study a practical setting with three KBs (encyclopedia, classics, and clinical papers) and a 500-query benchmark (cutoff K = 5) covering both single-KB and cross-KB questions. We propose two complementary methods to improve traceability, reduce hallucinations, and enable cross-KB verification. First, DAKS performs KB routing and budgeted retrieval to mitigate density-driven bias and to prioritize authoritative sources when appropriate. Second, we use an alignment graph to guide evidence fusion and coverage-aware packing, improving cross-KB evidence coverage without relying on naive concatenation. All answers are generated by a lightweight generator, openPangu-Embedded-7B. Experiments show consistent gains in routing quality and cross-KB evidence coverage, with the full system achieving the best CrossEv@5 while maintaining strong faithfulness and citation correctness.
CV · Dec 17, 2024
MapExpert: Online HD Map Construction with Simple and Efficient Sparse Map Element Expert
Dapeng Zhang, Dayu Chen, Peng Zhi et al.
Constructing online High-Definition (HD) maps is crucial for the static environment perception of autonomous driving systems (ADS). Existing solutions typically attempt to detect vectorized HD map elements with unified models; however, these methods often overlook the distinct characteristics of different non-cubic map elements, making accurate distinction challenging. To address these issues, we introduce an expert-based online HD map method, termed MapExpert. MapExpert utilizes sparse experts, distributed by our routers, to describe various non-cubic map elements accurately. Additionally, we propose an auxiliary balance loss function to distribute the load evenly across experts. Furthermore, we theoretically analyze the limitations of prevalent bird's-eye view (BEV) feature temporal fusion methods and introduce an efficient temporal fusion module called Learnable Weighted Moving Descentage. This module effectively integrates relevant historical information into the final BEV features. Combined with an enhanced slice head branch, the proposed MapExpert achieves state-of-the-art performance and maintains good efficiency on both nuScenes and Argoverse2 datasets.
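The abstract does not give the form of its auxiliary balance loss. As a rough illustration of the general idea, the sketch below uses a load-balancing loss of the kind popularized by the Switch Transformer, which is a swapped-in technique and not necessarily what MapExpert uses. The function names and the top-1 routing rule are assumptions made for this example.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one row of router logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balance_loss(router_logits: List[List[float]]) -> float:
    """Switch-style auxiliary loss: N * sum_i f_i * P_i.

    f_i is the fraction of queries whose top-1 expert is i, and
    P_i is the mean router probability assigned to expert i.
    The value is 1.0 for perfectly uniform routing and grows as
    routing collapses onto fewer experts.
    """
    num_queries = len(router_logits)
    num_experts = len(router_logits[0])
    probs = [softmax(row) for row in router_logits]

    counts = [0] * num_experts
    for row in probs:
        counts[row.index(max(row))] += 1  # top-1 assignment
    fractions = [c / num_queries for c in counts]

    mean_probs = [
        sum(row[i] for row in probs) / num_queries for i in range(num_experts)
    ]
    return num_experts * sum(f * p for f, p in zip(fractions, mean_probs))


if __name__ == "__main__":
    balanced = [[5.0, 0.0], [0.0, 5.0]]   # each expert gets one query
    collapsed = [[5.0, 0.0], [5.0, 0.0]]  # expert 0 gets everything
    print(balance_loss(balanced), balance_loss(collapsed))
```

Adding a term like this to the training objective penalizes routers that send most map-element queries to a single expert.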
RO · Sep 23, 2025
Pure Vision Language Action (VLA) Models: A Comprehensive Survey
Dapeng Zhang, Jing Sun, Chenghui Hu et al.
The emergence of Vision Language Action (VLA) models marks a paradigm shift from traditional policy-based control to generalized robotics, reframing Vision Language Models (VLMs) from passive sequence generators into active agents for manipulation and decision-making in complex, dynamic environments. This survey examines advanced VLA methods, aiming to provide a clear taxonomy and a systematic, comprehensive review of existing research. It analyzes VLA applications across different scenarios and classifies VLA approaches into several paradigms (autoregression-based, diffusion-based, reinforcement-based, hybrid, and specialized methods), examining their motivations, core strategies, and implementations in detail. In addition, foundational datasets, benchmarks, and simulation platforms are introduced. Building on the current VLA landscape, the review further proposes perspectives on key challenges and future directions to advance research in VLA models and generalizable robotics. By synthesizing insights from over three hundred recent studies, this survey maps the contours of this rapidly evolving field and highlights the opportunities and challenges that will shape the development of scalable, general-purpose VLA methods.
AI · Apr 18, 2024
DST-GTN: Dynamic Spatio-Temporal Graph Transformer Network for Traffic Forecasting
Songtao Huang, Hongjin Song, Tianqi Jiang et al.
Accurate traffic forecasting is essential for effective urban planning and congestion management. Deep learning (DL) approaches have achieved great success in traffic forecasting but still face challenges in capturing the intricacies of traffic dynamics. In this paper, we identify and address these challenges by emphasizing that spatial features are inherently dynamic and change over time. A novel in-depth feature representation, called Dynamic Spatio-Temporal (Dyn-ST) features, is introduced, which encapsulates spatial characteristics across varying times. Moreover, a Dynamic Spatio-Temporal Graph Transformer Network (DST-GTN) is proposed, which captures Dyn-ST features and other dynamic adjacency relations between intersections. The DST-GTN can model dynamic ST relationships between nodes accurately and refine the representation of global and local ST characteristics by adopting adaptive weights in low-pass and all-pass filters, enabling the extraction of Dyn-ST features from traffic time-series data. In numerical experiments on public datasets, the DST-GTN achieves state-of-the-art performance for a range of traffic forecasting tasks and demonstrates enhanced stability.
CV · Nov 25, 2025
Reasoning-VLA: A Fast and General Vision-Language-Action Reasoning Model for Autonomous Driving
Dapeng Zhang, Zhenlong Yuan, Zhangquan Chen et al.
Vision-Language-Action (VLA) models have recently shown strong decision-making capabilities in autonomous driving. However, existing VLAs often struggle to achieve efficient inference and to generalize to novel autonomous vehicle configurations and driving scenarios. In this paper, we propose Reasoning-VLA, a general and fast action-generation VLA framework. The proposed model employs a set of learnable action queries, initialized via Gaussian sampling from ground-truth trajectories within the training corpus. These learnable queries interact with reasoning-enhanced vision-language features to generate continuous action trajectories in parallel. To promote robust generalization, we consolidate eight publicly available autonomous driving datasets into a standardized, Chain-of-Thought reasoning-based, and easy-to-use data format for model training. Trained with both supervised learning and reinforcement learning fine-tuning, Reasoning-VLA achieves state-of-the-art performance, superior generalization capability, and excellent inference speed in extensive empirical evaluations across multiple benchmarks.
CV · Nov 2, 2021
Boundary Distribution Estimation for Precise Object Detection
Peng Zhi, Haoran Zhou, Hang Huang et al.
In the field of state-of-the-art object detection, the task of object localization is typically accomplished through a dedicated subnet that emphasizes bounding box regression. This subnet traditionally predicts the object's position by regressing the box's center position and scaling factors. Despite the widespread adoption of this approach, we have observed that the localization results often suffer from defects, leading to unsatisfactory detector performance. In this paper, we address the shortcomings of previous methods through theoretical analysis and experimental verification and present an innovative solution for precise object detection. Instead of solely focusing on the object's center and size, our approach enhances the accuracy of bounding box localization by refining the box edges based on the estimated distribution at the object's boundary. Experimental results demonstrate the potential and generalizability of our proposed method.
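For readers unfamiliar with the baseline this abstract argues against, the sketch below shows the conventional center-and-scale box parameterization used by R-CNN-style detectors. It illustrates the traditional regression target only, not the boundary-distribution method the paper proposes, and the function names are hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def decode_box(anchor: Box, deltas: Box) -> Box:
    """Turn regressed offsets into a box, relative to an anchor.

    Center offsets are scaled by the anchor size; width and height
    are predicted as log-space scaling factors.
    """
    xa, ya, wa, ha = anchor
    dx, dy, dw, dh = deltas
    return (
        xa + dx * wa,
        ya + dy * ha,
        wa * math.exp(dw),
        ha * math.exp(dh),
    )


def to_corners(box: Box) -> Box:
    """Convert (cx, cy, w, h) to (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


if __name__ == "__main__":
    anchor = (50.0, 50.0, 20.0, 10.0)
    deltas = (0.1, -0.2, 0.0, 0.0)  # shift the center, keep the size
    print(to_corners(decode_box(anchor, deltas)))
```

Because all four edges are derived from one center and one size, an error in any of the four regressed values moves two edges at once, which is one way to see why refining edges individually could help.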
AI · Oct 28, 2020
Crop and weed classification based on AutoML
Xuetao Jiang, Binbin Yong, Soheila Garshasbi et al.
CNN models already play an important role in crop and weed classification, with accuracy above 95% reported in the literature. However, manually choosing and fine-tuning deep learning models is laborious yet unavoidable in most traditional practice and research. Moreover, the classic objective functions are not fully compatible with agricultural farming tasks, as the corresponding models misclassify crops as weeds, often more frequently than comparable errors occur in other deep learning application domains. In this paper, we apply automated machine learning (AutoML) with a new objective function to crop and weed classification, achieving higher accuracy and a lower crop killing rate (the rate of identifying a crop as a weed). The experimental results show that our method outperforms state-of-the-art models such as ResNet and VGG19.
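The crop killing rate defined in this abstract (the rate of identifying a crop as a weed) can be computed alongside plain accuracy as in the sketch below. The label strings and function names are assumptions made for illustration, and the paper's new objective function itself is not reproduced here.

```python
from typing import Sequence


def accuracy(true_labels: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(true_labels) != len(predicted) or len(true_labels) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted) if t == p)
    return correct / len(true_labels)


def crop_killing_rate(
    true_labels: Sequence[str],
    predicted: Sequence[str],
    crop: str = "crop",
    weed: str = "weed",
) -> float:
    """Fraction of true crop samples that were predicted as weed."""
    if len(true_labels) != len(predicted):
        raise ValueError("inputs must be of equal length")
    crop_predictions = [p for t, p in zip(true_labels, predicted) if t == crop]
    if not crop_predictions:
        raise ValueError("no crop samples in the true labels")
    killed = sum(1 for p in crop_predictions if p == weed)
    return killed / len(crop_predictions)


if __name__ == "__main__":
    # Illustrative labels only, not data from the paper.
    truth = ["crop", "crop", "crop", "crop", "weed", "weed"]
    preds = ["crop", "weed", "crop", "crop", "weed", "crop"]
    print(accuracy(truth, preds), crop_killing_rate(truth, preds))
```

The two metrics can move independently: a model can gain accuracy by labeling borderline plants as weeds while its crop killing rate gets worse, which is why the abstract reports both.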