Liangliang Liu

AI
h-index 16
6 papers
27 citations
Novelty 48%
AI Score 45

6 Papers

LG | Jul 31, 2024 | Code
ProSpec RL: Plan Ahead, then Execute

Liangliang Liu, Yi Guan, BoRan Wang et al.

Imagining potential outcomes of actions before execution helps agents make more informed decisions, a prospective thinking ability fundamental to human cognition. However, mainstream model-free Reinforcement Learning (RL) methods lack the ability to proactively envision future scenarios, plan, and guide strategies. These methods typically rely on trial and error to adjust policy functions, aiming to maximize cumulative rewards or long-term value, even if such high-reward decisions place the environment in extremely dangerous states. To address this, we propose the Prospective (ProSpec) RL method, which makes higher-value, lower-risk optimal decisions by imagining future n-stream trajectories. Specifically, ProSpec employs a dynamic model to predict future states (termed "imagined states") based on the current state and a series of sampled actions. Furthermore, we integrate the concept of Model Predictive Control and introduce a cycle consistency constraint that allows the agent to evaluate and select the optimal actions from these trajectories. Moreover, ProSpec employs cycle consistency to mitigate two fundamental issues in RL: augmenting state reversibility to avoid irreversible events (low risk) and augmenting actions to generate numerous virtual trajectories, thereby improving data efficiency. We validated the effectiveness of our method on the DMControl benchmarks, where our approach achieved significant performance improvements. Code will be open-sourced upon acceptance.
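The planning loop described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: `forward_model`, `backward_model`, `reward`, and `plan` are hypothetical names, the one-dimensional dynamics are toy stand-ins for learned models, and using the cycle-consistency error as a penalty on the trajectory score is only one plausible reading of the abstract.

```python
import random

def forward_model(state, action):
    """Hypothetical stand-in for a learned dynamics model: predicts the next state."""
    return state + 0.5 * action

def backward_model(next_state, action):
    """Hypothetical inverse model: predicts the previous state.
    It is deliberately imperfect so the cycle error is non-zero."""
    return next_state - 0.45 * action

def reward(state, goal=1.0):
    """Toy reward: negative distance to an assumed goal state."""
    return -abs(state - goal)

def plan(state, n_streams=64, horizon=5, cycle_weight=1.0, seed=0):
    """Sample n_streams action sequences, imagine each trajectory, and
    return the first action of the best-scoring one (MPC-style)."""
    rng = random.Random(seed)
    best_score, best_action = float("-inf"), 0.0
    for _ in range(n_streams):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        # Roll the imagined trajectory forward and accumulate reward.
        s, total = state, 0.0
        for a in actions:
            s = forward_model(s, a)
            total += reward(s)
        # Roll back through the inverse model; a large gap to the start
        # state suggests the trajectory is hard to reverse.
        for a in reversed(actions):
            s = backward_model(s, a)
        cycle_error = abs(s - state)
        score = total - cycle_weight * cycle_error
        if score > best_score:
            best_score, best_action = score, actions[0]
    return best_action, best_score

first_action, score = plan(0.0)
```

Only the first action of the selected trajectory is executed; in a receding-horizon setup the agent would replan from the next observed state.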

AI | Jul 31, 2024 | Code
Fi$^2$VTS: Time Series Forecasting Via Capturing Intra- and Inter-Variable Variations in the Frequency Domain

Rujia Shen, Yang Yang, Yaoxion Lin et al.

Time series forecasting (TSF) plays a crucial role in various applications, including medical monitoring and crop growth. Despite the advancements in deep learning methods for TSF, their capacity to predict long-term series remains constrained. This limitation arises from the failure to account for intra- and inter-variable variations simultaneously. To mitigate this challenge, we introduce the Fi$^2$VBlock, which leverages a Frequency domain perspective to capture intra- and inter-variable Variations. After the input is transformed into the frequency domain via the Frequency Transform Module, the Frequency Cross Attention between the real and imaginary parts is designed to obtain enhanced frequency representations and capture intra-variable variations. Furthermore, Inception blocks are employed to integrate information, thus capturing correlations across different variables. Our backbone network, Fi$^2$VTS, employs a residual architecture by concatenating multiple Fi$^2$VBlocks, thereby preventing degradation issues. Theoretically, we demonstrate that Fi$^2$VTS achieves a substantial reduction in both time and memory complexity, decreasing from $\mathcal{O}(L^2)$ to $\mathcal{O}(L)$ per Fi$^2$VBlock computation. Empirical evaluations reveal that Fi$^2$VTS outperforms other baselines on two benchmark datasets. The implementation code is accessible at https://github.com/HITshenrj/Fi2VTS.
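As a rough illustration of the first step in the abstract above, the sketch below moves a real-valued series into the frequency domain and splits it into real and imaginary parts. The abstract does not specify how the Frequency Transform Module is built, so a plain discrete Fourier transform is assumed here; `dft_real_imag` is a hypothetical helper, and the naive loop is written for readability, not to reflect the complexity claimed in the paper.

```python
import cmath
import math

def dft_real_imag(series):
    """Return the real and imaginary parts of the one-sided DFT
    (bins 0 .. n // 2) of a real-valued series."""
    n = len(series)
    real, imag = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(series)
        )
        real.append(coeff.real)
        imag.append(coeff.imag)
    return real, imag

# A sine wave completing two cycles over 16 samples.
series = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
real, imag = dft_real_imag(series)

# The energy concentrates in frequency bin 2, almost entirely in the
# imaginary part, because a sine is an odd function.
magnitudes = [math.hypot(r, i) for r, i in zip(real, imag)]
dominant_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
```

The two lists `real` and `imag` are the kind of paired inputs that an attention mechanism between real and imaginary parts would operate on.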

CL | Jul 29, 2025 | Code
AgriEval: A Comprehensive Chinese Agricultural Benchmark for Large Language Models

Lian Yan, Haotian Wang, Chen Tang et al.

In the agricultural domain, the deployment of large language models (LLMs) is hindered by the lack of training data and evaluation benchmarks. To mitigate this issue, we propose AgriEval, the first comprehensive Chinese agricultural benchmark with three main characteristics: (1) Comprehensive Capability Evaluation. AgriEval covers six major agricultural categories and 29 subcategories, addressing four core cognitive scenarios: memorization, understanding, inference, and generation. (2) High-Quality Data. The dataset is curated from university-level examinations and assignments, providing a natural and robust benchmark for assessing the capacity of LLMs to apply knowledge and make expert-like decisions. (3) Diverse Formats and Extensive Scale. AgriEval comprises 14,697 multiple-choice questions and 2,167 open-ended question-answering items, establishing it as the most extensive agricultural benchmark available to date. We also present comprehensive experimental results over 51 open-source and commercial LLMs. The experimental results reveal that most existing LLMs struggle to achieve 60% accuracy, underscoring the developmental potential in agricultural LLMs. Additionally, we conduct extensive experiments to investigate factors influencing model performance and propose strategies for enhancement. AgriEval is available at https://github.com/YanPioneer/AgriEval/.

AI | Feb 15
REAL: Resolving Knowledge Conflicts in Knowledge-Intensive Visual Question Answering via Reasoning-Pivot Alignment

Kai Ye, Xianwei Mao, Sheng Zhou et al.

Knowledge-intensive Visual Question Answering (KI-VQA) frequently suffers from severe knowledge conflicts caused by the inherent limitations of open-domain retrieval. However, existing paradigms face critical limitations due to the lack of generalizable conflict detection and intra-model constraint mechanisms to handle conflicting evidence. To address these challenges, we propose the REAL (Reasoning-Pivot Alignment) framework centered on the novel concept of the Reasoning-Pivot. Distinct from reasoning steps that prioritize internal self-derivation, a reasoning-pivot serves as an atomic unit (node or edge) in the reasoning chain that emphasizes knowledge linkage, and it typically relies on external evidence to complete the reasoning. Supported by our constructed REAL-VQA dataset, our approach integrates Reasoning-Pivot Aware SFT (RPA-SFT) to train a generalizable discriminator by aligning conflicts with pivot extraction, and employs Reasoning-Pivot Guided Decoding (RPGD), an intra-model decoding strategy that leverages these pivots for targeted conflict mitigation. Extensive experiments across diverse benchmarks demonstrate that REAL significantly enhances discrimination accuracy and achieves state-of-the-art performance, validating the effectiveness of our pivot-driven resolution paradigm.

CV | Apr 23, 2021
Sequential convolutional network for behavioral pattern extraction in gait recognition

Xinnan Ding, Kejun Wang, Chenhui Wang et al.

As a unique and promising biometric, video-based gait recognition has broad applications. The key step of this methodology is to learn the walking pattern of individuals; however, extracting behavioral features directly from a sequence is often challenging. Most existing methods focus on either the appearance or the motion pattern alone. To overcome these limitations, we propose a sequential convolutional network (SCN) from a novel perspective, where spatiotemporal features can be learned by a basic convolutional backbone. In SCN, behavioral information extractors (BIE) are constructed to comprehend intermediate feature maps in time series through motion templates where the relationship between frames can be analyzed, thereby distilling the information of the walking pattern. Furthermore, a multi-frame aggregator in SCN performs feature integration on a sequence whose length is uncertain, via a mobile 3D convolutional layer. To demonstrate its effectiveness, experiments have been conducted on two popular public benchmarks, CASIA-B and OU-MVLP, and our approach demonstrates superior performance compared with state-of-the-art methods.

SD | Nov 29, 2020
An Features Extraction and Recognition Method for Underwater Acoustic Target Based on ATCNN

Gang Hu, Kejun Wang, Liangliang Liu

Given the complex marine environment, it is extremely challenging to conduct underwater acoustic target recognition (UATR) using ship-radiated noise. Inspired by the neural mechanism of auditory perception, this paper presents a new deep neural network trained on original underwater acoustic signals with depthwise separable convolution (DWS) and time-dilated convolution, named the auditory perception inspired time-dilated convolution neural network (ATCNN), and uses it to detect and classify underwater acoustic signals. The proposed ATCNN model consists of a learnable feature extractor and an integration layer inspired by auditory perception, and a time-dilated convolution inspired by language models. The method decomposes the original time-domain ship-radiated noise signals into different frequency components with depthwise separable convolution filters, and then extracts signal features based on auditory perception. The deep features are combined in the integration layer. The time-dilated convolution is used for long-term contextual modeling. As a result, as in a language model, intra-class and inter-class information can be fully used for UATR. On the UATR task, the classification accuracy reaches 90.9%, the highest in the comparative experiments. Experimental results show that ATCNN has great potential to improve the performance of UATR classification.
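Two building blocks named in the abstract above, depthwise separable convolution and dilated convolution, have simple closed-form properties that a short sketch can make concrete. The helper names and the layer sizes below are illustrative assumptions and are not taken from the paper; the formulas are the standard ones for 1D convolutions without bias terms.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 dilated 1D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def standard_conv1d_params(c_in, c_out, kernel_size):
    """Weight count of a standard 1D convolution (bias omitted)."""
    return c_in * c_out * kernel_size

def depthwise_separable_conv1d_params(c_in, c_out, kernel_size):
    """Weight count of a depthwise convolution followed by a 1x1
    pointwise convolution (bias omitted)."""
    return c_in * kernel_size + c_in * c_out

# Four layers with kernel size 3 and dilations doubling at each layer:
# the receptive field grows to 31 samples with only four layers.
rf = receptive_field(3, [1, 2, 4, 8])

# Assumed layer sizes: 64 input channels, 128 output channels, kernel size 3.
standard = standard_conv1d_params(64, 128, 3)              # 24576 weights
separable = depthwise_separable_conv1d_params(64, 128, 3)  # 8384 weights
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is why dilated stacks are a common choice for long-term contextual modeling on raw waveforms.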