CL Apr 29, 2022
Detecting Textual Adversarial Examples Based on Distributional Characteristics of Data Representations
Na Liu, Mark Dras, Wei Emma Zhang
Although deep neural networks have achieved state-of-the-art performance in various machine learning tasks, adversarial examples, constructed by adding small non-random perturbations to correctly classified inputs, successfully fool highly expressive deep classifiers into incorrect predictions. Approaches to adversarial attacks in natural language tasks have boomed in the last five years using character-level, word-level, phrase-level, or sentence-level textual perturbations. While there is some work in NLP on defending against such attacks through proactive methods, like adversarial training, there are to our knowledge no effective general reactive approaches to defence via detection of textual adversarial examples such as are found in the image processing literature. In this paper, we propose two new reactive methods for NLP to fill this gap, which unlike the few limited-application baselines from NLP are based entirely on distributional characteristics of learned representations: we adapt one from the image processing literature (Local Intrinsic Dimensionality (LID)), and propose a novel one (MultiDistance Representation Ensemble Method (MDRE)). Adapted LID and MDRE obtain state-of-the-art results on character-level, word-level, and phrase-level attacks on the IMDB dataset as well as on the latter two with respect to the MultiNLI dataset. For future research, we publish our code.
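The LID quantity named in this abstract is commonly estimated with a maximum-likelihood formula over the distances to a point's k nearest neighbours. The sketch below shows that generic estimator only; the function name `lid_mle`, the Euclidean metric, and the choice of `k` are assumptions for illustration, and the abstract does not say how the authors adapt LID to text representations.

```python
import math

def lid_mle(query, reference, k=20):
    """Maximum-likelihood LID estimate of `query` relative to `reference`.

    Hypothetical helper: assumes Euclidean distance and at least two
    distinct neighbour distances (otherwise the estimate is undefined).
    """
    # Distances from the query to every reference point, ignoring exact
    # duplicates of the query, sorted so the k nearest come first.
    dists = sorted(
        d for d in (math.dist(query, p) for p in reference) if d > 0
    )[:k]
    r_max = dists[-1]  # distance to the farthest of the kept neighbours
    # LID = -1 / mean_i log(r_i / r_max)
    mean_log = sum(math.log(r / r_max) for r in dists) / len(dists)
    return -1.0 / mean_log
```

A detector in this style would typically compute such estimates from hidden-layer representations and feed them to a simple classifier that separates clean from adversarial inputs.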
ACC-PH Jun 2, 2016
Tuner control system of spoke012 SRF cavity for C-ADS injector I at IHEP
Na Liu, Yi Sun, Guang-Wei Wang et al.
A new tuner control system for the spoke superconducting radio frequency (SRF) cavity has been developed and applied to cryomodule I (CM1) of C-ADS injector I at IHEP. We have successfully implemented the tuner controller for the first time and achieved a cavity tuning phase error of 0.7 degrees (about 4 Hz peak to peak) in the presence of electromechanical coupled resonance. This paper presents the preliminary experimental results based on the new tuner controller under proton beam commissioning.
CL Oct 27, 2023
DUMA: a Dual-Mind Conversational Agent with Fast and Slow Thinking
Xiaoyu Tian, Liangyu Chen, Na Liu et al.
Inspired by the dual-process theory of human cognition, we introduce DUMA, a novel conversational agent framework that embodies a dual-mind mechanism through the utilization of two generative Large Language Models (LLMs) dedicated to fast and slow thinking respectively. The fast thinking model serves as the primary interface for external interactions and initial response generation, evaluating the necessity for engaging the slow thinking model based on the complexity of the complete response. When invoked, the slow thinking model takes over the conversation, engaging in meticulous planning, reasoning, and tool utilization to provide a well-analyzed response. This dual-mind configuration allows for a seamless transition between intuitive responses and deliberate problem-solving processes based on the situation. We have constructed a conversational agent to handle online inquiries in the real estate industry. Experiments show that our method balances effectiveness and efficiency and improves significantly on the baseline.
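The fast/slow hand-off described in this abstract can be pictured as a small dispatch loop. This is a minimal sketch under stated assumptions: `DualMindAgent`, `fast`, `slow`, and `needs_slow` are hypothetical names, the models are plain callables rather than real LLM clients, and the escalation check is a separate function here even though the abstract attributes that decision to the fast model itself.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class DualMindAgent:
    fast: Callable[[str], str]              # quick, intuitive responder (stand-in for an LLM)
    slow: Callable[[str], str]              # deliberate planner/reasoner (stand-in for an LLM)
    needs_slow: Callable[[str, str], bool]  # assumed escalation test on (message, draft)

    def reply(self, user_msg: str) -> str:
        # The fast model always drafts the first response.
        draft = self.fast(user_msg)
        # Escalate only when the draft is judged to need deliberate reasoning.
        if self.needs_slow(user_msg, draft):
            return self.slow(user_msg)
        return draft

# Usage with stub models: escalate whenever the message asks for a comparison.
agent = DualMindAgent(
    fast=lambda msg: "quick answer",
    slow=lambda msg: "carefully planned answer",
    needs_slow=lambda msg, draft: "compare" in msg.lower(),
)
```

Keeping the escalation test injectable makes it easy to swap a keyword rule for a model-based judgment without touching the dispatch logic.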
CL Jan 5, 2024
From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models
Na Liu, Liangyu Chen, Xiaoyu Tian et al.
This paper introduces RAISE (Reasoning and Acting through Scratchpad and Examples), an advanced architecture enhancing the integration of Large Language Models (LLMs) like GPT-4 into conversational agents. RAISE, an enhancement of the ReAct framework, incorporates a dual-component memory system, mirroring human short-term and long-term memory, to maintain context and continuity in conversations. It entails a comprehensive agent construction scenario, including phases like Conversation Selection, Scene Extraction, CoT Completion, and Scene Augmentation, leading to the LLMs Training phase. This approach appears to enhance agent controllability and adaptability in complex, multi-turn dialogues. Our preliminary evaluations in a real estate sales context suggest that RAISE has some advantages over traditional agents, indicating its potential for broader applications. This work contributes to the AI field by providing a robust framework for developing more context-aware and versatile conversational agents.
LG Feb 6, 2024
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
Junchao Gong, Lei Bai, Peng Ye et al.
Precipitation nowcasting based on radar data plays a crucial role in extreme weather prediction and has broad implications for disaster management. Although progress has been made with deep learning, two key challenges of precipitation nowcasting are not well solved: (i) the modeling of complex precipitation system evolutions with different scales, and (ii) accurate forecasts for extreme precipitation. In this work, we propose CasCast, a cascaded framework composed of a deterministic and a probabilistic part to decouple the predictions for mesoscale precipitation distributions and small-scale patterns. Then, we explore training the cascaded framework at high resolution and conducting the probabilistic modeling in a low-dimensional latent space with a frame-wise-guided diffusion transformer for enhancing the optimization of extreme events while reducing computational costs. Extensive experiments on three benchmark radar precipitation datasets show that CasCast achieves competitive performance. In particular, CasCast significantly surpasses the baseline (up to +91.8%) for regional extreme-precipitation nowcasting.
LG Mar 13, 2024
A Physics-driven GraphSAGE Method for Physical Process Simulations Described by Partial Differential Equations
Hang Hu, Sidi Wu, Guoxiong Cai et al.
Physics-informed neural networks (PINNs) have successfully addressed various computational physics problems based on partial differential equations (PDEs). However, when tackling irregularities such as singularities and oscillations, trained solutions usually suffer from low accuracy. In addition, most current works only offer the trained solution for predetermined input parameters. If any change occurs in input parameters, transfer learning or retraining is required, and traditional numerical techniques also need an independent simulation. In this work, a physics-driven GraphSAGE approach (PD-GraphSAGE) based on the Galerkin method and piecewise polynomial nodal basis functions is presented to solve computational problems governed by irregular PDEs and to develop parametric PDE surrogate models. This approach employs graph representations of physical domains, thereby reducing the demands for evaluated points due to local refinement. A distance-related edge feature and a feature mapping strategy are devised to help training and convergence for singularity and oscillation situations, respectively. The merits of the proposed method are demonstrated through a couple of cases. Moreover, the robust PDE surrogate model for heat conduction problems parameterized by the Gaussian random field source is successfully established, which not only provides the solution accurately but is several times faster than the finite element method in our experiments.
CV Jun 5, 2025
MARS: Radio Map Super-resolution and Reconstruction Method under Sparse Channel Measurements
Chuyun Deng, Na Liu, Wei Xie et al.
Radio maps reflect the spatial distribution of signal strength and are essential for applications like smart cities, IoT, and wireless network planning. However, reconstructing accurate radio maps from sparse measurements remains challenging. Traditional interpolation and inpainting methods lack environmental awareness, while many deep learning approaches depend on detailed scene data, limiting generalization. To address this, we propose MARS, a Multi-scale Aware Radiomap Super-resolution method that combines CNNs and Transformers with multi-scale feature fusion and residual connections. MARS focuses on both global and local feature extraction, enhancing feature representation across different receptive fields and improving reconstruction accuracy. Experiments across different scenes and antenna locations show that MARS outperforms baseline models in both MSE and SSIM, while maintaining low computational cost, demonstrating strong practical potential.
CV Apr 1, 2025
Leveraging Contrast Information for Efficient Document Shadow Removal
Yifan Liu, Jiancheng Huang, Na Liu et al.
Document shadows are a major obstacle in the digitization process. Due to the dense information in text and patterns covered by shadows, document shadow removal requires specialized methods. Existing document shadow removal methods, although showing some progress, still rely on additional information such as shadow masks or lack generalization and effectiveness across different shadow scenarios. This often results in incomplete shadow removal or loss of original document content and tones. Moreover, these methods tend to underutilize the information present in the original shadowed document image. In this paper, we refocus our approach on the document images themselves, which inherently contain rich information. We propose an end-to-end document shadow removal method guided by contrast representation, following a coarse-to-fine refinement approach. By extracting document contrast information, we can effectively and quickly locate shadow shapes and positions without the need for additional masks. This information is then integrated into the refined shadow removal process, providing better guidance for network-based removal and feature fusion. Extensive qualitative and quantitative experiments show that our method achieves state-of-the-art performance.
CL Mar 14, 2024
Dial-insight: Fine-tuning Large Language Models with High-Quality Domain-Specific Data Preventing Capability Collapse
Jianwei Sun, Chaoyang Mei, Linlin Wei et al.
The efficacy of large language models (LLMs) is heavily dependent on the quality of the underlying data, particularly within specialized domains. A common challenge when fine-tuning LLMs for domain-specific applications is the potential degradation of the model's generalization capabilities. To address these issues, we propose a two-stage approach for the construction of production prompts designed to yield high-quality data. This method involves the generation of a diverse array of prompts that encompass a broad spectrum of tasks and exhibit a rich variety of expressions. Furthermore, we introduce a cost-effective, multi-dimensional quality assessment framework to ensure the integrity of the generated labeling data. Utilizing a dataset comprised of service provider and customer interactions from the real estate sector, we demonstrate a positive correlation between data quality and model performance. Notably, our findings indicate that the domain-specific proficiency of general LLMs can be enhanced through fine-tuning with data produced via our proposed method, without compromising their overall generalization abilities, even when exclusively domain-specific data is employed for fine-tuning.
CV May 26, 2018
Fine-Grained Age Estimation in the wild with Attention LSTM Networks
Ke Zhang, Na Liu, Xingfang Yuan et al.
Age estimation from a single face image has been an essential task in the field of human-computer interaction and computer vision, with a wide range of practical applications. The accuracy of age estimation from face images in the wild is relatively low for existing methods, because they only take into account the global features, while neglecting the fine-grained features of age-sensitive areas. We propose a novel method based on our attention long short-term memory (AL) network for fine-grained age estimation in the wild, inspired by the fine-grained categories and the visual attention mechanism. This method combines the residual networks (ResNets) or the residual network of residual network (RoR) models with LSTM units to construct AL-ResNets or AL-RoR networks to extract local features of age-sensitive regions, which effectively improves the age estimation accuracy. First, a ResNets or a RoR model pretrained on the ImageNet dataset is selected as the basic model, which is then fine-tuned on the IMDB-WIKI-101 dataset for age estimation. Then, we fine-tune the ResNets or the RoR on the target age datasets to extract the global features of face images. To extract the local features of age-sensitive regions, the LSTM unit is then presented to obtain the coordinates of the age-sensitive region automatically. Finally, the age group classification is conducted directly on the Adience dataset, and age-regression experiments are performed by the Deep EXpectation algorithm (DEX) on MORPH Album 2, FG-NET and 15/16LAP datasets. By combining the global and the local features, we obtain our final prediction results. Experimental results illustrate the effectiveness and robustness of the proposed AL-ResNets or AL-RoR for age estimation in the wild, where it achieves state-of-the-art performance, outperforming all other convolutional neural networks.
RO Nov 21, 2017
Condition directed Multi-domain Adversarial Learning for Loop Closure Detection
Peng Yin, Yuqing He, Na Liu et al.
Loop closure detection (LCD) is the key module in appearance-based simultaneous localization and mapping (SLAM). However, in real life, the appearance of visual inputs is usually affected by illumination changes and texture changes under different weather conditions. Traditional LCD methods usually rely on handcrafted features, which are unable to capture common descriptions under different weather conditions, such as rainy, foggy and sunny. Furthermore, traditional handcrafted features cannot capture a high-level understanding of local scenes. In this paper, we propose a novel condition directed multi-domain adversarial learning method, where we use the weather condition as the direction for feature inference. Based on generative adversarial networks (GANs) and a classification network, the proposed method can extract high-level weather-invariant features directly from the raw data. The only labels required here are the weather conditions of each visual input. Experiments are conducted in the GTAV game simulator, which can generate lifelike outdoor scenes under different weather conditions. The LCD results show that our method outperforms the state of the art significantly.
NA Aug 22, 2015
HFVS: An Arbitrary High Order Flux Vector Splitting Method
Yibing Chen, Song Jiang, Na Liu
In this paper, a new scheme of arbitrary high order accuracy in both space and time is proposed to solve hyperbolic conservation laws. Based on the idea of the flux vector splitting (FVS) scheme, we split all the space and time derivatives in the Taylor expansion of the numerical flux into two parts: one part with positive eigenvalues, another part with negative eigenvalues. According to a Lax-Wendroff procedure, all the time derivatives are then replaced by space derivatives, and the space derivatives are calculated from the WENO reconstruction polynomial. One of the most important advantages of this new scheme is that it is easy to implement. In addition, it should be pointed out that the procedure of calculating the space and time derivatives in the numerical flux can be used as a building block to extend current first order schemes to very high order accuracy in both space and time. Numerous numerical tests for linear and nonlinear hyperbolic conservation laws demonstrate that the new scheme is robust and can achieve high order accuracy in both space and time.
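The splitting idea underlying this abstract is easiest to see in the classical first-order case. The sketch below is not the HFVS scheme itself: it assumes the scalar linear advection equation u_t + a u_x = 0 on a periodic grid and splits the flux a*u into a right-going part (positive wave speed) and a left-going part (negative wave speed), each taken from its upwind cell. The function name `fvs_step` and its parameters are illustrative only.

```python
def fvs_step(u, a, dx, dt):
    """One first-order flux-vector-splitting step for u_t + a*u_x = 0 (periodic)."""
    n = len(u)
    # Split the wave speed into its positive and negative parts.
    a_plus, a_minus = max(a, 0.0), min(a, 0.0)
    # flux[i] is the numerical flux through the interface between cell i and
    # cell i+1: the right-going part comes from the left cell, the left-going
    # part from the right cell.
    flux = [a_plus * u[i] + a_minus * u[(i + 1) % n] for i in range(n)]
    # Conservative update; flux[i - 1] wraps around for i = 0 (periodic domain).
    return [u[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]
```

With a*dt/dx equal to 1 the update shifts the profile by exactly one cell per step, which makes a convenient sanity check; a high-order method such as the one in the abstract would replace the cell values in the flux with reconstructed, Taylor-expanded quantities.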