93.1 CL Mar 13
Neuron-Aware Data Selection In Instruction Tuning For Large Language Models
Xin Chen, Junchao Wu, Shu Yang et al.
Instruction Tuning (IT) has proven to be an effective approach to unlocking the capabilities of large language models (LLMs). Recent studies indicate that excessive IT data can degrade LLM performance, while carefully selecting a small subset of high-quality IT data can significantly enhance their capabilities. Therefore, identifying the most efficient subset of an IT dataset for developing either specific or general abilities in LLMs has become a critical challenge. To address this, we propose a novel and efficient framework called NAIT. NAIT evaluates the impact of IT data on LLM performance by analyzing the similarity of neuron activation patterns between the IT dataset and the target domain capability. Specifically, NAIT captures neuron activation patterns from in-domain datasets of target domain capabilities to construct reusable and transferable neuron activation features. It then evaluates and selects optimal samples based on the similarity between candidate samples and the expected activation features of the target capabilities. Experimental results show that training on the 10% subset of Alpaca-GPT4 IT data selected by NAIT consistently outperforms methods that rely on external advanced models or uncertainty-based features across various tasks. Our findings also reveal the transferability of neuron activation features across different capabilities of LLMs. In particular, IT data with more logical reasoning and programmatic features possesses strong general transferability, enabling models to develop stronger capabilities across multiple tasks, while a stable core subset of data is sufficient to consistently activate fundamental model capabilities and universally improve performance across diverse tasks.
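The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sample is already summarized as a fixed-length activation vector, that the target feature is the mean of the in-domain vectors, and that similarity is cosine similarity. The function names and toy numbers are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length activation vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_top_fraction(candidates, target_feature, fraction=0.1):
    """Indices of the candidates most similar to the target feature."""
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: cosine(candidates[i], target_feature),
        reverse=True,
    )
    keep = max(1, int(len(candidates) * fraction))
    return ranked[:keep]

# Toy activation vectors (assumed, for illustration only).
in_domain = [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]]
target = mean_vector(in_domain)
candidates = [
    [0.0, 1.0, 0.0],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 1.0],
    [1.0, 0.2, 0.0],
]
selected = select_top_fraction(candidates, target, fraction=0.5)
print(selected)
```

With `fraction=0.1` the same call would mirror the 10% budget mentioned in the abstract; how the activation vectors are extracted from the model is left out of this sketch.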
64.4 CL May 15
DetectRL-X: Towards Reliable Multilingual and Real-World LLM-Generated Text Detection
Junchao Wu, Yefeng Liu, Chenyu Zhu et al.
The effective detection and governance of Large Language Model (LLM) generated content has become increasingly critical due to the growing risk of misuse. Despite the impressive performance of existing detectors, their reliability and potential in multilingual, real-world scenarios remain largely underexplored. In this study, we introduce DetectRL-X, a comprehensive multilingual benchmark designed to evaluate advanced detectors across 8 dimensions. The benchmark encompasses 8 languages commonly used in commercial contexts and collects human-written texts from 6 domains highly susceptible to LLM misuse. To better align with real-world applications, we create LLM-generated texts using 4 popular commercial LLMs and include typical AI-assisted writing operations such as polishing, expanding, and condensing to capture authentic usage patterns. Furthermore, we develop a multilingual framework for paraphrasing and perturbation attacks to simulate diverse human modifications and writing noise, enabling stress testing of detectors across languages. Experimental results on DetectRL-X reveal the strengths and limitations of current state-of-the-art detectors when applied to diverse linguistic resources. We further analyze how domains, generators, attack strategies, text length, and refinement operations influence performance in different languages, underscoring DetectRL-X as an effective benchmark for strengthening multilingual and language-specific detectors.
CL Aug 18, 2025 Code
RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns
Xin Chen, Junchao Wu, Shu Yang et al.
Detecting content generated by large language models (LLMs) is crucial for preventing misuse and building trustworthy AI systems. Although existing detection methods perform well, their robustness in out-of-distribution (OOD) scenarios is still lacking. In this paper, we hypothesize that, compared to features used by existing detection methods, the internal representations of LLMs contain more comprehensive and raw features that can more effectively capture and distinguish the statistical pattern differences between LLM-generated texts (LGT) and human-written texts (HWT). We validated this hypothesis across different LLMs and observed significant differences in neural activation patterns when processing these two types of texts. Based on this, we propose RepreGuard, an efficient statistics-based detection method. Specifically, we first employ a surrogate model to collect representations of LGT and HWT, and extract the distinct activation feature that can better identify LGT. We then classify a text by calculating the projection score of its representations along this feature direction and comparing it with a precomputed threshold. Experimental results show that RepreGuard outperforms all baselines with an average AUROC of 94.92% in both in-distribution (ID) and OOD scenarios, while also demonstrating robust resilience to various text sizes and mainstream attacks. Data and code are publicly available at: https://github.com/NLP2CT/RepreGuard
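The projection-and-threshold classification described in this abstract can be sketched as follows. This is an illustrative assumption, not the released RepreGuard code (see the repository linked above for that): here the feature direction is taken as the difference between the mean LGT and mean HWT representations, and the threshold as the midpoint of the two mean projection scores. All names and toy vectors are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length representation vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def dot(u, v):
    """Dot product, used here as the projection score."""
    return sum(a * b for a, b in zip(u, v))

def fit_direction(lgt_reps, hwt_reps):
    """Assumed feature direction: mean(LGT) minus mean(HWT)."""
    mu_lgt = mean_vector(lgt_reps)
    mu_hwt = mean_vector(hwt_reps)
    return [a - b for a, b in zip(mu_lgt, mu_hwt)]

def fit_threshold(direction, lgt_reps, hwt_reps):
    """Assumed threshold: midpoint of the mean scores of the two classes."""
    score_lgt = sum(dot(r, direction) for r in lgt_reps) / len(lgt_reps)
    score_hwt = sum(dot(r, direction) for r in hwt_reps) / len(hwt_reps)
    return (score_lgt + score_hwt) / 2

def is_llm_generated(rep, direction, threshold):
    """Classify one text from its representation vector."""
    return dot(rep, direction) > threshold

# Toy representations standing in for surrogate-model activations.
lgt = [[1.0, 0.0], [0.8, 0.2]]
hwt = [[0.0, 1.0], [0.2, 0.8]]
direction = fit_direction(lgt, hwt)
threshold = fit_threshold(direction, lgt, hwt)
print(is_llm_generated([0.7, 0.1], direction, threshold))
print(is_llm_generated([0.1, 0.9], direction, threshold))
```

A difference-of-means direction is only one simple way to obtain such a feature; the abstract does not specify how the direction or the threshold is computed.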
CL Feb 18, 2025
Fraud-R1 : A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements
Shu Yang, Shenzhe Zhu, Zeyu Wu et al.
We introduce Fraud-R1, a benchmark designed to evaluate LLMs' ability to defend against internet fraud and phishing in dynamic, real-world scenarios. Fraud-R1 comprises 8,564 fraud cases sourced from phishing scams, fake job postings, social media, and news, categorized into 5 major fraud types. Unlike previous benchmarks, Fraud-R1 introduces a multi-round evaluation pipeline to assess LLMs' resistance to fraud at different stages, including credibility building, urgency creation, and emotional manipulation. Furthermore, we evaluate 15 LLMs under two settings: 1. Helpful-Assistant, where the LLM provides general decision-making assistance, and 2. Role-play, where the model assumes a specific persona, widely used in real-world agent-based interactions. Our evaluation reveals the significant challenges in defending against fraud and phishing inducement, especially in role-play settings and fake job postings. Additionally, we observe a substantial performance gap between Chinese and English, underscoring the need for improved multilingual fraud detection capabilities.
RO Oct 16, 2021
Learning Cloth Folding Tasks with Refined Flow Based Spatio-Temporal Graphs
Peng Zhou, Omar Zahra, Anqing Duan et al.
Cloth folding is a widespread domestic task that is seamlessly performed by humans but is highly challenging for autonomous robots to execute due to the highly deformable nature of textiles; it is hard to engineer and learn manipulation pipelines that execute it efficiently. In this paper, we propose a new solution for robotic cloth folding (using a standard folding board) via learning from demonstrations. Our demonstration video encoding is based on a high-level abstraction, namely, a refined optical flow-based spatiotemporal graph, as opposed to a low-level encoding such as image pixels. By constructing a new spatiotemporal graph with an advanced visual correspondence descriptor, the policy learning can focus on key points and relations with a 3D spatial configuration, which allows it to generalize quickly across different environments. To further boost the policy search, we combine optical flow and static motion saliency maps to discriminate the dominant motions and better handle the system dynamics in real time, which aligns with the attentional motion mechanism that dominates the human imitation process. To validate the proposed approach, we analyzed the manual folding procedure and developed a custom-made end-effector to efficiently interact with the folding board. Multiple experiments on a real robotic platform were conducted to validate the effectiveness and robustness of the proposed method.
RO Jul 30, 2021
A Novel Approach to Model the Kinematics of Human Fingers Based on an Elliptic Multi-Joint Configuration
Zeyu Wu, Luiza Labazanova, Peng Zhou et al.
In this paper, we present a novel kinematic model of the human phalanges based on the elliptical motion of their joints. The presence of soft elastic tissues and the general anatomical structure of the hand joints strongly affect the relative movement of the bones. The commonly used assumption of circular trajectories simplifies the design process but diverges from actual hand behavior. The advantages of the proposed model are demonstrated through a comparison with the conventional revolute joint model. Simulations and experiments validate the designed forward and inverse kinematic algorithms. The results show that the model closely mimics the motion trajectory of the human fingertip.
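To make the contrast between circular and elliptical joint trajectories concrete, the sketch below compares a point moving on a circle of radius `r` with a point moving on an ellipse with semi-axes `a` and `b`, both parametrized by the same angle. The parametrization, the function names, and the numeric values are assumptions for illustration; the abstract does not give the paper's actual joint equations.

```python
import math

def circular_point(r, theta):
    """Point on a circle of radius r (conventional revolute joint)."""
    return (r * math.cos(theta), r * math.sin(theta))

def elliptical_point(a, b, theta):
    """Point on an ellipse with semi-axes a and b (assumed form)."""
    return (a * math.cos(theta), b * math.sin(theta))

def max_deviation(a, b, r, steps=90):
    """Largest distance between the two paths for theta in [0, pi/2]."""
    worst = 0.0
    for i in range(steps + 1):
        theta = (math.pi / 2) * i / steps
        cx, cy = circular_point(r, theta)
        ex, ey = elliptical_point(a, b, theta)
        worst = max(worst, math.hypot(ex - cx, ey - cy))
    return worst

# Hypothetical dimensions in millimetres.
print(max_deviation(a=10.0, b=8.0, r=10.0))
```

When `a` equals `r`, the gap between the two paths grows with the flexion angle and reaches `abs(b - r)` at a quarter turn, which is the kind of divergence a circular assumption would leave unmodeled.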
RO Mar 17, 2021
Bio-Inspired Design of Artificial Striated Muscles Composed of Sarcomere-Like Contraction Units (preprint)
Luiza Labazanova, Zeyu Wu, Zhengping Gu et al.
Biological muscles have always attracted robotics researchers due to their efficient capabilities in compliance, force generation, and mechanical work. Many groups are working on the development of artificial muscles; however, state-of-the-art methods still fall short in performance when compared with their biological counterparts. Muscles with high force output are mostly rigid, whereas traditional soft actuators take up much space and are limited in strength and displacement. In this work, we aim to find a reasonable trade-off between these features by mimicking the striated structure of skeletal muscles. To that end, we designed an artificial pneumatic myofibril composed of multiple contraction units that combine stretchable and inextensible materials. Varying the geometric parameters and the number of units in series provides flexible adjustment of the desired muscle operation. We derived a mathematical model that predicts the relationship between the input pneumatic pressure and the generated output force. A detailed experimental study is conducted to validate the performance of the proposed bio-inspired muscle.