Jingxin Xu

h-index: 12
2 papers

2 Papers

CL · Feb 8, 2025 · Code
Refining Positive and Toxic Samples for Dual Safety Self-Alignment of LLMs with Minimal Human Interventions

Jingxin Xu, Guoshun Nan, Sheng Guan et al.

Recent AI agents, such as ChatGPT and LLaMA, primarily rely on instruction tuning and reinforcement learning to calibrate the output of large language models (LLMs) with human intentions, ensuring the outputs are harmless and helpful. Existing methods depend heavily on the manual annotation of high-quality positive samples, while contending with issues such as noisy labels and minimal distinctions between preferred and dispreferred response data. However, readily available toxic samples with clear safety distinctions are often filtered out, removing valuable negative references that could aid LLMs in safety alignment. In response, we propose PT-ALIGN, a novel safety self-alignment approach that minimizes human supervision by automatically refining positive and toxic samples and performing fine-grained dual instruction tuning. Positive samples are harmless responses, while toxic samples deliberately contain extremely harmful content, serving as a new supervisory signal. Specifically, we use the LLM itself to iteratively generate and refine training instances, relying on fewer than 50 human annotations. We then employ two losses, i.e., maximum likelihood estimation (MLE) and fine-grained unlikelihood training (UT), to jointly enhance the LLM's safety. The MLE loss encourages the LLM to maximize the generation of harmless content based on positive samples. Conversely, the fine-grained UT loss guides the LLM to minimize the output of harmful words based on negative samples at the token level, thereby helping the model decouple safety from effectiveness, directing it toward safer fine-tuning objectives, and increasing the likelihood of generating helpful and reliable content. Experiments on 9 popular open-source LLMs demonstrate the effectiveness of PT-ALIGN for safety alignment, while maintaining comparable levels of helpfulness and usefulness.
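
The two losses named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common unlikelihood form, -log(1 - p), applied only to tokens flagged as harmful, and the function names, the `harmful_mask` argument, and the `ut_weight` parameter are all hypothetical. The inputs are the probabilities the model assigns to each token of a sample.

```python
import math

EPS = 1e-12  # assumed clamp to avoid log(0); not from the paper


def mle_loss(token_probs):
    """Mean negative log-likelihood over the tokens of a positive sample."""
    return -sum(math.log(max(p, EPS)) for p in token_probs) / len(token_probs)


def unlikelihood_loss(token_probs, harmful_mask):
    """Mean of -log(1 - p) over the tokens flagged as harmful in a toxic sample.

    Tokens whose mask entry is False contribute nothing, which is what makes
    the penalty token-level rather than sequence-level.
    """
    terms = [
        -math.log(max(1.0 - p, EPS))
        for p, flagged in zip(token_probs, harmful_mask)
        if flagged
    ]
    return sum(terms) / len(terms) if terms else 0.0


def joint_loss(pos_probs, tox_probs, harmful_mask, ut_weight=1.0):
    """Hypothetical weighted sum of the two objectives."""
    return mle_loss(pos_probs) + ut_weight * unlikelihood_loss(tox_probs, harmful_mask)


if __name__ == "__main__":
    # Probabilities the model assigns to the tokens of each sample (made up).
    positive = [0.9, 0.8, 0.95]
    toxic = [0.7, 0.2, 0.6]
    mask = [True, False, True]  # only the 1st and 3rd tokens are flagged
    print(joint_loss(positive, toxic, mask))
```

Under these assumptions, raising the probability of positive-sample tokens lowers the MLE term, while lowering the probability of flagged toxic tokens lowers the UT term; unflagged tokens in the toxic sample are left unpenalized.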

CV · Dec 6, 2016
Automatic Event Detection for Signal-based Surveillance

Jingxin Xu, Clinton Fookes, Sridha Sridharan

Signal-based surveillance systems such as closed-circuit television (CCTV) have been widely installed in public places. These systems are normally used to find events of security interest, and play a significant role in public safety. Though such systems are still heavily reliant on human labour to monitor the captured information, a number of automatic techniques have been proposed for analysing the data. This article provides an overview of automatic surveillance event detection techniques. Despite its popularity in research, automatic event detection is still too challenging a problem to be realised in a real-world deployment. The challenges come not only from the detection techniques, such as signal processing and machine learning, but also from the experimental design, with factors such as data collection, evaluation protocols, and ground-truth annotation. Finally, this article proposes that multi-disciplinary research is the path towards a solution to this problem.