CLApr 16, 2024Code
DESTEIN: Navigating Detoxification of Language Models via Universal Steering Pairs and Head-wise Activation FusionYu Li, Han Jiang, Chuanyang Gong et al.
Despite the remarkable achievements of language models (LMs) across a broad spectrum of tasks, their propensity for generating toxic outputs remains a prevalent concern. Current solutions involving finetuning or auxiliary models usually require extensive computational resources, hindering their practicality in large language models (LLMs). In this paper, we propose DeStein, a novel method that detoxifies LMs by applying representation engineering in activation spaces with lower resource and time costs. Specifically, we derive detoxification vectors from self-induced, universal steering pairs through arithmetic operations in activation spaces. During inference, detoxification is achieved by fusing the detoxification vectors with the original representations in a head-wise manner. Empirical results demonstrate that our method significantly outperforms previous state-of-the-art approaches on various metrics, while also maintaining satisfactory generation quality and diversity. We further validate the practicality and scalability of DeStein with a series of white-box LLMs. The method is open-sourced at https://github.com/LizLizLi/DeStein. Warning: Some example model outputs may contain highly offensive or disturbing text.
CVApr 1, 2024
Slightly Shift New Classes to Remember Old Classes for Video Class-Incremental LearningJian Jiao, Yu Dai, Hefei Mei et al.
Recent video class-incremental learning usually excessively pursues the accuracy of the newly seen classes and relies on memory sets to mitigate catastrophic forgetting of the old classes. However, limited storage only allows storing a few representative videos. So we propose SNRO, which slightly shifts the features of new classes to remember old classes. Specifically, SNRO contains Examples Sparse(ES) and Early Break(EB). ES decimates at a lower sample rate to build memory sets and uses interpolation to align those sparse frames in the future. By this, SNRO stores more examples under the same memory consumption and forces the model to focus on low-semantic features which are harder to be forgotten. EB terminates the training at a small epoch, preventing the model from overstretching into the high-semantic space of the current task. Experiments on UCF101, HMDB51, and UESTC-MMEA-CL datasets show that SNRO performs better than other approaches while consuming the same memory consumption.
LGApr 16, 2019
DSTP-RNN: a dual-stage two-phase attention-based recurrent neural networks for long-term and multivariate time series predictionYeqi Liu, Chuanyang Gong, Ling Yang et al.
Long-term prediction of multivariate time series is still an important but challenging problem. The key to solve this problem is to capture the spatial correlations at the same time, the spatio-temporal relationships at different times and the long-term dependence of the temporal relationships between different series. Attention-based recurrent neural networks (RNN) can effectively represent the dynamic spatio-temporal relationships between exogenous series and target series, but it only performs well in one-step time prediction and short-term time prediction. In this paper, inspired by human attention mechanism including the dual-stage two-phase (DSTP) model and the influence mechanism of target information and non-target information, we propose DSTP-based RNN (DSTP-RNN) and DSTP-RNN-2 respectively for long-term time series prediction. Specifically, we first propose the DSTP-based structure to enhance the spatial correlations between exogenous series. The first phase produces violent but decentralized response weight, while the second phase leads to stationary and concentrated response weight. Secondly, we employ multiple attentions on target series to boost the long-term dependence. Finally, we study the performance of deep spatial attention mechanism and provide experiment and interpretation. Our methods outperform nine baseline methods on four datasets in the fields of energy, finance, environment and medicine, respectively.
CVOct 9, 2018
Real time expert system for anomaly detection of aerators based on computer vision technology and existing surveillance camerasYeqi Liu, Yingyi Chen, Huihui Yu et al.
Aerators are essential and crucial auxiliary devices in intensive culture, especially in industrial culture in China. The traditional methods cannot accurately detect abnormal condition of aerators in time. Surveillance cameras are widely used as visual perception modules of the Internet of Things, and then using these widely existing surveillance cameras to realize real-time anomaly detection of aerators is a cost-free and easy-to-promote method. However, it is difficult to develop such an expert system due to some technical and applied challenges, e.g., illumination, occlusion, complex background, etc. To tackle these aforementioned challenges, we propose a real-time expert system based on computer vision technology and existing surveillance cameras for anomaly detection of aerators, which consists of two modules, i.e., object region detection and working state detection. First, it is difficult to detect the working state for some small object regions in whole images, and the time complexity of global feature comparison is also high, so we present an object region detection method based on the region proposal idea. Moreover, we propose a novel algorithm called reference frame Kanade-Lucas-Tomasi (RF-KLT) algorithm for motion feature extraction in fixed regions. Then, we present a dimension reduction method of time series for establishing a feature dataset with obvious boundaries between classes. Finally, we use machine learning algorithms to build the feature classifier. The experimental results in both the actual video dataset and the augmented video dataset show that the accuracy for detecting object region and working state of aerators is 100% and 99.9% respectively, and the detection speed is 77-333 frames per second (FPS) according to the different types of surveillance cameras.