LG · Oct 19, 2022
Training set cleansing of backdoor poisoning by self-supervised representation learning
H. Wang, S. Karami, O. Dia et al.
A backdoor or Trojan attack is an important type of data poisoning attack against deep neural network (DNN) classifiers, wherein the training dataset is poisoned with a small number of samples that each possess the backdoor pattern (usually a pattern that is either imperceptible or innocuous) and that are mislabeled to the attacker's target class. When trained on a backdoor-poisoned dataset, a DNN behaves normally on most benign test samples but misclassifies a test sample to the target class when the sample has the backdoor pattern incorporated (i.e., contains a backdoor trigger). Here we focus on image classification tasks and show that supervised training may build a stronger association between the backdoor pattern and the associated target class than that between normal features and the true class of origin. By contrast, self-supervised representation learning ignores the labels of samples and learns a feature embedding based on the images' semantic content. Using a feature embedding found by self-supervised representation learning, a data cleansing method, which combines sample filtering and re-labeling, is developed. Experiments on CIFAR-10 benchmark datasets show that our method achieves state-of-the-art performance in mitigating backdoor attacks.
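The abstract does not spell out how filtering and re-labeling are combined, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes the embeddings have already been produced by some label-free (self-supervised) encoder, and it uses a k-nearest-neighbour vote in that embedding space; the function name `cleanse_labels` and both thresholds are hypothetical.

```python
import math
from collections import Counter

def cleanse_labels(embeddings, labels, k=3, agree_threshold=0.5, relabel_threshold=0.8):
    """Keep, re-label, or drop each sample based on its neighbours' labels.

    ASSUMPTION: this k-NN voting rule is illustrative only; the source
    abstract does not specify the actual filtering/re-labeling criteria.
    Returns a list of (index, label) pairs for the retained samples.
    """
    kept = []
    for i, (x, y) in enumerate(zip(embeddings, labels)):
        # Rank all other samples by Euclidean distance in embedding space.
        ranked = sorted((math.dist(x, z), j) for j, z in enumerate(embeddings) if j != i)
        votes = Counter(labels[j] for _, j in ranked[:k])
        majority, count = votes.most_common(1)[0]
        if votes[y] / k >= agree_threshold:
            kept.append((i, y))          # label agrees with neighbours: keep as-is
        elif count / k >= relabel_threshold:
            kept.append((i, majority))   # neighbours strongly agree on another class: re-label
        # otherwise the sample is ambiguous and is filtered out
    return kept

# Two well-separated clusters; sample 3 sits in cluster 0 but carries label 1,
# mimicking a poisoned sample mislabeled to the attacker's target class.
embeddings = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 1, 1, 1, 1, 1]
cleaned = cleanse_labels(embeddings, labels)
```

In this toy setup the mislabeled sample is re-assigned to the class of its neighbours, because the label-free embedding places it with images of its true class rather than with the target class.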
NA · Dec 2, 2018
The Wasserstein-Fisher-Rao metric for waveform based earthquake location
D. T. Zhou, J. Chen, H. Wu et al.
In our previous work [Chen et al., J. Comput. Phys., 373(2018)], the quadratic Wasserstein metric was successfully applied to the earthquake location problem: the actual earthquake hypocenter can be accurately recovered starting from initial values very far from the true ones. However, the seismic wave signals need to be normalized, since the quadratic Wasserstein metric requires mass conservation. This brings a critical difficulty. Since the amplitude of a seismogram at a receiver is a good representation of the distance between the source and the receiver, simply normalizing the signals makes the objective function in the optimization process insensitive to that distance. When the data is contaminated with strong noise, the minimum point of the objective function deviates, leading to an inaccurate location result. To overcome this difficulty, we apply the Wasserstein-Fisher-Rao (WFR) metric [Chizat et al., Found. Comput. Math., 18(2018)] to the earthquake location problem. The WFR metric is one of the newly developed metrics in unbalanced optimal transport theory. It does not require normalization of the seismic signals. Thus, the amplitude of the seismograms can be considered as a new constraint, which can substantially improve the sensitivity of the objective function to the distance between the source and the receiver. As a result, we can expect more accurate location results from the WFR-metric-based method than from those based on the quadratic Wasserstein metric under high-intensity noise. The numerical examples also demonstrate this.
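The loss of amplitude information under normalization can be seen in a toy calculation. The sketch below is an illustration only: the Gaussian pulses are hypothetical non-negative stand-ins for processed seismograms, and both are given the same arrival time purely to isolate the amplitude effect. It does not compute either the quadratic Wasserstein or the WFR metric.

```python
import math

def gaussian_pulse(times, arrival, amplitude, width=0.2):
    """Non-negative toy signal; a hypothetical stand-in for a processed seismogram."""
    return [amplitude * math.exp(-((t - arrival) / width) ** 2) for t in times]

def normalize(signal):
    """Rescale to unit total mass, as a mass-conserving (balanced) metric requires."""
    total = sum(signal)
    return [s / total for s in signal]

times = [0.01 * i for i in range(500)]
near = gaussian_pulse(times, arrival=2.0, amplitude=4.0)  # receiver close to the source
far = gaussian_pulse(times, arrival=2.0, amplitude=1.0)   # receiver farther away

# Before normalization the two records differ by a factor of four in total mass.
mass_ratio = sum(near) / sum(far)

# After normalization they coincide, so any metric on the normalized signals
# cannot distinguish the two source-receiver distances from amplitude alone.
max_gap = max(abs(a - b) for a, b in zip(normalize(near), normalize(far)))
```

Here `mass_ratio` retains the amplitude difference while `max_gap` is zero up to rounding, which is the insensitivity the abstract describes; an unbalanced metric can compare `near` and `far` directly without the normalization step.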
CV · Jul 15, 2025
NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation
X. Feng, H. Yu, M. Wu et al.
With the rapid development of foundation video generation technologies, long video generation models have exhibited promising research potential thanks to an expanded content creation space. Recent studies reveal that the goal of long video generation tasks is not only to extend video duration but also to accurately express richer narrative content within longer videos. However, due to the lack of evaluation benchmarks specifically designed for long video generation models, the current assessment of these models primarily relies on benchmarks with simple narrative prompts (e.g., VBench). To the best of our knowledge, our proposed NarrLV is the first benchmark to comprehensively evaluate the Narrative expression capabilities of Long Video generation models. Inspired by film narrative theory, (i) we first introduce the basic narrative unit that maintains continuous visual presentation in videos as the Temporal Narrative Atom (TNA), and use its count to quantitatively measure narrative richness. Guided by three key film narrative elements influencing TNA changes, we construct an automatic prompt generation pipeline capable of producing evaluation prompts with a flexibly expandable number of TNAs. (ii) Then, based on the three progressive levels of narrative content expression, we design an effective evaluation metric using an MLLM-based question generation and answering framework. (iii) Finally, we conduct extensive evaluations on existing long video generation models and foundation generation models. Experimental results demonstrate that our metric aligns closely with human judgments. The derived evaluation outcomes reveal the detailed capability boundaries of current video generation models in narrative content expression.
SY · Aug 18, 2021
Nonlinear Autoregression with Convergent Dynamics on Novel Computational Platforms
J. Chen, H. I. Nurdin
Nonlinear stochastic modeling is useful for describing complex engineering systems. Meanwhile, neuromorphic (brain-inspired) computing paradigms are developing to tackle tasks that are challenging and resource intensive on digital computers. An emerging scheme is reservoir computing which exploits nonlinear dynamical systems for temporal information processing. This paper introduces reservoir computers with output feedback as stationary and ergodic infinite-order nonlinear autoregressive models. We highlight the versatility of this approach by employing classical and quantum reservoir computers to model synthetic and real data sets, further exploring their potential for control applications.
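As a rough illustration of what "convergent dynamics" means for a reservoir driven by its own past outputs, the sketch below uses a two-node tanh reservoir with hand-picked weights. Everything here is an assumption made for the example (the update rule, the matrix `A`, the vector `b`, and the sine series standing in for observed outputs); the abstract does not specify the paper's actual models.

```python
import math

def reservoir_step(state, feedback, A, b):
    """One update x_{k+1} = tanh(A x_k + b y_k), where y_k is the fed-back output.

    ASSUMPTION: a generic echo-state-style update chosen for illustration.
    """
    return [
        math.tanh(sum(a * s for a, s in zip(row, state)) + b_i * feedback)
        for row, b_i in zip(A, b)
    ]

# Hypothetical weights. Each row of A has absolute sum 0.5, and tanh is
# 1-Lipschitz, so the map contracts the state by at least a factor of 2 per step.
A = [[0.2, -0.3], [0.1, 0.4]]
b = [0.5, -0.5]

# Observed output series fed back into the reservoir (teacher forcing).
observed = [math.sin(0.3 * k) for k in range(30)]

x = [1.0, 1.0]    # two very different initial states
z = [-1.0, -1.0]
for y in observed:
    x = reservoir_step(x, y, A, b)
    z = reservoir_step(z, y, A, b)

# The initial condition is forgotten: both trajectories end up (almost) equal,
# so the state depends only on the history of past outputs.
gap = max(abs(p - q) for p, q in zip(x, z))
```

Because the state is asymptotically a function of the past outputs alone, a linear readout of `x` predicts the next output from an effectively unbounded history, which is the sense in which such a model behaves like an infinite-order nonlinear autoregression.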
LG · May 25, 2021
Boosting-GNN: Boosting Algorithm for Graph Networks on Imbalanced Node Classification
S. Shi, K. Qiao, S. Yang et al.
The Graph Neural Network (GNN) has been widely used for graph data representation. However, existing research mostly considers ideal balanced datasets, and imbalanced datasets are rarely considered. Traditional methods for dealing with imbalanced datasets, such as resampling, reweighting, and synthetic samples, are no longer applicable in GNNs. This paper proposes an ensemble model called Boosting-GNN, which uses GNNs as the base classifiers during boosting. In Boosting-GNN, higher weights are set for the training samples that are not correctly classified by the previous classifier, thus achieving higher classification accuracy and better reliability. In addition, transfer learning is used to reduce computational cost and increase fitting ability. Experimental results indicate that the proposed Boosting-GNN model achieves better performance than GCN, GraphSAGE, GAT, SGC, N-GCN, and the most advanced reweighting and resampling methods on synthetic imbalanced datasets, with an average performance improvement of 4.5%.
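The re-weighting step described above can be sketched with the classic binary AdaBoost update. This is an assumption for illustration: the abstract only says that misclassified samples receive higher weights, and the exact rule used by Boosting-GNN (and how a GNN base classifier consumes the weights) is not stated. The function name `boosting_round` is hypothetical.

```python
import math

def boosting_round(weights, correct):
    """One AdaBoost-style update after a base classifier has been evaluated.

    weights: current per-sample weights (e.g. one per training node).
    correct: booleans, True where the base classifier was right.
    Returns (classifier_weight, new_sample_weights).
    ASSUMPTION: classic binary AdaBoost, not necessarily the paper's rule.
    """
    err = sum(w for w, c in zip(weights, correct) if not c) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)           # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)          # vote of this base classifier
    updated = [w * math.exp(-alpha if c else alpha)  # shrink if right, grow if wrong
               for w, c in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Four equally weighted samples; the base classifier misses only the last one.
alpha, new_weights = boosting_round([0.25] * 4, [True, True, True, False])
```

After the update the single misclassified sample carries half of the total weight, so the next base classifier is pushed to fit it. On an imbalanced dataset the misclassified samples tend to come from minority classes, which is why this kind of re-weighting helps there.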
IR · Aug 13, 2020
A Comprehensive Pipeline for Hotel Recommendation System
J. Chen, Z. Gao
This paper presents a comprehensive pipeline for building a hotel recommendation system from raw data collected by apps on users' smartphones. The pipeline mainly consists of pre-processing the raw data and training prediction models. We use two methods, Support Vector Machine (SVM) and Recurrent Neural Network (RNN). The results show that both methods achieve reasonable accuracy once the raw data has been pre-processed. We therefore conclude that the pipeline successfully takes a hotel recommendation system from raw data to specific applications.
IR · Dec 20, 2014
Semantic Modelling with Long-Short-Term Memory for Information Retrieval
H. Palangi, L. Deng, Y. Shen et al.
In this paper we address the following problem in web document and information retrieval (IR): How can we use long-term context information to gain better IR performance? Unlike common IR methods that use a bag-of-words representation for queries and documents, we treat them as sequences of words and use long short-term memory (LSTM) to capture contextual dependencies. To the best of our knowledge, this is the first time that LSTM has been applied to information retrieval tasks. The training strategy differs from that of traditional LSTMs because of the special nature of the information retrieval problem. Experimental evaluation on an IR task derived from the Bing web search demonstrates the ability of the proposed method to address both lexical mismatch and long-term context modelling issues, thereby significantly outperforming existing state-of-the-art methods for the web document retrieval task.