Chunjie Zhou

h-index30

7papers

45citations

Novelty35%

AI Score25

Ranked #163,871 of 194,257 authors (top 84%)#4,664 in CR (top 69%)

7 Papers

2.9CRSep 26, 2022Code

Neural-FacTOR: Neural Representation Learning for Website Fingerprinting Attack over TOR Anonymity

Haili Sun, Yan Huang, Lansheng Han et al.

TOR (The Onion Router) network is a widely used open source anonymous communication tool, the abuse of TOR makes it difficult to monitor the proliferation of online crimes such as to access criminal websites. Most existing approches for TOR network de-anonymization heavily rely on manually extracted features resulting in time consuming and poor performance. To tackle the shortcomings, this paper proposes a neural representation learning approach to recognize website fingerprint based on classification algorithm. We constructed a new website fingerprinting attack model based on convolutional neural network (CNN) with dilation and causal convolution, which can improve the perception field of CNN as well as capture the sequential characteristic of input data. Experiments on three mainstream public datasets show that the proposed model is robust and effective for the website fingerprint classification and improves the accuracy by 12.21% compared with the state-of-the-art methods.

2.3CRFeb 21, 2023

Few-shot Detection of Anomalies in Industrial Cyber-Physical System via Prototypical Network and Contrastive Learning

Haili Sun, Yan Huang, Lansheng Han et al.

The rapid development of Industry 4.0 has amplified the scope and destructiveness of industrial Cyber-Physical System (CPS) by network attacks. Anomaly detection techniques are employed to identify these attacks and guarantee the normal operation of industrial CPS. However, it is still a challenging problem to cope with scenarios with few labeled samples. In this paper, we propose a few-shot anomaly detection model (FSL-PN) based on prototypical network and contrastive learning for identifying anomalies with limited labeled data from industrial CPS. Specifically, we design a contrastive loss to assist the training process of the feature extractor and learn more fine-grained features to improve the discriminative performance. Subsequently, to tackle the overfitting issue during classifying, we construct a robust cost function with a specific regularizer to enhance the generalization capability. Experimental results based on two public imbalanced datasets with few-shot settings show that the FSL-PN model can significantly improve F1 score and reduce false alarm rate (FAR) for identifying anomalous signals to guarantee the security of industrial CPS.

9.2LGFeb 14, 2024

Research and application of Transformer based anomaly detection model: A literature review

Mingrui Ma, Lansheng Han, Chunjie Zhou

Transformer, as one of the most advanced neural network models in Natural Language Processing (NLP), exhibits diverse applications in the field of anomaly detection. To inspire research on Transformer-based anomaly detection, this review offers a fresh perspective on the concept of anomaly detection. We explore the current challenges of anomaly detection and provide detailed insights into the operating principles of Transformer and its variants in anomaly detection tasks. Additionally, we delineate various application scenarios for Transformer-based anomaly detection models and discuss the datasets and evaluation metrics employed. Furthermore, this review highlights the key challenges in Transformer-based anomaly detection research and conducts a comprehensive analysis of future research trends in this domain. The review includes an extensive compilation of over 100 core references related to Transformer-based anomaly detection. To the best of our knowledge, this is the first comprehensive review that focuses on the research related to Transformer in the context of anomaly detection. We hope that this paper can provide detailed technical information to researchers interested in Transformer-based anomaly detection tasks.

8.5CRApr 28, 2024

Research and application of artificial intelligence based webshell detection model: A literature review

Mingrui Ma, Lansheng Han, Chunjie Zhou

Webshell, as the "culprit" behind numerous network attacks, is one of the research hotspots in the field of cybersecurity. However, the complexity, stealthiness, and confusing nature of webshells pose significant challenges to the corresponding detection schemes. With the rise of Artificial Intelligence (AI) technology, researchers have started to apply different intelligent algorithms and neural network architectures to the task of webshell detection. However, the related research still lacks a systematic and standardized methodological process, which is confusing and redundant. Therefore, following the development timeline, we carefully summarize the progress of relevant research in this field, dividing it into three stages: Start Stage, Initial Development Stage, and In-depth Development Stage. We further elaborate on the main characteristics and core algorithms of each stage. In addition, we analyze the pain points and challenges that still exist in this field and predict the future development trend of this field from our point of view. To the best of our knowledge, this is the first review that details the research related to AI-based webshell detection. It is also hoped that this paper can provide detailed technical information for more researchers interested in AI-based webshell detection tasks.

5.8CRFeb 12, 2024Code

Large Language Models are Few-shot Generators: Proposing Hybrid Prompt Algorithm To Generate Webshell Escape Samples

Mingrui Ma, Lansheng Han, Chunjie Zhou

The frequent occurrence of cyber-attacks has made webshell attacks and defense gradually become a research hotspot in the field of network security. However, the lack of publicly available benchmark datasets and the over-reliance on manually defined rules for webshell escape sample generation have slowed down the progress of research related to webshell escape sample generation and artificial intelligence (AI)-based webshell detection. To address the drawbacks of weak webshell sample escape capabilities, the lack of webshell datasets with complex malicious features, and to promote the development of webshell detection, we propose the Hybrid Prompt algorithm for webshell escape sample generation with the help of large language models. As a prompt algorithm specifically developed for webshell sample generation, the Hybrid Prompt algorithm not only combines various prompt ideas including Chain of Thought, Tree of Thought, but also incorporates various components such as webshell hierarchical module and few-shot example to facilitate the LLM in learning and reasoning webshell escape strategies. Experimental results show that the Hybrid Prompt algorithm can work with multiple LLMs with excellent code reasoning ability to generate high-quality webshell samples with high Escape Rate (88.61% with GPT-4 model on VirusTotal detection engine) and (Survival Rate 54.98% with GPT-4 model).

2.6LGApr 12, 2024

HCL-MTSAD: Hierarchical Contrastive Consistency Learning for Accurate Detection of Industrial Multivariate Time Series Anomalies

Haili Sun, Yan Huang, Lansheng Han et al.

Multivariate Time Series (MTS) anomaly detection focuses on pinpointing samples that diverge from standard operational patterns, which is crucial for ensuring the safety and security of industrial applications. The primary challenge in this domain is to develop representations capable of discerning anomalies effectively. The prevalent methods for anomaly detection in the literature are predominantly reconstruction-based and predictive in nature. However, they typically concentrate on a single-dimensional instance level, thereby not fully harnessing the complex associations inherent in industrial MTS. To address this issue, we propose a novel self-supervised hierarchical contrastive consistency learning method for detecting anomalies in MTS, named HCL-MTSAD. It innovatively leverages data consistency at multiple levels inherent in industrial MTS, systematically capturing consistent associations across four latent levels-measurement, sample, channel, and process. By developing a multi-layer contrastive loss, HCL-MTSAD can extensively mine data consistency and spatio-temporal association, resulting in more informative representations. Subsequently, an anomaly discrimination module, grounded in self-supervised hierarchical contrastive learning, is designed to detect timestamp-level anomalies by calculating multi-scale data consistency. Extensive experiments conducted on six diverse MTS datasets retrieved from real cyber-physical systems and server machines, in comparison with 20 baselines, indicate that HCL-MTSAD's anomaly detection capability outperforms the state-of-the-art benchmark models by an average of 1.8\% in terms of F1 score.

2.6LGMar 5, 2024

Unsupervised Spatio-Temporal State Estimation for Fine-grained Adaptive Anomaly Diagnosis of Industrial Cyber-physical Systems

Haili Sun, Yan Huang, Lansheng Han et al.

Accurate detection and diagnosis of abnormal behaviors such as network attacks from multivariate time series (MTS) are crucial for ensuring the stable and effective operation of industrial cyber-physical systems (CPS). However, existing researches pay little attention to the logical dependencies among system working states, and have difficulties in explaining the evolution mechanisms of abnormal signals. To reveal the spatio-temporal association relationships and evolution mechanisms of the working states of industrial CPS, this paper proposes a fine-grained adaptive anomaly diagnosis method (i.e. MAD-Transformer) to identify and diagnose anomalies in MTS. MAD-Transformer first constructs a temporal state matrix to characterize and estimate the change patterns of the system states in the temporal dimension. Then, to better locate the anomalies, a spatial state matrix is also constructed to capture the inter-sensor state correlation relationships within the system. Subsequently, based on these two types of state matrices, a three-branch structure of series-temporal-spatial attention module is designed to simultaneously capture the series, temporal, and space dependencies among MTS. Afterwards, three associated alignment loss functions and a reconstruction loss are constructed to jointly optimize the model. Finally, anomalies are determined and diagnosed by comparing the residual matrices with the original matrices. We conducted comparative experiments on five publicly datasets spanning three application domains (service monitoring, spatial and earth exploration, and water treatment), along with a petroleum refining simulation dataset collected by ourselves. The results demonstrate that MAD-Transformer can adaptively detect fine-grained anomalies with short duration, and outperforms the state-of-the-art baselines in terms of noise robustness and localization performance.