Tianyu Cui

h-index9

11papers

348citations

Novelty46%

AI Score42

Ranked #59,350 of 194,257 authors (top 31%)#123 in NI (top 20%)

11 Papers

2.3NIApr 21, 2022Code

6GAN: IPv6 Multi-Pattern Target Generation via Generative Adversarial Nets with Reinforcement Learning

Tianyu Cui, Gaopeng Gou, Gang Xiong et al.

Global IPv6 scanning has always been a challenge for researchers because of the limited network speed and computational power. Target generation algorithms are recently proposed to overcome the problem for Internet assessments by predicting a candidate set to scan. However, IPv6 custom address configuration emerges diverse addressing patterns discouraging algorithmic inference. Widespread IPv6 alias could also mislead the algorithm to discover aliased regions rather than valid host targets. In this paper, we introduce 6GAN, a novel architecture built with Generative Adversarial Net (GAN) and reinforcement learning for multi-pattern target generation. 6GAN forces multiple generators to train with a multi-class discriminator and an alias detector to generate non-aliased active targets with different addressing pattern types. The rewards from the discriminator and the alias detector help supervise the address sequence decision-making process. After adversarial training, 6GAN's generators could keep a strong imitating ability for each pattern and 6GAN's discriminator obtains outstanding pattern discrimination ability with a 0.966 accuracy. Experiments indicate that our work outperformed the state-of-the-art target generation algorithms by reaching a higher-quality candidate set.

2.3NIApr 20, 2022

6GCVAE: Gated Convolutional Variational Autoencoder for IPv6 Target Generation

Tianyu Cui, Gaopeng Gou, Gang Xiong

IPv6 scanning has always been a challenge for researchers in the field of network measurement. Due to the considerable IPv6 address space, while recent network speed and computational power have been improved, using a brute-force approach to probe the entire network space of IPv6 is almost impossible. Systems are required an algorithmic approach to generate more possible active target candidate sets to probe. In this paper, we first try to use deep learning to design such IPv6 target generation algorithms. The model effectively learns the address structure by stacking the gated convolutional layer to construct Variational Autoencoder (VAE). We also introduce two address classification methods to improve the model effect of the target generation. Experiments indicate that our approach 6GCVAE outperformed the conventional VAE models and the state-of-the-art target generation algorithm in two active address datasets.

6.6CLJul 2, 2024Code

LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis

Tianyu Cui, Shiyu Ma, Ziang Chen et al.

Log analysis is crucial for ensuring the orderly and stable operation of information systems, particularly in the field of Artificial Intelligence for IT Operations (AIOps). Large Language Models (LLMs) have demonstrated significant potential in natural language processing tasks. In the AIOps domain, they excel in tasks such as anomaly detection, root cause analysis of faults, operations and maintenance script generation, and alert information summarization. However, the performance of current LLMs in log analysis tasks remains inadequately validated. To address this gap, we introduce LogEval, a comprehensive benchmark suite designed to evaluate the capabilities of LLMs in various log analysis tasks for the first time. This benchmark covers tasks such as log parsing, log anomaly detection, log fault diagnosis, and log summarization. LogEval evaluates each task using 4,000 publicly available log data entries and employs 15 different prompts for each task to ensure a thorough and fair assessment. By rigorously evaluating leading LLMs, we demonstrate the impact of various LLM technologies on log analysis performance, focusing on aspects such as self-consistency and few-shot contextual learning. We also discuss findings related to model quantification, Chinese-English question-answering evaluation, and prompt engineering. These findings provide insights into the strengths and weaknesses of LLMs in multilingual environments and the effectiveness of different prompt strategies. Various evaluation methods are employed for different tasks to accurately measure the performance of LLMs in log analysis, ensuring a comprehensive assessment. The insights gained from LogEvals evaluation reveal the strengths and limitations of LLMs in log analysis tasks, providing valuable guidance for researchers and practitioners.

8.7CRApr 20, 2022Code

SiamHAN: IPv6 Address Correlation Attacks on TLS Encrypted Traffic via Siamese Heterogeneous Graph Attention Network

Tianyu Cui, Gaopeng Gou, Gang Xiong et al.

Unlike IPv4 addresses, which are typically masked by a NAT, IPv6 addresses could easily be correlated with user activity, endangering their privacy. Mitigations to address this privacy concern have been deployed, making existing approaches for address-to-user correlation unreliable. This work demonstrates that an adversary could still correlate IPv6 addresses with users accurately, even with these protection mechanisms. To do this, we propose an IPv6 address correlation model - SiamHAN. The model uses a Siamese Heterogeneous Graph Attention Network to measure whether two IPv6 client addresses belong to the same user even if the user's traffic is protected by TLS encryption. Using a large real-world dataset, we show that, for the tasks of tracking target users and discovering unique users, the state-of-the-art techniques could achieve only 85% and 60% accuracy, respectively. However, SiamHAN exhibits 99% and 88% accuracy.

21.9CLJan 11, 2024

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

Tianyu Cui, Yanling Wang, Chuanpu Fu et al.

Large language models (LLMs) have strong capabilities in solving diverse natural language processing tasks. However, the safety and security issues of LLM systems have become the major obstacle to their widespread application. Many studies have extensively investigated risks in LLM systems and developed the corresponding mitigation strategies. Leading-edge enterprises such as OpenAI, Google, Meta, and Anthropic have also made lots of efforts on responsible LLMs. Therefore, there is a growing need to organize the existing studies and establish comprehensive taxonomies for the community. In this paper, we delve into four essential modules of an LLM system, including an input module for receiving prompts, a language model trained on extensive corpora, a toolchain module for development and deployment, and an output module for exporting LLM-generated content. Based on this, we propose a comprehensive taxonomy, which systematically analyzes potential risks associated with each module of an LLM system and discusses the corresponding mitigation strategies. Furthermore, we review prevalent benchmarks, aiming to facilitate the risk assessment of LLM systems. We hope that this paper can help LLM participants embrace a systematic perspective to build their responsible LLM systems.

22.0LGApr 5, 2025Code

TrafficLLM: Enhancing Large Language Models for Network Traffic Analysis with Generic Traffic Representation

Tianyu Cui, Xinjie Lin, Sijia Li et al.

Machine learning (ML) powered network traffic analysis has been widely used for the purpose of threat detection. Unfortunately, their generalization across different tasks and unseen data is very limited. Large language models (LLMs), known for their strong generalization capabilities, have shown promising performance in various domains. However, their application to the traffic analysis domain is limited due to significantly different characteristics of network traffic. To address the issue, in this paper, we propose TrafficLLM, which introduces a dual-stage fine-tuning framework to learn generic traffic representation from heterogeneous raw traffic data. The framework uses traffic-domain tokenization, dual-stage tuning pipeline, and extensible adaptation to help LLM release generalization ability on dynamic traffic analysis tasks, such that it enables traffic detection and traffic generation across a wide range of downstream tasks. We evaluate TrafficLLM across 10 distinct scenarios and 229 types of traffic. TrafficLLM achieves F1-scores of 0.9875 and 0.9483, with up to 80.12% and 33.92% better performance than existing detection and generation methods. It also shows strong generalization on unseen traffic with an 18.6% performance improvement. We further evaluate TrafficLLM in real-world scenarios. The results confirm that TrafficLLM is easy to scale and achieves accurate detection performance on enterprise traffic.

11.8CVJan 7, 2025

Evaluating Image Caption via Cycle-consistent Text-to-Image Generation

Tianyu Cui, Jinbin Bai, Guo-Hua Wang et al.

Evaluating image captions typically relies on reference captions, which are costly to obtain and exhibit significant diversity and subjectivity. While reference-free evaluation metrics have been proposed, most focus on cross-modal evaluation between captions and images. Recent research has revealed that the modality gap generally exists in the representation of contrastive learning-based multi-modal systems, undermining the reliability of cross-modality metrics like CLIPScore. In this paper, we propose CAMScore, a cyclic reference-free automatic evaluation metric for image captioning models. To circumvent the aforementioned modality gap, CAMScore utilizes a text-to-image model to generate images from captions and subsequently evaluates these generated images against the original images. Furthermore, to provide fine-grained information for a more comprehensive evaluation, we design a three-level evaluation framework for CAMScore that encompasses pixel-level, semantic-level, and objective-level perspectives. Extensive experiment results across multiple benchmark datasets show that CAMScore achieves a superior correlation with human judgments compared to existing reference-based and reference-free metrics, demonstrating the effectiveness of the framework.

15.5CVNov 28, 2025

Ovis-Image Technical Report

Guo-Hua Wang, Liangfu Cao, Tianyu Cui et al.

We introduce $\textbf{Ovis-Image}$, a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints. Built upon our previous Ovis-U1 framework, Ovis-Image integrates a diffusion-based visual decoder with the stronger Ovis 2.5 multimodal backbone, leveraging a text-centric training pipeline that combines large-scale pre-training with carefully tailored post-training refinements. Despite its compact architecture, Ovis-Image achieves text rendering performance on par with significantly larger open models such as Qwen-Image and approaches closed-source systems like Seedream and GPT4o. Crucially, the model remains deployable on a single high-end GPU with moderate memory, narrowing the gap between frontier-level text rendering and practical deployment. Our results indicate that combining a strong multimodal backbone with a carefully designed, text-focused training recipe is sufficient to achieve reliable bilingual text rendering without resorting to oversized or proprietary models.

2.3NIAug 5, 2020

6VecLM: Language Modeling in Vector Space for IPv6 Target Generation

Tianyu Cui, Gang Xiong, Gaopeng Gou et al.

Fast IPv6 scanning is challenging in the field of network measurement as it requires exploring the whole IPv6 address space but limited by current computational power. Researchers propose to obtain possible active target candidate sets to probe by algorithmically analyzing the active seed sets. However, IPv6 addresses lack semantic information and contain numerous addressing schemes, leading to the difficulty of designing effective algorithms. In this paper, we introduce our approach 6VecLM to explore achieving such target generation algorithms. The architecture can map addresses into a vector space to interpret semantic relationships and uses a Transformer network to build IPv6 language models for predicting address sequence. Experiments indicate that our approach can perform semantic classification on address space. By adding a new generation approach, our model possesses a controllable word innovation capability compared to conventional language models. The work outperformed the state-of-the-art target generation algorithms on two active address datasets by reaching more quality candidate sets.

9.0MLFeb 24, 2020

Informative Bayesian Neural Network Priors for Weak Signals

Tianyu Cui, Aki Havulinna, Pekka Marttinen et al.

Encoding domain knowledge into the prior over the high-dimensional weight space of a neural network is challenging but essential in applications with limited data and weak signals. Two types of domain knowledge are commonly available in scientific applications: 1. feature sparsity (fraction of features deemed relevant); 2. signal-to-noise ratio, quantified, for instance, as the proportion of variance explained (PVE). We show how to encode both types of domain knowledge into the widely used Gaussian scale mixture priors with Automatic Relevance Determination. Specifically, we propose a new joint prior over the local (i.e., feature-specific) scale parameters that encodes knowledge about feature sparsity, and a Stein gradient optimization to tune the hyperparameters in such a way that the distribution induced on the model's PVE matches the prior distribution. We show empirically that the new prior improves prediction accuracy, compared to existing neural network priors, on several publicly available datasets and in a genetics application where signals are weak and sparse, often outperforming even computationally intensive cross-validation for hyperparameter tuning.

15.6LGJan 24, 2019Code

Learning Global Pairwise Interactions with Bayesian Neural Networks

Tianyu Cui, Pekka Marttinen, Samuel Kaski

Estimating global pairwise interaction effects, i.e., the difference between the joint effect and the sum of marginal effects of two input features, with uncertainty properly quantified, is centrally important in science applications. We propose a non-parametric probabilistic method for detecting interaction effects of unknown form. First, the relationship between the features and the output is modelled using a Bayesian neural network, capable of representing complex interactions and principled uncertainty. Second, interaction effects and their uncertainty are estimated from the trained model. For the second step, we propose an intuitive global interaction measure: Bayesian Group Expected Hessian (GEH), which aggregates information of local interactions as captured by the Hessian. GEH provides a natural trade-off between type I and type II error and, moreover, comes with theoretical guarantees ensuring that the estimated interaction effects and their uncertainty can be improved by training a more accurate BNN. The method empirically outperforms available non-probabilistic alternatives on simulated and real-world data. Finally, we demonstrate its ability to detect interpretable interactions between higher-level features (at deeper layers of the neural network).