Dan Lin

CV
h-index: 21
15 papers
794 citations
Novelty: 48%
AI Score: 50

15 Papers

CR · Apr 14 · Code
UniDetect: LLM-Driven Universal Fraud Detection across Heterogeneous Blockchains

Shuyi Miao, Wangjie Qiu, Shengda Zhuo et al.

As cross-chain interoperability advances, decentralized finance (DeFi) protocols enable illicit funds to be reorganized into uniform liquid assets that flow throughout the cryptocurrency market. Such operations can bypass monitoring targeted at individual blockchains and thereby weaken current regulatory frameworks. Motivated by this, we introduce UniDetect, a multi-chain cryptocurrency fraud account detection method based on large language models (LLMs). Specifically, we use domain knowledge to guide the LLM to generate general transaction summary texts applicable to heterogeneous blockchain accounts, which serve as evidence for fraud account detection. Furthermore, we introduce a two-stage alternating training strategy to continuously and dynamically enhance the multimodal joint reasoning for detecting fraudulent accounts based on both the textual evidence and the transaction graph patterns. Experiments on multiple blockchains show that UniDetect outperforms existing methods by 5.57% to 7.58% in the Kolmogorov-Smirnov (KS) statistic. For cross-chain zero-shot detection, UniDetect identifies over 94.58% of fraudulent accounts. It also generalizes well to non-blockchain data, delivering a 6.06% improvement in F1 over existing methods. The dataset and source code are available at https://github.com/msy0513/UniDetect.
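
The abstract reports results in terms of the Kolmogorov-Smirnov (KS) statistic. As a minimal sketch of how KS is commonly computed for a binary fraud classifier (the function name `ks_statistic` and the toy inputs are illustrative assumptions, not taken from the UniDetect code), one can take the largest gap between the cumulative true-positive and false-positive rates as the score threshold is lowered:

```python
def ks_statistic(scores, labels):
    """Largest gap between cumulative positive and negative rates.

    scores: predicted fraud scores (higher means more suspicious).
    labels: 1 for fraudulent accounts, 0 for normal accounts.
    Illustrative helper, not the paper's evaluation code.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every account that shares this score so ties move together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best
```

A perfectly separating scorer gives a KS of 1.0, while a scorer that ranks accounts at random gives a value near 0.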

CR · Sep 9, 2022
Defend Data Poisoning Attacks on Voice Authentication

Ke Li, Cameron Baird, Dan Lin

With the advances in deep learning, speaker verification has achieved very high accuracy and is gaining popularity as a biometric authentication option in many areas of daily life, especially in the growing market of web services. Compared to traditional passwords, "vocal passwords" are much more convenient as they relieve people from memorizing different passwords. However, new machine learning attacks are putting these voice authentication systems at risk. Without a strong security guarantee, attackers could access legitimate users' web accounts by fooling the deep neural network (DNN) based voice recognition models. In this paper, we demonstrate an easy-to-implement data poisoning attack on the voice authentication system, which can hardly be captured by existing defense mechanisms. Thus, we propose a more robust defense method, called Guardian, which is a convolutional neural network-based discriminator. The Guardian discriminator integrates a series of novel techniques including bias reduction, input augmentation, and ensemble learning. Our approach is able to distinguish about 95% of attacked accounts from normal accounts, which is much more effective than existing approaches with only 60% accuracy.

SP · Apr 26, 2023
An EEG Channel Selection Framework for Driver Drowsiness Detection via Interpretability Guidance

Xinliang Zhou, Dan Lin, Ziyu Jia et al.

Drowsy driving has a crucial influence on driving safety, creating an urgent demand for driver drowsiness detection. Electroencephalogram (EEG) signals can accurately reflect the mental fatigue state and thus have been widely studied in drowsiness monitoring. However, raw EEG data is inherently noisy and redundant, which is neglected by existing works that use only single-channel or full-head channel EEG data for model training, resulting in limited performance of driver drowsiness detection. In this paper, we are the first to propose an Interpretability-guided Channel Selection (ICS) framework for the driver drowsiness detection task. Specifically, we design a two-stage training strategy to progressively select the key contributing channels with the guidance of interpretability. We first train a teacher network in the first stage using full-head channel EEG data. Then we apply class activation mapping (CAM) to the trained teacher model to highlight the high-contributing EEG channels and further propose a channel voting scheme to select the top N contributing EEG channels. Finally, we train a student network with the selected channels of EEG data in the second stage for driver drowsiness detection. Experiments are conducted on a public dataset, and the results demonstrate that our method is highly applicable and can significantly improve the performance of cross-subject driver drowsiness detection.

CV · Oct 15, 2024 · Code
Open World Object Detection: A Survey

Yiming Li, Yi Wang, Wenqian Wang et al.

Exploring new knowledge is a fundamental human ability that can be mirrored in the development of deep neural networks, especially in the field of object detection. Open world object detection (OWOD) is an emerging area of research that adapts this principle to explore new knowledge. It focuses on recognizing and learning from objects absent from initial training sets, thereby incrementally expanding its knowledge base when new class labels are introduced. This survey paper offers a thorough review of the OWOD domain, covering essential aspects, including problem definitions, benchmark datasets, source codes, evaluation metrics, and a comparative study of existing methods. Additionally, we investigate related areas like open set recognition (OSR) and incremental learning (IL), underlining their relevance to OWOD. Finally, the paper concludes by addressing the limitations and challenges faced by current OWOD algorithms and proposes directions for future research. To our knowledge, this is the first comprehensive survey of the emerging OWOD field with over one hundred references, marking a significant step forward for object detection technology. Comprehensive source code and benchmarks are archived at https://github.com/ArminLee/OWOD Review.

CV · Aug 3, 2024
MultiFuser: Multimodal Fusion Transformer for Enhanced Driver Action Recognition

Ruoyu Wang, Wenqian Wang, Jianjun Gao et al.

Driver action recognition, aiming to accurately identify drivers' behaviors, is crucial for enhancing driver-vehicle interactions and ensuring driving safety. Unlike in general action recognition, drivers' environments are often challenging, being gloomy and dark; with the development of sensors, various cameras such as IR and depth cameras have emerged for analyzing drivers' behaviors. Therefore, in this paper, we propose a novel multimodal fusion transformer, named MultiFuser, which identifies cross-modal interrelations and interactions among multimodal car cabin videos and adaptively integrates different modalities for improved representations. Specifically, MultiFuser comprises layers of Bi-decomposed Modules to model spatiotemporal features, with a modality synthesizer for multimodal feature integration. Each Bi-decomposed Module includes a Modal Expertise ViT block for extracting modality-specific features and a Patch-wise Adaptive Fusion block for efficient cross-modal fusion. Extensive experiments are conducted on the Drive&Act dataset and the results demonstrate the efficacy of our proposed approach.

CV · Sep 4, 2025
A Generative Foundation Model for Chest Radiography

Yuanfeng Ji, Dan Lin, Xiyue Wang et al.

The scarcity of well-annotated diverse medical images is a major hurdle for developing reliable AI models in healthcare. Substantial technical advances have been made in generative foundation models for natural images. Here we develop 'ChexGen', a generative vision-language foundation model that introduces a unified framework for text-, mask-, and bounding box-guided synthesis of chest radiographs. Built upon the latent diffusion transformer architecture, ChexGen was pretrained on the largest curated chest X-ray dataset to date, consisting of 960,000 radiograph-report pairs. ChexGen achieves accurate synthesis of radiographs through expert evaluations and quantitative metrics. We demonstrate the utility of ChexGen for training data augmentation and supervised pretraining, which led to performance improvements across disease classification, detection, and segmentation tasks using a small fraction of training data. Further, our model enables the creation of diverse patient cohorts that enhance model fairness by detecting and mitigating demographic biases. Our study supports the transformative role of generative foundation models in building more accurate, data-efficient, and equitable medical AI systems.

CR · Aug 28, 2025
BridgeShield: Enhancing Security for Cross-chain Bridge Applications via Heterogeneous Graph Mining

Dan Lin, Shunfeng Lu, Ziyan Liu et al.

Cross-chain bridges play a vital role in enabling blockchain interoperability. However, due to the inherent design flaws and the enormous value they hold, they have become prime targets for hacker attacks. Existing detection methods show progress yet remain limited, as they mainly address single-chain behaviors and fail to capture cross-chain semantics. To address this gap, we leverage heterogeneous graph attention networks, which are well-suited for modeling multi-typed entities and relations, to capture the complex execution semantics of cross-chain behaviors. We propose BridgeShield, a detection framework that jointly models the source chain, off-chain coordination, and destination chain within a unified heterogeneous graph representation. BridgeShield incorporates intra-meta-path attention to learn fine-grained dependencies within cross-chain paths and inter-meta-path attention to highlight discriminative cross-chain patterns, thereby enabling precise identification of attack behaviors. Extensive experiments on 51 real-world cross-chain attack events demonstrate that BridgeShield achieves an average F1-score of 92.58%, representing a 24.39% improvement over state-of-the-art baselines. These results validate the effectiveness of BridgeShield as a practical solution for securing cross-chain bridges and enhancing the resilience of multi-chain ecosystems.

CV · Jun 17, 2024
CM2-Net: Continual Cross-Modal Mapping Network for Driver Action Recognition

Ruoyu Wang, Chen Cai, Wenqian Wang et al.

Driver action recognition has significantly advanced in enhancing driver-vehicle interactions and ensuring driving safety by integrating multiple modalities, such as infrared and depth. Nevertheless, compared to the RGB modality alone, it is laborious and costly to collect extensive data for all types of non-RGB modalities in car cabin environments. Therefore, previous works have suggested independently learning each non-RGB modality by fine-tuning a model pre-trained on RGB videos, but these methods are less effective in extracting informative features when faced with newly-incoming modalities due to large domain gaps. In contrast, we propose a Continual Cross-Modal Mapping Network (CM2-Net) to continually learn each newly-incoming modality with instructive prompts from the previously-learned modalities. Specifically, we have developed Accumulative Cross-modal Mapping Prompting (ACMP) to map the discriminative and informative features learned from previous modalities into the feature space of newly-incoming modalities. Then, when faced with newly-incoming modalities, these mapped features are able to provide effective prompts for which features should be extracted and prioritized. These prompts accumulate throughout the continual learning process, thereby further boosting recognition performance. Extensive experiments conducted on the Drive&Act dataset demonstrate the superior performance of CM2-Net on both uni- and multi-modal driver action recognition.

CV · Jan 26, 2024
Multi-modality action recognition based on dual feature shift in vehicle cabin monitoring

Dan Lin, Philip Hann Yung Lee, Yiming Li et al.

Driver Action Recognition (DAR) is crucial in vehicle cabin monitoring systems. In real-world applications, it is common for vehicle cabins to be equipped with cameras featuring different modalities. However, multi-modality fusion strategies for the DAR task within car cabins have rarely been studied. In this paper, we propose a novel yet efficient multi-modality driver action recognition method based on dual feature shift, named DFS. DFS first integrates complementary features across modalities by performing modality feature interaction. Meanwhile, DFS achieves the neighbour feature propagation within single modalities, by feature shifting among temporal frames. To learn common patterns and improve model efficiency, DFS shares feature extracting stages among multiple modalities. Extensive experiments have been carried out to verify the effectiveness of the proposed DFS model on the Drive&Act dataset. The results demonstrate that DFS achieves good performance and improves the efficiency of multi-modality driver action recognition.
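
Feature shifting among temporal frames can be illustrated with a generic temporal shift operation. This is a sketch only: the abstract does not give DFS's exact shift rule, so the fraction of shifted channels (`fold_div`), the zero padding at the clip boundaries, and the function name `temporal_shift` are all assumptions, and the cross-modality interaction is not modeled here.

```python
def temporal_shift(frames, fold_div=4):
    """Shift a slice of channels one step along the time axis.

    frames: list of T per-frame feature vectors, each a list of C values.
    The first C // fold_div channels are taken from the next frame, the
    following C // fold_div channels from the previous frame, and the
    rest stay in place. Positions with no source frame are set to 0.0.
    """
    num_frames, num_channels = len(frames), len(frames[0])
    fold = num_channels // fold_div
    shifted = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for ch in range(num_channels):
            if ch < fold:
                src = t + 1      # borrow this channel from the next frame
            elif ch < 2 * fold:
                src = t - 1      # borrow this channel from the previous frame
            else:
                src = t          # leave this channel untouched
            if 0 <= src < num_frames:
                shifted[t][ch] = frames[src][ch]
    return shifted
```

Because the shift only moves existing values, it lets neighbouring frames exchange information without adding parameters, which is consistent with the efficiency goal stated in the abstract.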

AI · May 17, 2023
River of No Return: Graph Percolation Embeddings for Efficient Knowledge Graph Reasoning

Kai Wang, Siqiang Luo, Dan Lin

We study Graph Neural Networks (GNNs)-based embedding techniques for knowledge graph (KG) reasoning. For the first time, we link the path redundancy issue in the state-of-the-art KG reasoning models based on path encoding and message passing to the transformation error in model training, which brings us new theoretical insights into KG reasoning, as well as high efficacy in practice. On the theoretical side, we analyze the entropy of transformation error in KG paths and point out query-specific redundant paths causing entropy increases. These findings guide us to maintain the shortest paths and remove redundant paths for minimized-entropy message passing. To achieve this goal, on the practical side, we propose an efficient Graph Percolation Process motivated by the percolation model in Fluid Mechanics, and design a lightweight GNN-based KG reasoning framework called Graph Percolation Embeddings (GraPE). GraPE outperforms previous state-of-the-art methods in both transductive and inductive reasoning tasks while requiring fewer training parameters and less inference time.

AI · Mar 27, 2021
Hyperbolic Geometry is Not Necessary: Lightweight Euclidean-Based Models for Low-Dimensional Knowledge Graph Embeddings

Kai Wang, Yu Liu, Dan Lin et al.

Recent knowledge graph embedding (KGE) models based on hyperbolic geometry have shown great potential in a low-dimensional embedding space. However, the necessity of hyperbolic space in KGE is still questionable, because the calculation based on hyperbolic geometry is much more complicated than Euclidean operations. In this paper, based on the state-of-the-art hyperbolic-based model RotH, we develop two lightweight Euclidean-based models, called RotL and Rot2L. The RotL model simplifies the hyperbolic operations while keeping the flexible normalization effect. Utilizing a novel two-layer stacked transformation and based on RotL, the Rot2L model obtains an improved representation capability, yet costs fewer parameters and calculations than RotH. The experiments on link prediction show that Rot2L achieves the state-of-the-art performance on two widely-used datasets in low-dimensional knowledge graph embeddings. Furthermore, RotL achieves similar performance as RotH but only requires half of the training time.

CR · Mar 19, 2021
Fight Virus Like a Virus: A New Defense Method Against File-Encrypting Ransomware

Joshua Morris, Dan Lin, Marcellus Smith

Nowadays ransomware has become a new profitable form of attack. This type of malware acts as a form of extortion: it encrypts the files on a victim's computer and forces the victim to pay a ransom to have the data recovered. Even companies and tech-savvy people must use extensive resources to maintain backups for recovery or else they will lose valuable data, not to mention average users. Unfortunately, no recovery tool can effectively defend against the various types of ransomware. To address this challenge, we propose a novel ransomware defense mechanism that can be easily deployed on modern Windows systems to recover the data and mitigate a ransomware attack. The uniqueness of our approach is to fight the virus like a virus. We leverage Alternative Data Streams, which are sometimes used by malicious applications, to develop a data protection method that misleads the ransomware into attacking only file 'shells' instead of the actual file content. We evaluated different file-encrypting ransomware and demonstrate the usability, efficiency and effectiveness of our approach.

CV · Sep 5, 2019
PRSNet: Part Relation and Selection Network for Bone Age Assessment

Yuanfeng Ji, Hao Chen, Dan Lin et al.

Bone age is one of the most important indicators for assessing bone maturity, which can help to interpret a person's growth and development level and potential progress. In clinical practice, bone age assessment (BAA) of X-ray images requires the joint consideration of the appearance and location information of hand bones. These kinds of information can be effectively captured by the relation of different anatomical parts of the hand bone. Recently developed methods differ mostly in how they model the part relation and choose useful parts for BAA. However, these methods neglect the mining of relationships among different parts, which can help to improve the assessment accuracy. In this paper, we propose a novel part relation module, which accurately discovers the underlying concurrency of parts by using multi-scale context information of deep learning feature representation. Furthermore, based on the part relation, we explore a new part selection module, which comprehensively measures the importance of parts and selects the top-ranking parts for assisting BAA. We jointly train our part relation and selection modules in an end-to-end way, achieving state-of-the-art performance on the public RSNA 2017 Pediatric Bone Age benchmark dataset and outperforming other competitive methods by a significant margin.

CR · Jan 29, 2019
Hiding in the Clouds and Building a Stealth Communication Network

Wei Jiang, Adam Bowers, Dan Lin

Social networks, instant messages and file sharing systems are common communication means among friends, families, coworkers, etc. Due to concerns about personal privacy, identity theft, data misuse, freedom of speech and government surveillance, online social or communication networks have provided various options for a user to guard or control his or her personal data. However, for most of these services, user data are still accessible by the service providers, which can lead to both liability issues if a data breach occurs at the server side and data misuse by the network administrators. To prevent service providers from accessing user data, secure end-to-end user communication is a must, like the one provided by WhatsApp. On the other hand, the services provided by such a communication network can still be interfered with by an authority. For a communication network to be stealthy, the following features are essential: (1) oblivious service, (2) complete user control and flexibility, and (3) lightweight. In this paper, we first discuss the features and benefits of a stealth communication network (SNET), and then we propose a theoretical framework that can be adopted to implement an SNET. By utilizing the framework and the existing publicly available cloud storage, we present the implementation details of an instance of SNET, named Secret-Share. Last but not least, we discuss the current limitations of Secret-Share and its potential extensions.

CL · Aug 11, 2018
Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network

Kai Wang, Yu Liu, Xiujuan Xu et al.

Knowledge Graph Embedding (KGE) aims to represent entities and relations of a knowledge graph in a low-dimensional continuous vector space. Recent works focus on incorporating structural knowledge with additional information, such as entity descriptions, relation paths and so on. However, commonly used additional information usually contains plenty of noise, which makes it hard to learn valuable representations. In this paper, we propose a new kind of additional information, called entity neighbors, which contain both semantic and topological features about a given entity. We then develop a deep memory network model to encode information from neighbors. Employing a gating mechanism, representations of structure and neighbors are integrated into a joint representation. The experimental results show that our model outperforms existing KGE methods utilizing entity descriptions and achieves state-of-the-art metrics on 4 datasets.
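
A gating mechanism of the kind mentioned in the abstract is often written as an element-wise convex combination of the two representations. The sketch below assumes that form: the sigmoid gate, the per-dimension gate logits, and the function name `gated_fusion` are illustrative assumptions, not the paper's exact formulation.

```python
import math

def gated_fusion(structure_vec, neighbor_vec, gate_logits):
    """Blend a structure embedding with a neighbor embedding.

    Each gate g = sigmoid(logit) lies in (0, 1); the joint value per
    dimension is g * structure + (1 - g) * neighbor. In a trained model
    the logits would be learned; here they are plain inputs (assumption).
    """
    gates = [1.0 / (1.0 + math.exp(-z)) for z in gate_logits]
    return [
        g * s + (1.0 - g) * n
        for g, s, n in zip(gates, structure_vec, neighbor_vec)
    ]
```

With all logits at zero the result is the plain average of the two vectors, and large positive logits push the joint representation toward the structure embedding.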