Blockchain Large Language Models
This work addresses the need for more effective anomaly detection in blockchain transactions for security applications, though its contribution is incremental: it adapts existing transformer methods to a new domain.
The paper tackles the detection of anomalous blockchain transactions by introducing BlockGPT, a large language model trained from scratch to act as a real-time intrusion detection system for Ethereum. On a dataset of 68M transactions, it ranks 49 out of 124 attacks among the top-3 most abnormal transactions interacting with their victim contracts, with an average batched throughput of 2284 transactions per second.
This paper presents a dynamic, real-time approach to detecting anomalous blockchain transactions. The proposed tool, BlockGPT, generates tracing representations of blockchain activity and trains a large language model from scratch to act as a real-time Intrusion Detection System. Unlike traditional methods, BlockGPT is designed to offer an unrestricted search space and does not rely on predefined rules or patterns, enabling it to detect a broader range of anomalies. We demonstrate the effectiveness of BlockGPT through its use as an anomaly detection tool for Ethereum transactions. In our experiments, it effectively identifies abnormal transactions in a dataset of 68M transactions and has a batched throughput of 2284 transactions per second on average. Our results show that BlockGPT ranks 49 out of 124 attacks among the top-3 most abnormal transactions interacting with their victim contracts. This work contributes to the field of blockchain transaction analysis by introducing a custom data encoding compatible with the transformer architecture, a domain-specific tokenization technique, and a tree encoding method specifically crafted for the Ethereum Virtual Machine (EVM) trace representation.
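
The top-3 ranking metric reported above can be illustrated with a minimal sketch. It assumes each transaction already carries a scalar anomaly score where higher means more abnormal; how BlockGPT derives that score is not described in this abstract. The function name `top_k_hits`, the tuple layout, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def top_k_hits(transactions, attack_hashes, k=3):
    """Count attacks ranked within the k highest-scoring transactions of their contract.

    transactions: iterable of (tx_hash, contract, score) tuples,
                  where a higher score means "more abnormal" (an assumption).
    attack_hashes: set of transaction hashes labelled as attacks.
    """
    # Group transactions by the contract they interact with.
    by_contract = defaultdict(list)
    for tx_hash, contract, score in transactions:
        by_contract[contract].append((score, tx_hash))

    hits = 0
    for scored in by_contract.values():
        # Sort descending by score so the most abnormal transactions come first.
        top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
        hits += sum(1 for _, tx_hash in top if tx_hash in attack_hashes)
    return hits


# Toy data (hypothetical): two victim contracts, one attack on each.
txs = [
    ("0xa1", "victimA", 0.2),
    ("0xa2", "victimA", 9.1),  # attack, ranked 1st for victimA
    ("0xa3", "victimA", 0.4),
    ("0xa4", "victimA", 0.3),
    ("0xb1", "victimB", 5.0),
    ("0xb2", "victimB", 4.0),
    ("0xb3", "victimB", 3.0),
    ("0xb4", "victimB", 1.0),  # attack, ranked 4th for victimB
]
attacks = {"0xa2", "0xb4"}

hits = top_k_hits(txs, attacks, k=3)
print(f"{hits} of {len(attacks)} attacks in top-3")  # prints "1 of 2 attacks in top-3"
```

Under this reading of the metric, the reported result of 49 out of 124 attacks corresponds to a top-3 hit rate of roughly 39.5%.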