ByCAN: Reverse Engineering Controller Area Network (CAN) Messages from Bit to Byte Level
This work addresses a critical bottleneck in automotive cybersecurity and autonomous applications by enabling automated decoding of CAN messages without prior knowledge, though the contribution is incremental because it builds on existing reverse-engineering methods.
The paper tackles the problem of reverse engineering CAN messages in automotive systems, whose decoding specifications are proprietary and unavailable to researchers. It proposes ByCAN, an automated system that achieves slicing accuracy of 80.21%, slicing coverage of 95.21%, and labeling accuracy of 68.72% on real-world data.
As the primary standard protocol for modern cars, the Controller Area Network (CAN) is a critical research target for automotive cybersecurity and autonomous applications. Because the decoding specification of CAN is a proprietary black box maintained by Original Equipment Manufacturers (OEMs), related research and industrial development are challenging without a comprehensive understanding of the meaning of CAN messages. In this paper, we propose a fully automated reverse-engineering system, named ByCAN, to reverse engineer CAN messages. ByCAN outperforms existing research by introducing byte-level clusters and integrating multiple features at both the byte and bit levels. ByCAN employs clustering and template-matching algorithms to automatically decode the specifications of CAN frames without the need for prior knowledge. Experimental results demonstrate that ByCAN achieves high accuracy in slicing and labeling, i.e., in identifying CAN signal boundaries and labels. In the experiments, ByCAN achieves slicing accuracy of 80.21%, slicing coverage of 95.21%, and labeling accuracy of 68.72% for general labels when analyzing real-world CAN frames.
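To make the notion of "slicing" concrete, the sketch below computes a per-bit flip rate over consecutive payloads of one CAN ID and splits the payload wherever that rate drops. It is not ByCAN's algorithm: the flip-rate feature, the function names `bit_flip_rates` and `naive_slice`, the big-endian bit ordering, and the synthetic two-byte trace are all assumptions chosen for illustration. The abstract does not specify which bit-level or byte-level features ByCAN uses.

```python
from typing import List, Tuple


def bit_flip_rates(payloads: List[bytes]) -> List[float]:
    """Fraction of consecutive frame pairs in which each bit changes.

    Bit 0 is the most significant bit of byte 0 (an assumed ordering).
    """
    if len(payloads) < 2:
        raise ValueError("need at least two frames")
    n_bits = len(payloads[0]) * 8
    flips = [0] * n_bits
    for prev, curr in zip(payloads, payloads[1:]):
        # XOR marks every bit that differs between the two frames.
        diff = int.from_bytes(prev, "big") ^ int.from_bytes(curr, "big")
        for i in range(n_bits):
            if (diff >> (n_bits - 1 - i)) & 1:
                flips[i] += 1
    pairs = len(payloads) - 1
    return [count / pairs for count in flips]


def naive_slice(rates: List[float]) -> List[Tuple[int, int]]:
    """Hypothetical boundary heuristic, not taken from the paper.

    Inside a big-endian numeric signal the flip rate tends to rise from
    the most to the least significant bit, so a drop in the rate is
    treated as the start of a new signal. Bits that never change are
    treated as unused.
    """
    signals: List[Tuple[int, int]] = []
    start = None
    for i, rate in enumerate(rates):
        if rate == 0.0:
            if start is not None:
                signals.append((start, i - 1))
                start = None
            continue
        if start is None:
            start = i
        elif rate < rates[i - 1]:
            signals.append((start, i - 1))
            start = i
    if start is not None:
        signals.append((start, len(rates) - 1))
    return signals


# Synthetic trace: byte 0 is an 8-bit counter, and the low nibble of
# byte 1 is a 4-bit counter. The high nibble of byte 1 stays zero.
payloads = [bytes([i % 256, i % 16]) for i in range(256)]
rates = bit_flip_rates(payloads)
signals = naive_slice(rates)
print(signals)
```

On this synthetic trace the heuristic recovers two signals as inclusive bit ranges and leaves the constant high nibble of byte 1 unassigned. Real traces are far noisier, which is presumably why a system such as ByCAN combines multiple features with clustering and template matching rather than relying on a single rule like this one.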