SE, CR, LG, PL · May 7, 2021

Code2Image: Intelligent Code Analysis by Computer Vision Techniques and Application to Vulnerability Prediction

arXiv:2105.03131v1 · 8 citations · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses intelligent code analysis for software developers and security researchers by proposing a new way to represent source code. The advance is incremental, since it builds on existing computer vision techniques.

The paper tackles the challenge of representing source code for machine learning by converting it into images that preserve semantic and syntactic properties, so that computer vision techniques can be applied directly. It demonstrates the approach through vulnerability prediction on a public dataset and compares its performance against state-of-the-art solutions.

Intelligent code analysis has received increasing attention in parallel with the remarkable advances in the field of machine learning (ML) in recent years. A major challenge in leveraging ML for this purpose is to represent source code in a useful form that ML algorithms can accept as input. In this study, we present a novel method to represent source code as an image while preserving its semantic and syntactic properties, which paves the way for applying computer vision techniques to code analysis. The resulting image representation can be fed directly into deep learning (DL) algorithms as input, without any further data pre-processing or feature extraction step. We demonstrate the feasibility and effectiveness of our method through a vulnerability prediction use case on a public dataset containing a large number of real-world source code samples, and we evaluate its performance against state-of-the-art solutions. Our implementation is publicly available.
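To make the idea of a code-as-image representation concrete, here is a minimal sketch. It is not the paper's encoding, which the abstract does not specify. As an assumption for illustration only, it colors each pixel by the lexical category of the Python token occupying that row and column, so layout and token kinds survive in the image. The names `PALETTE`, `categorize`, and `code_to_image`, the colors, and the fixed canvas size are all hypothetical.

```python
import io
import keyword
import tokenize

# Hypothetical palette (an assumption, not taken from the paper):
# lexical category -> RGB colour.
PALETTE = {
    "keyword": (255, 0, 0),
    "name": (0, 255, 0),
    "number": (0, 0, 255),
    "string": (255, 255, 0),
    "op": (255, 0, 255),
}
BACKGROUND = (0, 0, 0)


def categorize(tok):
    """Map a token to a palette key, or None if it is not drawn."""
    if tok.type == tokenize.NAME:
        return "keyword" if keyword.iskeyword(tok.string) else "name"
    if tok.type == tokenize.NUMBER:
        return "number"
    if tok.type == tokenize.STRING:
        return "string"
    if tok.type == tokenize.OP:
        return "op"
    return None  # whitespace, comments, indentation markers, etc.


def code_to_image(source, width=32, height=8):
    """Return a height x width grid of RGB tuples for Python source text."""
    image = [[BACKGROUND] * width for _ in range(height)]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        category = categorize(tok)
        if category is None:
            continue
        (row, col), (end_row, end_col) = tok.start, tok.end
        # Skip multi-line tokens and anything below the canvas.
        if row != end_row or row - 1 >= height:
            continue
        # Paint one pixel per source character, clipped to the canvas width.
        for c in range(col, min(end_col, width)):
            image[row - 1][c] = PALETTE[category]
    return image


sample = "def f(x):\n    return x + 1\n"
image = code_to_image(sample)
```

A grid like this could be converted to an array and passed to an image classifier; whether such a naive encoding retains enough semantic information for vulnerability prediction is exactly the question the paper's own method is designed to address.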

Code Implementations: 1 repo
