CR · Mar 1, 2017

Automatic Library Version Identification, an Exploration of Techniques

arXiv:1703.00298v1 · 3 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses a software analysis and security problem relevant to developers and analysts, but its contribution is incremental: it applies existing techniques to a specific domain.

The paper tackles the problem of identifying library versions in binary executables by implementing six comparison techniques in an open-source tool. It finds that readable string-based techniques perform best and that one of them correctly identifies multiple libraries in a stripped, statically linked executable.

This paper is the result of a two-month research internship on the topic of library version identification. In this paper, ideas and techniques from the literature on binary comparison and fingerprinting are outlined and applied to the problem of (version) identification of shared libraries and of libraries within statically linked binary executables. Six comparison techniques are chosen and implemented in an open-source tool, which in turn makes use of the open-source radare2 framework for signature generation. The effectiveness of the techniques is empirically analyzed by comparing both artificial and real sample files against a reference dataset of multiple versions of dozens of libraries. The results show that, of these techniques, readable string-based techniques perform best and that one of these techniques correctly identifies multiple libraries contained in a stripped statically linked executable file.
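
To make the idea of a readable string-based comparison concrete, here is a minimal sketch in Python. It is not the paper's tool and does not use radare2. The function names, the 4-byte minimum string length, and the containment score (the fraction of a reference library's strings that also appear in the sample) are all assumptions chosen for illustration. Containment is used here because a statically linked executable may hold several libraries at once, so the sample can contain many strings that belong to none of the references.

```python
import re
from typing import Dict, List, Set, Tuple

MIN_LEN = 4  # assumption: minimum run length, mirroring the Unix `strings` default


def readable_strings(data: bytes, min_len: int = MIN_LEN) -> Set[bytes]:
    """Return the set of printable-ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return set(pattern.findall(data))


def containment(reference: Set[bytes], sample: Set[bytes]) -> float:
    """Fraction of the reference strings that also occur in the sample."""
    if not reference:
        return 0.0
    return len(reference & sample) / len(reference)


def rank_versions(sample: bytes, references: Dict[str, bytes]) -> List[Tuple[str, float]]:
    """Score every reference library version against the sample, best first."""
    sample_strings = readable_strings(sample)
    scores = [
        (name, containment(readable_strings(blob), sample_strings))
        for name, blob in references.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data standing in for real library files and a real executable.
references = {
    "libfoo-1.0": b"\x00libfoo 1.0\x00foo_init\x00\x01\x02foo_old_api\x00",
    "libfoo-1.1": b"\x00libfoo 1.1\x00foo_init\x00foo_new_api\x00",
}
sample = b"\x7fELF\x00\x00libfoo 1.1\x00foo_init\x00foo_new_api\x00main_app\x00"

ranked = rank_versions(sample, references)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In this toy case the sample embeds every string of the hypothetical `libfoo-1.1` but only one of the three strings of `libfoo-1.0`, so the newer version ranks first. A real reference dataset would need some handling of strings shared by many libraries, which this sketch leaves out.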

