SE · Dec 6, 2016

Automated Inference of Software Library Usage Patterns

Mohamed Aymen Saied, Ali Ouni, Houari Sahraoui, Raula Gaikovina Kula, Katsuro Inoue, David Lo

arXiv:1612.01626v1 · 9.75 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the difficulty developers face in finding and exploiting reuse opportunities among third-party libraries. It is an incremental advance, since it builds on existing clustering techniques.

The paper identifies third-party library usage patterns, i.e., sets of libraries that are commonly used together, by applying hierarchical clustering to client usage data. In an evaluation on over 6,000 Maven Central libraries and over 38,000 GitHub client systems, the technique detects the majority (77%) of highly consistent and cohesive usage patterns.

Modern software systems are increasingly dependent on third-party libraries. It is widely recognized that using mature and well-tested third-party libraries can improve developers' productivity, reduce time-to-market, and produce more reliable software. Today's open-source repositories provide a wide range of libraries that can be freely downloaded and used. However, as software libraries are documented separately but intended to be used together, developers are unlikely to fully take advantage of these reuse opportunities. In this paper, we present a novel approach to automatically identify third-party library usage patterns, i.e., collections of libraries that are commonly used together by developers. Our approach employs a hierarchical clustering technique to group together software libraries based on external client usage. To evaluate our approach, we mined a large set of over 6,000 popular libraries from the Maven Central Repository and investigated their usage by over 38,000 client systems from GitHub. Our experiments show that our technique is able to detect the majority (77%) of highly consistent and cohesive library usage patterns across a considerable number of client systems.
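
To make the idea of grouping libraries by external client usage concrete, here is a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes a generic average-linkage agglomerative clustering over Jaccard distances between the sets of clients that use each library. The function names, the `max_distance` threshold, and the toy usage data are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of client identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_libraries(usage, max_distance=0.5):
    """Group libraries with average-linkage agglomerative clustering.

    usage: dict mapping a library name to the set of clients that use it.
    max_distance: hypothetical cut-off; merging stops once the two closest
        clusters are farther apart than this value.
    Returns a sorted list of clusters, each a sorted list of library names.
    """
    clusters = [[lib] for lib in sorted(usage)]

    def linkage(c1, c2):
        # Average pairwise Jaccard distance between the members of two clusters.
        total = sum(jaccard_distance(usage[x], usage[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > max_distance:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Toy data (invented for illustration): library -> clients that depend on it.
usage = {
    "junit": {"c1", "c2", "c3"},
    "mockito": {"c1", "c2", "c3"},
    "hamcrest": {"c1", "c2"},
    "slf4j": {"c4", "c5"},
    "logback": {"c4", "c5"},
    "guava": {"c6"},
}
patterns = cluster_libraries(usage)
print(patterns)
```

On this toy input the sketch groups `hamcrest`, `junit`, and `mockito` together, pairs `logback` with `slf4j`, and leaves `guava` on its own, because no other library shares its client. A real pipeline would extract the usage sets from client build files and would need a more scalable clustering implementation than this quadratic loop.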
