SE · Dec 5, 2018

How practical is it? Machine Learning for Identifying Conceptual Interoperability Constraints in API Documents

Hadil Abukwaik, Mohammed Abufouda, Thejashree Nair, Dieter Rombach

arXiv:1812.02096v1

Originality: Synthesis-oriented

AI Analysis

This work addresses a tedious and error-prone task for software architects and analysts, though its contribution is incremental because it applies existing ML techniques to a specific domain problem.

The paper tackles the challenge of manually identifying conceptual interoperability constraints (COINs) in API documents by developing a machine learning model, achieving up to 87% accuracy in automated identification and demonstrating practical utility with practitioner acceptance.

Building meaningful interoperation with external software units requires a conceptual interoperability analysis, which starts by identifying the conceptual interoperability constraints of each software unit and then compares the systems' constraints to detect conceptual mismatches. We call these conceptual interoperability constraints COINs; they can be of different types, including structure, dynamic, and quality. Missing such constraints may lead to unexpected mismatches, expensive resolution, and delayed projects. However, it is challenging for software architects and analysts to manually analyze the unstructured text in API documents to identify the COINs. Not only is the task tedious and time-consuming, but it also requires knowledge of the constraint types. In this article, we present and evaluate our idea of using machine learning techniques to automate COIN identification, the first step of conceptual interoperability analysis, from human-written text in API documents. Our empirical research started with a multiple-case study to build the ground truth dataset, on which we built our machine learning COIN-Classification Model. We show the model's robustness through experiments using different machine learning text-classification algorithms. The experimental results revealed that our model can achieve up to 87% accuracy in automatically identifying the COINs in text. We therefore implemented a tool that embeds our model to demonstrate its practical value in an industrial context. We then evaluated practitioners' acceptance of the tool and found that they significantly agreed on its usefulness and ease of use.
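To make the classification step concrete, here is a minimal sketch of sentence-level COIN classification. It is not the authors' model: the abstract does not name the algorithms or features used, so this assumes a bag-of-words multinomial Naive Bayes classifier with add-one smoothing. The class name NaiveBayesCoinClassifier, the "not-a-coin" label, and every training sentence are hypothetical and made up for illustration; only the structure, dynamic, and quality types come from the abstract.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


class NaiveBayesCoinClassifier:
    """Hypothetical bag-of-words Naive Bayes classifier (not the paper's model)."""

    def fit(self, sentences, labels):
        self.label_counts = Counter(labels)      # sentences per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, sentence):
        tokens = tokenize(sentence)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus add-one smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented sentences standing in for a labeled ground-truth dataset.
TRAIN = [
    ("The session object contains a list of tracks", "structure"),
    ("Each playlist consists of one or more track items", "structure"),
    ("You must call login before requesting the user profile", "dynamic"),
    ("The client must authenticate before sending any request", "dynamic"),
    ("Responses are returned within two seconds under normal load", "quality"),
    ("The service guarantees high availability and encrypted transfer", "quality"),
    ("This page was last updated in March", "not-a-coin"),
    ("See the changelog for release notes", "not-a-coin"),
]

sentences, labels = zip(*TRAIN)
model = NaiveBayesCoinClassifier().fit(sentences, labels)

if __name__ == "__main__":
    print(model.predict("You must call connect before sending data"))
```

A real pipeline would train on a much larger labeled corpus and report accuracy on held-out sentences; a toy set like this only shows the shape of the task.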

