SE · Mar 6, 2021
On the experiences of adopting automated data validation in an industrial machine learning project
Lucy Ellen Lwakatare, Ellinor Rånge, Ivica Crnkovic et al.
Background: Data errors are a common challenge in machine learning (ML) projects and generally cause significant performance degradation in ML-enabled software systems. To ensure early detection of erroneous data and avoid training ML models using bad data, research and industrial practice suggest incorporating a data validation process and tool in the ML system development process. Aim: The study investigates the adoption of a data validation process and tool in industrial ML projects. The data validation process demands significant engineering resources for tool development and maintenance. Thus, it is important to identify the best practices for their adoption, especially by development teams that are in the early phases of deploying ML-enabled software systems. Method: Action research was conducted at a large software-intensive organization in telecommunications, specifically within the analytics R&D organization for an ML use case of classifying faults from returned hardware telecommunication devices. Results: Based on the evaluation results and learning from our action research, we identified three best practices, three benefits, and two barriers to adopting the data validation process and tool in ML projects. We also propose a data validation framework (DVF) for systematizing the adoption of a data validation process. Conclusions: The results show that adopting a data validation process and tool in ML projects is an effective approach to testing ML-enabled software systems. It requires having an overview of the level of data (feature, dataset, cross-dataset, data stream) at which certain data quality tests can be applied.
SE · Dec 1, 2020
HPM-Frame: A Decision Framework for Executing Software on Heterogeneous Platforms
Hugo Andrade, Ola Benderius, Christian Berger et al.
Heterogeneous computing is one of the most important computational solutions to meet rapidly increasing demands on system performance. It typically allows the main flow of applications to be executed on a CPU while the most computationally intensive tasks are assigned to one or more accelerators, such as GPUs and FPGAs. The refactoring of systems for execution on such platforms is highly desired but also difficult to perform, mainly due to the inherent increase in software complexity. After exploration, we have identified a current need for a systematic approach that supports engineers in the refactoring process, from CPU-centric applications to software that is executed on heterogeneous platforms. In this paper, we introduce a decision framework that assists engineers in the task of refactoring software to incorporate heterogeneous platforms. It covers the software engineering lifecycle through five steps, consisting of questions to be answered in order to successfully address aspects that are relevant for the refactoring procedure. We evaluate the feasibility of the framework in two ways. First, we capture practitioners' impressions, concerns, and suggestions through a questionnaire. Then, we conduct a case study showing the step-by-step application of the framework using a computer vision application in the automotive domain.
SE · May 18, 2020
Refactoring Software in the Automotive Domain for Execution on Heterogeneous Platforms
Hugo Andrade, Ivica Crnkovic, Jan Bosch
The most important way to achieve higher performance in computer systems is through heterogeneous computing, i.e., by adopting hardware platforms containing more than one type of processor, such as CPUs, GPUs, and FPGAs. Several types of algorithms can be executed significantly faster on a heterogeneous platform. However, migrating CPU-executable software to other types of execution platforms poses a number of challenges to software engineering. Significant effort is required in this type of migration, particularly for re-architecting and re-implementing the software. Further, optimizing it in terms of performance and other runtime properties can be very challenging, making the process complex, expensive, and error-prone. Therefore, a systematic approach based on explicit and justified architectural decisions is needed for a successful refactoring process from a homogeneous to a heterogeneous platform. In this paper, we propose a decision framework that supports engineers when refactoring software systems to accommodate heterogeneous platforms. It includes the assessment of important factors in order to minimize the risk of recurrent problems in the process. Through a set of questions, practitioners are able to formulate answers that will help in making appropriate architectural decisions to accommodate heterogeneous platforms. The contents of the framework have been developed and evolved based on discussions with architects and developers in the automotive domain.
LG · Jan 16, 2020
Engineering AI Systems: A Research Agenda
Jan Bosch, Ivica Crnkovic, Helena Holmström Olsson
Artificial intelligence (AI) and machine learning (ML) are increasingly broadly adopted in industry. However, based on well over a dozen case studies, we have learned that deploying industry-strength, production-quality ML models in systems proves to be challenging. Companies experience challenges related to data quality, design methods and processes, performance of models, as well as deployment and compliance. We learned that a new, structured engineering approach is required to construct and evolve systems that contain ML/DL components. In this paper, we provide a conceptualization of the typical evolution patterns that companies experience when employing ML as well as an overview of the key problems experienced by the companies that we have studied. The main contribution of the paper is a research agenda for AI engineering that provides an overview of the key engineering challenges surrounding ML solutions and an overview of open items that need to be addressed by the research community at large.
SE · May 5, 2019
A Review on Software Architectures for Heterogeneous Platforms
Hugo Andrade, Ivica Crnkovic
The increasing demands for computing performance have been a reality regardless of the requirements for smaller and more energy-efficient devices. Throughout the years, the strategy adopted by industry was to increase the robustness of a single processor by increasing its clock frequency and mounting more transistors so more calculations could be executed. However, it is known that the physical limits of such processors are being reached, and one way to fulfill such increasing computing demands has been to adopt a strategy based on heterogeneous computing, i.e., using a heterogeneous platform containing more than one type of processor. This way, different types of tasks can be executed by processors that are specialized in them. Heterogeneous computing, however, poses a number of challenges to software engineering, especially in the architecture and deployment phases. In this paper, we conduct an empirical study that aims at discovering the state of the art in software architecture for heterogeneous computing, with a focus on deployment. We conduct a systematic mapping study that retrieved 28 studies, which were critically assessed to obtain an overview of the research field. We identified gaps and trends that can be used by both researchers and practitioners as guides to further investigate the topic.