Guenther Ruhe

SE
16papers
440citations
Novelty30%
AI Score21

16 Papers

SESep 28, 2021
How Much Data Analytics is Enough? The ROI of Machine Learning Classification and its Application to Requirements Dependency Classification

Gouri Deshpande, Guenther Ruhe, Chad Saunders

Machine Learning (ML) can substantially improve the efficiency and effectiveness of organizations and is widely used for different purposes within Software Engineering. However, the selection and implementation of ML techniques rely almost exclusively on accuracy criteria. Thus, for organizations wishing to realize the benefits of ML investments, this narrow approach ignores crucial considerations around the anticipated costs of the ML activities across the ML lifecycle, while failing to account for the benefits that are likely to accrue from the proposed activity. We present findings for an approach that addresses this gap by enhancing the accuracy criterion with return on investment (ROI) considerations. Specifically, we analyze the performance of the two state-of-the-art ML techniques: Random Forest and Bidirectional Encoder Representations from Transformers (BERT), based on accuracy and ROI for two publicly available data sets. Specifically, we compare decision-making on requirements dependency extraction (i) exclusively based on accuracy and (ii) extended to include ROI analysis. As a result, we propose recommendations for selecting ML classification techniques based on the degree of training data used. Our findings indicate that considering ROI as additional criteria can drastically influence ML selection when compared to decisions based on accuracy as the sole criterion

SEJul 5, 2021
An Evolutionary Algorithm for Task Scheduling in Crowdsourced Software Development

Razieh Saremi, Hardik Yagnik, Julian Togelius et al.

The complexity of software tasks and the uncertainty of crowd developer behaviors make it challenging to plan crowdsourced software development (CSD) projects. In a competitive crowdsourcing marketplace, competition for shared worker resources from multiple simultaneously open tasks adds another layer of uncertainty to the potential outcomes of software crowdsourcing. These factors lead to the need for supporting CSD managers with automated scheduling to improve the visibility and predictability of crowdsourcing processes and outcomes. To that end, this paper proposes an evolutionary algorithm-based task scheduling method for crowdsourced software development. The proposed evolutionary scheduling method uses a multiobjective genetic algorithm to recommend an optimal task start date. The method uses three fitness functions, based on project duration, task similarity, and task failure prediction, respectively. The task failure fitness function uses a neural network to predict the probability of task failure with respect to a specific task start date. The proposed method then recommends the best tasks start dates for the project as a whole and each individual task so as to achieve the lowest project failure ratio. Experimental results on 4 projects demonstrate that the proposed method has the potential to reduce project duration by a factor of 33-78%.

SEMar 17, 2021
CrowdSim: A Hybrid Simulation Model for Failure Prediction in Crowdsourced Software Development

Razieh Saremi, Ye Yang, Gregg Vesonder et al.

A typical crowdsourcing software development(CSD) marketplace consists of a list of software tasks as service demands and a pool of freelancer developers as service suppliers. Highly dynamic and competitive CSD market places may result in task failure due to unforeseen risks, such as increased competition over shared worker supply, or uncertainty associated with workers' experience and skills, and so on. To improve CSD effectiveness, it is essential to better understand and plan with respect to dynamic worker characteristics and risks associated with CSD processes. In this paper, we present a hybrid simulation model, CrowdSim, to forecast crowdsourcing task failure risk in competitive CSD platforms. CrowdSim is composed of three layered components: the macro-level reflects the overall crowdsourcing platform based on system dynamics,the meso-level represents the task life cycle based on discrete event simulation, and the micro-level models the crowd workers' decision-making processes based on agent-based simulation. CrowdSim is evaluated through three CSD decision scenarios to demonstrate its effectiveness, using a real-world historical dataset and the results demonstrate CrowdSim's potential in empowering crowdsourcing managers to explore crowdsourcing outcomes with respect to different task scheduling options.

SEOct 31, 2020
Stakeholder identification for a structured release planning approach in the automotive domain

Kristina Marner, Stefan Wagner, Guenther Ruhe

Context: In regulated domains like automotive, release planning is a complex process. The agreement between traditional product development processes for hardware as well as mechanic systems and agile development approaches for software development is a major challenge. Especially the creation and synchronization of a release plan is challenging. Objective: The aim of this work is to present identified stakeholders of a release plan as an appropriate approach to create transparency in release planning in the automotive domain. Method: Action research to elaborate relevant stakeholders for release planning was conducted at Dr. Ing. h. c. F. Porsche AG. Results: We present a detailed overview of identified stakeholders due to release planning as well as their required content and added value regarding to two pilot projects. The results confirm the fact that almost every stakeholder is involved in a release plan in a certain way. Conclusions: Release planning within a complex project environment and complicated customer constellations is difficult to manage. We discuss how the presented stakeholders could meet with the given conditions in the automotive domain. With this contribution, identified stakeholders of release planning from hardware and software point of view is introduced. It helps to reach transparency and to handle the given complexity.

LGSep 14, 2020
Beyond Accuracy: ROI-driven Data Analytics of Empirical Data

Gouri Deshpande, Guenther Ruhe

This vision paper demonstrates that it is crucial to consider Return-on-Investment (ROI) when performing Data Analytics. Decisions on "How much analytics is needed"? are hard to answer. ROI could guide for decision support on the What?, How?, and How Much? analytics for a given problem. Method: The proposed conceptual framework is validated through two empirical studies that focus on requirements dependencies extraction in the Mozilla Firefox project. The two case studies are (i) Evaluation of fine-tuned BERT against Naive Bayes and Random Forest machine learners for binary dependency classification and (ii) Active Learning against passive Learning (random sampling) for REQUIRES dependency extraction. For both the cases, their analysis investment (cost) is estimated, and the achievable benefit from DA is predicted, to determine a break-even point of the investigation. Results: For the first study, fine-tuned BERT performed superior to the Random Forest, provided that more than 40% of training data is available. For the second, Active Learning achieved higher F1 accuracy within fewer iterations and higher ROI compared to Baseline (Random sampling based RF classifier). In both the studies, estimate on, How much analysis likely would pay off for the invested efforts?, was indicated by the break-even point. Conclusions: Decisions for the depth and breadth of DA of empirical data should not be made solely based on the accuracy measures. Since ROI-driven Data Analytics provides a simple yet effective direction to discover when to stop further investigation while considering the cost and value of the various types of analysis, it helps to avoid over-analyzing empirical data.

SEDec 4, 2019
Optimization in Software Engineering -- A Pragmatic Approach

Guenther Ruhe

Empirical software engineering is concerned with the design and analysis of empirical studies that include software products, processes, and resources. Optimization is a form of data analytics in support of human decision-making. Optimization methods are aimed to find the best decision alternatives. Empirical studies serve both as a model and as data input for optimization. In addition, the complexity of the models used for optimization trigger further studies on explaining and validating the results in real-world scenarios. The goal of this chapter is to give an overview of the as-is and of the to-be usage of optimization in software engineering. The emphasis is on pragmatic use of optimization, and not so much on describing the most recent algorithmic innovations and tool developments. The usage of optimization covers a wide range of questions from different types of software engineering problems along the whole life-cycle. To facilitate its more comprehensive and more effective usage, a checklist for a guided process is described. The chapter uses a running example Asymmetric Release Planning to illustrate the whole process. A Return-on-Investment analysis is proposed as part of the problem scoping. This helps to decide on the depth and breadth of analysis in relation to the effort needed to run the analysis and the projected value of the solution.

SEOct 20, 2019
Release Practices for Mobile Apps--What do Users and Developers Think?

Maleknaz Nayebi, Bram Adams, Guenther Ruhe

Large software organizations such as Facebook or Netflix, who otherwise make daily or even hourly releases of their web applications using continuous delivery, have had to invest heavily into a customized release strategy for their mobile apps, because the vetting process of app stores introduces lag and uncertainty into the release process. Amidst these large, resourceful organizations, it is unknown how the average mobile app developer organizes her app's releases, even though an incorrect strategy might bring a premature app update to the market that drives away customers towards the heavy market competition. To understand the common release strategies used for mobile apps, the rationale behind them and their perceived impact on users, we performed two surveys with users and developers. We found that half of the developers have a clear strategy for their mobile app releases, since especially the more experienced developers believe that it affects user feedback. We also found that users are aware of new app updates, yet only half of the surveyed users enables automatic updating of apps. While the release date and frequency is not a decisive factor to install an app, users prefer to install apps that were updated more recently and less frequently. Our study suggests that an app's release strategy is a factor that affects the ongoing success of mobile apps.

SEJan 17, 2019
Mining Treatment-Outcome Constructs from Sequential Software Engineering Data

Maleknaz Nayebi, Guenther Ruhe, Thomas Zimmermann

Many investigations in empirical software engineering look at sequences of data resulting from development or management processes. In this paper, we propose an analytical approach called the Gandhi-Washington Method (GWM) to investigate the impact of recurring events in software projects. GWM takes an encoding of events and activities provided by a software analyst as input. It uses regular expressions to automatically condense and summarize information and infer treatments. Relating the treatments to the outcome through statistical tests, treatment-outcome constructs are automatically mined from the data. The output of GWM is a set of treatment-outcome constructs. Each treatment in the set of mined constructs is significantly different from the other treatments considering the impact on the outcome and/or is structurally different from other treatments considering the sequence of events. We describe GWM and classes of problems to which GWM can be applied. We demonstrate the applicability of this method for empirical studies on sequences of file editing, code ownership, and release cycle time.

SEJan 16, 2019
Asymmetric Release Planning-Compromising Satisfaction against Dissatisfaction

Maleknaz Nayebi, Guenther Ruhe

Maximizing satisfaction from offering features as part of the upcoming release(s) is different from minimizing dissatisfaction gained from not offering features. This asymmetric behavior has never been utilized for product release planning. We study Asymmetric Release Planning (ARP) by accommodating asymmetric feature evaluation. We formulated and solved ARP as a bi-criteria optimization problem. In its essence, it is the search for optimized trade-offs between maximum stakeholder satisfaction and minimum dissatisfaction. Different techniques including a continuous variant of Kano analysis are available to predict the impact on satisfaction and dissatisfaction with a product release from offering or not offering a feature. As a proof of concept, we validated the proposed solution approach called Satisfaction-Dissatisfaction Optimizer (SDO) via a real-world case study project. From running three replications with varying effort capacities, we demonstrate that SDO generates optimized trade-off solutions being (i) of a different value profile and different structure, (ii) superior to the application of random search and heuristics in terms of quality and completeness, and (iii) superior to the usage of manually generated solutions generated from managers of the case study company. A survey with 20 stakeholders evaluated the applicability and usefulness of the generated results.

SENov 30, 2018
A Longitudinal Study of Identifying and Paying Down Architectural Debt

Maleknaz Nayebi, Yuanfang Cai, Rick Kazman et al.

Architectural debt is a form of technical debt that derives from the gap between the architectural design of the system as it "should be" compared to "as it is". We measured architecture debt in two ways: 1) in terms of system-wide coupling measures, and 2) in terms of the number and severity of architectural flaws. In recent work it was shown that the amount of architectural debt has a huge impact on software maintainability and evolution. Consequently, detecting and reducing the debt is expected to make software more amenable to change. This paper reports on a longitudinal study of a healthcare communications product created by Brightsquid Secure Communications Corp. This start-up company is facing the typical trade-off problem of desiring responsiveness to change requests, but wanting to avoid the ever-increasing effort that the accumulation of quick-and-dirty changes eventually incurs. In the first stage of the study, we analyzed the status of the "before" system, which indicated the impacts of change requests. This initial study motivated a more in-depth analysis of architectural debt. The results of this analysis were used to motivate a comprehensive refactoring of the software system. The third phase of the study was a follow-on architectural debt analysis which quantified the improvements made. Using this quantitative evidence, augmented by qualitative evidence gathered from in-depth interviews with Brightsquid's architects, we present lessons learned about the costs and benefits of paying down architecture debt in practice.

SEAug 11, 2018
ESSMArT Way to Manage User Requests

Maleknaz Nayebi, Liam Dicke, Ron Ittyipe et al.

Quality and market acceptance of software products is strongly influenced by responsiveness to user requests. Once a request is received from a customer, decisions need to be made if the request should be escalated to the development team. Once escalated, the ticket must be formulated as a development task and be assigned to a developer. To make the process more efficient and reduce the time between receiving and escalating the user request, we aim to automate of the complete user request management process. We propose a holistic method called ESSMArT. The methods performs text summarization, predicts ticket escalation, creates the title and content of the ticket used by developers, and assigns the ticket to an available developer. We internally evaluated the method by 4,114 user tickets from Brightsquid and their secure health care communication plat- form Secure-Mail. We also perform an external evaluation on the usefulness of the approach. We found that supervised learning based on context specific data performs best for extractive summarization. For predicting escalation of tickets, Random Forest trained on a combination of conversation and extractive summarization is best with highest precision (of 0.9) and recall (of 0.55). From external evaluation we found that ESSMArT provides suggestions that are 71% aligned with human ones. Applying the prototype implementation to 315 user requests resulted in an average time reduction of 9.2 minutes per request. ESSMArT helps to make ticket management faster and with reduced effort for human experts. ESSMArT can help Brightsquid to (i) minimize the impact of staff turnover and (ii) shorten the cycle from an issue being reported to an assignment to a developer to fix it.

SEMay 21, 2018
Status Quo in Requirements Engineering: A Theory and a Global Family of Surveys

Stefan Wagner, Daniel Méndez Fernández, Michael Felderer et al.

Requirements Engineering (RE) has established itself as a software engineering discipline during the past decades. While researchers have been investigating the RE discipline with a plethora of empirical studies, attempts to systematically derive an empirically-based theory in context of the RE discipline have just recently been started. However, such a theory is needed if we are to define and motivate guidance in performing high quality RE research and practice. We aim at providing an empirical and valid foundation for a theory of RE, which helps software engineers establish effective and efficient RE processes. We designed a survey instrument and theory that has now been replicated in 10 countries world-wide. We evaluate the propositions of the theory with bootstrapped confidence intervals and derive potential explanations for the propositions. We report on the underlying theory and the full results obtained from the replication studies with participants from 228 organisations. Our results represent a substantial step forward towards developing an empirically-based theory of RE giving insights into current practices with RE processes. The results reveal, for example, that there are no strong differences between organisations in different countries and regions, that interviews, facilitated meetings and prototyping are the most used elicitation techniques, that requirements are often documented textually, that traces between requirements and code or design documents is common, requirements specifications themselves are rarely changed and that requirements engineering (process) improvement endeavours are mostly intrinsically motivated. Our study establishes a theory that can be used as starting point for many further studies for more detailed investigations. Practitioners can use the results as theory-supported guidance on selecting suitable RE methods and techniques.

SEJul 7, 2017
What Works Better? A Study of Classifying Requirements

Zahra Shakeri Hossein Abad, Oliver Karras, Parisa Ghazi et al.

Classifying requirements into functional requirements (FR) and non-functional ones (NFR) is an important task in requirements engineering. However, automated classification of requirements written in natural language is not straightforward, due to the variability of natural language and the absence of a controlled vocabulary. This paper investigates how automated classification of requirements into FR and NFR can be improved and how well several machine learning approaches work in this context. We contribute an approach for preprocessing requirements that standardizes and normalizes requirements before applying classification algorithms. Further, we report on how well several existing machine learning methods perform for automated classification of NFRs into sub-categories such as usability, availability, or performance. Our study is performed on 625 requirements provided by the OpenScience tera-PROMISE repository. We found that our preprocessing improved the performance of an existing classification method. We further found significant differences in the performance of approaches such as Latent Dirichlet Allocation, Biterm Topic Modeling, or Naive Bayes for the sub-classification of NFRs.

SEJul 6, 2017
A Visual Narrative Path from Switching to Resuming a Requirements Engineering Task

Zahra Shakeri Hossein Abad, Alex Shymka, Jenny Le et al.

Requirements Engineering (RE) is closely tied to other development activities and is at the heart and foundation of every software development process. This makes RE the most data and communication-intensive activity compared to other development tasks. The highly demanding communication makes task switching and interruptions inevitable in RE activities. While task switching often allows us to perform tasks effectively, it imposes a cognitive load and can be detrimental to the primary task, particularly in complex tasks as the ones typical for RE activities. Visualization mechanisms enhanced with analytical methods and interaction techniques help software developers obtain a better cognitive understanding of the complexity of RE decisions, leading to timelier and higher quality decisions. In this paper, we propose to apply interactive visual analytics techniques for managing requirements decisions from various perspectives, including stakeholders communication, RE task switching, and interruptions. We propose a new layered visualization framework that supports the analytical reasoning process of task switching. This framework consists of both data analysis and visualization layers. The visual layers offer interactive knowledge visualization components for managing task interruption decisions at different stages of an interruption (i.e. before, during, and after). The analytical layers provide narrative knowledge about the consequences of task switching decisions and help requirements engineers to recall their reasoning process and decisions upon resuming a task. Moreover, we surveyed 53 software developers to test our visual prototype and to explore more required features for the visual and analytical layers of our framework.

SEJul 4, 2017
Task Interruptions in Requirements Engineering: Reality versus Perceptions!

Zahra Shakeri Hossein Abad, Guenther Ruhe, Mike Bauer

Task switching and interruptions are a daily reality in software development projects: developers switch between Requirements Engineering (RE), coding, testing, daily meetings, and other tasks. Task switching may increase productivity through increased information flow and effective time management. However, it might also cause a cognitive load to reorient the primary task, which accounts for the decrease in developers' productivity and increases in errors. This cognitive load is even greater in cases of cognitively demanding tasks as the ones typical for RE activities. In this paper, to compare the reality of task switching in RE with the perception of developers, we conducted two studies: (i) a case study analysis on 5,076 recorded tasks of 19 developers and (ii) a survey of 25 developers. The results of our retrospective analysis show that in ALL of the cases that the disruptiveness of RE interruptions is statistically different from other software development tasks, RE related tasks are more vulnerable to interruptions compared to other task types. Moreover, we found that context switching, the priority of the interrupting task, and the interruption source and timing are key factors that impact RE interruptions. We also provided a set of RE task switching patterns along with recommendations for both practitioners and researchers. While the results of our retrospective analysis show that self-interruptions are more disruptive than external interruptions, developers have different perceptions about the disruptiveness of various sources of interruptions.

SEOct 13, 2016
Decision Support for Increasing the Efficiency of Crowdsourced Software Development

Muhammad Rezaul Karim, David Messinger, Ye Yang et al.

Crowdsourced software development (CSD) offers a series of specified tasks to a large crowd of trustworthy software workers. Topcoder is a leading platform to manage the whole process of CSD. While increasingly accepted as a realistic option for software development, preliminary analysis on Topcoder's software crowd worker behaviors reveals an alarming task-quitting rate of 82.9%. In addition, a substantial number of tasks do not receive any successful submission. In this paper, we report about a methodology to improve the efficiency of CSD. We apply massive data analytics and machine leaning to (i) perform comparative analysis on alternative technique analysis to predict likelihood of winners and quitters for each task, (ii) significantly reduce the amount of non-succeeding development effort in registered but inappropriate tasks, (iii) identify and rank the most qualified registered workers for each task, and (iv) provide reliable prediction of tasks risky to get any successful submission. Our results and analysis show that Random Forest (RF) based predictive technique performs best among the alternative techniques studied. Applying RF, the tasks recommended to workers can reduce the amount of non-succeeding development effort to a great extent. On average, over a period of 30 days, the savings are 3.5 and 4.6 person-days per registered tasks for experienced resp. unexperienced workers. For the task-related recommendations of workers, we can accurately recommend at least 1 actual winner in the top ranked workers, particularly 94.07% of the time among the top-2 recommended workers for each task. Finally, we can predict, with more than 80% F-measure, the tasks likely not getting any submission, thus triggering timely corrective actions from CSD platforms or task requesters.