Andreas Vogelsang

SE
h-index: 36
34 papers
1,250 citations
Novelty: 31%
AI Score: 47

34 Papers

HC Jan 5, 2023
On the Forces of Driver Distraction: Explainable Predictions for the Visual Demand of In-Vehicle Touchscreen Interactions

Patrick Ebel, Christoph Lingenfelder, Andreas Vogelsang

With modern infotainment systems, drivers are increasingly tempted to engage in secondary tasks while driving. Since distracted driving is already one of the main causes of fatal accidents, in-vehicle touchscreen Human-Machine Interfaces (HMIs) must distract as little as possible. To ensure that these systems are safe to use, they undergo elaborate and expensive empirical testing, requiring fully functional prototypes. Thus, early-stage methods informing designers about the implications their design may have on driver distraction are of great value. This paper presents a machine learning method that, based on anticipated usage scenarios, predicts the visual demand of in-vehicle touchscreen interactions and provides local and global explanations of the factors influencing drivers' visual attention allocation. The approach is based on large-scale natural driving data continuously collected from production line vehicles and employs the SHapley Additive exPlanation (SHAP) method to provide explanations leveraging informed design decisions. Our approach is more accurate than related work and identifies interactions during which long glances occur with 68 % accuracy and predicts the total glance duration with a mean error of 2.4 s. Our explanations replicate the results of various recent studies and provide fast and easily accessible insights into the effect of UI elements, driving automation, and vehicle speed on driver distraction. The system can not only help designers to evaluate current designs but also help them to better anticipate and understand the implications their design decisions might have on future designs.
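
To make the idea of a local, additive explanation concrete, here is a minimal sketch that computes exact Shapley values for a toy function. Everything in it is an assumption made for illustration: the feature names (`n_taps`, `speed_kmh`, `automation_on`), the coefficients, and the baseline are invented and are not taken from the paper, and the paper applies the SHAP method to a trained model rather than enumerating feature orderings as done here.

```python
from itertools import permutations
from typing import Callable, Dict

Features = Dict[str, float]


def toy_visual_demand(x: Features) -> float:
    """Hypothetical stand-in for a trained model (invented coefficients)."""
    return 0.5 + 0.2 * x["n_taps"] + 0.01 * x["speed_kmh"] - 0.3 * x["automation_on"]


def shapley_values(
    model: Callable[[Features], float],
    instance: Features,
    baseline: Features,
) -> Features:
    """Exact Shapley values by averaging marginal contributions over all orderings.

    Features not yet "switched on" keep their baseline value.
    Only feasible for a handful of features (n! orderings).
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


# Invented example interaction and an all-zero baseline.
instance = {"n_taps": 4.0, "speed_kmh": 100.0, "automation_on": 1.0}
baseline = {"n_taps": 0.0, "speed_kmh": 0.0, "automation_on": 0.0}
values = shapley_values(toy_visual_demand, instance, baseline)
```

The per-feature values sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that makes such explanations easy to read.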

SE Oct 30, 2025
A Research Roadmap for Augmenting Software Engineering Processes and Software Products with Generative AI

Domenico Amalfitano, Andreas Metzger, Marco Autili et al.

Generative AI (GenAI) is rapidly transforming software engineering (SE) practices, influencing how SE processes are executed, as well as how software systems are developed, operated, and evolved. This paper applies design science research to build a roadmap for GenAI-augmented SE. The process consists of three cycles that incrementally integrate multiple sources of evidence, including collaborative discussions from the FSE 2025 "Software Engineering 2030" workshop, rapid literature reviews, and external feedback sessions involving peers. McLuhan's tetrads were used as a conceptual instrument to systematically capture the transforming effects of GenAI on SE processes and software products. The resulting roadmap identifies four fundamental forms of GenAI augmentation in SE and systematically characterizes their related research challenges and opportunities. These insights are then consolidated into a set of future research directions. By grounding the roadmap in a rigorous multi-cycle process and cross-validating it among independent author teams and peers, the study provides a transparent and reproducible foundation for analyzing how GenAI affects SE processes, methods and tools, and for framing future research within this rapidly evolving area. Based on these findings, the article finally makes ten predictions for SE in the year 2030.

SE Mar 26
Opportunities and Limitations of GenAI in RE: Viewpoints from Practice

Anne Hess, Andreas Vogelsang, Xavier Franch et al.

Context and motivation: With the rapid advancement of AI technologies, there is an increasing need to understand how AI can be effectively integrated into RE processes. In recent years, several studies have explored the potential and challenges of applying GenAI to support or even automate RE-related activities. Question/problem: Despite the existing body of knowledge on AI's potential for supporting RE activities, there is limited evidence on its practical applicability and limitations from an industry perspective. Principal ideas/results: To address this gap, we conducted a survey with RE practitioners in collaboration with the IREB Special Interest Group on AI & RE. In addition to describing our research methodology and survey design, we present insights from our quantitative and qualitative data analyses. These insights include practitioners' perspectives on current usage scenarios, concerns, experiences (both positive and negative), as well as training needs related to using GenAI in requirements elicitation, analysis, specification, validation, and management. Contribution: This study provides empirical evidence on the practical use of GenAI in RE, offering insights into its benefits, challenges, and training needs. The findings inform future research and industry strategies, guiding effective AI integration and skill development for improved RE processes and results.

SE Apr 12
Enhancing Understandability and Transparency of Research Software: Tracing Research to Code

Adrian Bajraktari, Andreas Vogelsang

Modern research heavily relies on software. A significant challenge researchers face is understanding the complex software used in specific research fields. We target two scenarios in this context, namely long onboarding times for newcomers and conference reviewers evaluating replication packages. We hypothesize that both scenarios can be significantly improved when there is a clear link between the paper's ideas and the code that implements them. As a time- and staff-saving approach, we propose an LLM-based automation tool that takes in a paper and the software implementing the paper, and generates a trace mapping between research ideas and their locations in code. Initial experiments have shown that the tool can generate quite useful mappings.

AI Oct 3, 2025
From Facts to Foils: Designing and Evaluating Counterfactual Explanations for Smart Environments

Anna Trapp, Mersedeh Sadeghi, Andreas Vogelsang

Explainability is increasingly seen as an essential feature of rule-based smart environments. While counterfactual explanations, which describe what could have been done differently to achieve a desired outcome, are a powerful tool in eXplainable AI (XAI), no established methods exist for generating them in these rule-based domains. In this paper, we present the first formalization and implementation of counterfactual explanations tailored to this domain. It is implemented as a plugin that extends an existing explanation engine for smart environments. We conducted a user study (N=17) to evaluate our generated counterfactuals against traditional causal explanations. The results show that user preference is highly contextual: causal explanations are favored for their linguistic simplicity and in time-pressured situations, while counterfactuals are preferred for their actionable content, particularly when a user wants to resolve a problem. Our work contributes a practical framework for a new type of explanation in smart environments and provides empirical evidence to guide the choice of when each explanation type is most effective.

SE Feb 2, 2022
Automatic Creation of Acceptance Tests by Extracting Conditionals from Requirements: NLP Approach and Case Study

Jannik Fischbach, Julian Frattini, Andreas Vogelsang et al.

Acceptance testing is crucial to determine whether a system fulfills end-user requirements. However, the creation of acceptance tests is a laborious task entailing two major challenges: (1) practitioners need to determine the right set of test cases that fully covers a requirement, and (2) they need to create test cases manually due to insufficient tool support. Existing approaches for automatically deriving test cases require semi-formal or even formal notations of requirements, though unrestricted natural language is prevalent in practice. In this paper, we present our tool-supported approach CiRA (Conditionals in Requirements Artifacts) capable of creating the minimal set of required test cases from conditional statements in informal requirements. We demonstrate the feasibility of CiRA in a case study with three industry partners. In our study, out of 578 manually created test cases, 71.8 % can be generated automatically. Additionally, CiRA discovered 80 relevant test cases that were missed in manual test case design. CiRA is publicly available at www.cira.bth.se/demo/.

SE Dec 15, 2021
Causality in Requirements Artifacts: Prevalence, Detection, and Impact

Julian Frattini, Jannik Fischbach, Daniel Mendez et al.

Background: Causal relations in natural language (NL) requirements convey strong, semantic information. Automatically extracting such causal information enables multiple use cases, such as test case generation, but it also requires reliably detecting causal relations in the first place. Currently, this is still a cumbersome task as causality in NL requirements is still barely understood and, thus, barely detectable. Objective: In our empirically informed research, we aim at better understanding the notion of causality and supporting the automatic extraction of causal relations in NL requirements. Method: In a first case study, we investigate 14,983 sentences from 53 requirements documents to understand the extent and form in which causality occurs. Second, we present and evaluate a tool-supported approach, called CiRA, for causality detection. We conclude with a second case study where we demonstrate the applicability of our tool and investigate the impact of causality on NL requirements. Results: The first case study shows that causality constitutes around 28% of all NL requirements sentences. We then demonstrate that our detection tool achieves a macro-F1 score of 82% on real-world data and that it outperforms related approaches with an average gain of 11.06% in macro-Recall and 11.43% in macro-Precision. Finally, our second case study corroborates the positive correlations of causality with features of NL requirements. Conclusion: The results strengthen our confidence in the eligibility of causal relations for downstream reuse, while our tool and publicly available data constitute a first step in the ongoing endeavors of utilizing causality in RE and beyond.

SE Sep 5, 2021
How Do Practitioners Interpret Conditionals in Requirements?

Jannik Fischbach, Julian Frattini, Daniel Mendez et al.

Context: Conditional statements like "If A and B then C" are core elements for describing software requirements. However, there are many ways to express such conditionals in natural language and also many ways in which they can be interpreted. We hypothesize that conditional statements in requirements are a source of ambiguity, potentially affecting downstream activities such as test case generation negatively. Objective: Our goal is to understand how specific conditionals are interpreted by readers who work with requirements. Method: We conduct a descriptive survey with 104 RE practitioners and ask how they interpret 12 different conditional clauses. We map their interpretations to logical formulas written in Propositional (Temporal) Logic and discuss the implications. Results: The conditionals in our tested requirements were interpreted ambiguously. We found that practitioners disagree on whether an antecedent is only sufficient or also necessary for the consequent. Interestingly, the disagreement persists even when the system behavior is known to the practitioners. We also found that certain cue phrases are associated with specific interpretations. Conclusion: Conditionals in requirements are a source of ambiguity and there is not just one way to interpret them formally. This affects any analysis that builds upon formalized requirements (e.g., inconsistency checking, test-case generation). Our results may also influence guidelines for writing requirements.
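
The "sufficient only" versus "also necessary" disagreement can be illustrated with a small truth-table sketch for "If A and B then C". This is only an illustration of the two propositional readings (implication versus biconditional); the function names are made up here, and the temporal aspects mentioned in the abstract are not modeled.

```python
from itertools import product
from typing import List, Tuple


def sufficient_only(a: bool, b: bool, c: bool) -> bool:
    """Implication reading: (A and B) -> C."""
    return (not (a and b)) or c


def sufficient_and_necessary(a: bool, b: bool, c: bool) -> bool:
    """Biconditional reading: (A and B) <-> C."""
    return (a and b) == c


def disagreements() -> List[Tuple[bool, bool, bool]]:
    """Return the (A, B, C) assignments on which the two readings differ."""
    return [
        (a, b, c)
        for a, b, c in product([False, True], repeat=3)
        if sufficient_only(a, b, c) != sufficient_and_necessary(a, b, c)
    ]


if __name__ == "__main__":
    for a, b, c in disagreements():
        print(f"A={a}, B={b}, C={c}")
```

The two readings differ exactly in the cases where C occurs although "A and B" does not hold, so a test case that observes C without the antecedent passes under one reading and fails under the other.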

SE Sep 5, 2021
Semi-Automated Labeling of Requirement Datasets for Relation Extraction

Jeremias Bohn, Jannik Fischbach, Martin Schmitt et al.

Creating datasets manually by human annotators is a laborious task that can lead to biased and inhomogeneous labels. We propose a flexible, semi-automatic framework for labeling data for relation extraction. Furthermore, we provide a dataset of preprocessed sentences from the requirements engineering domain, including a set of automatically created as well as hand-crafted labels. In our case study, we compare the human and automatic labels and show that there is a substantial overlap between both annotations.

HC Aug 30, 2021
Measuring Interaction-based Secondary Task Load: A Large-Scale Approach using Real-World Driving Data

Patrick Ebel, Christoph Lingenfelder, Andreas Vogelsang

Center touchscreens are the main HMI (Human-Machine Interface) between the driver and the vehicle. They are becoming larger and increasingly complex, and they replace functions that could previously be controlled using haptic interfaces. To ensure that touchscreen HMIs can be operated safely, they are subject to strict regulations and elaborate test protocols. Those methods and user trials require fully functional prototypes and are expensive and time-consuming. Therefore, it is desirable to estimate the workload of specific interfaces or interaction sequences as early as possible in the development process. To address this problem, we envision a model-based approach that, based on the combination of user interactions and UI elements, can predict the secondary task load of the driver when interacting with the center screen. In this work, we present our current status, preliminary results, and our vision for a model-based system built upon large-scale natural driving data.

SE Aug 12, 2021
Cases for Explainable Software Systems: Characteristics and Examples

Mersedeh Sadeghi, Verena Klös, Andreas Vogelsang

The need for systems to explain behavior to users has become more evident with the rise of complex technology like machine learning or self-adaptation. In general, the need for an explanation arises when the behavior of a system does not match the user's expectations. However, there may be several reasons for a mismatch including errors, goal conflicts, or multi-agent interference. Given the various situations, we need precise and agreed descriptions of explanation needs as well as benchmarks to align research on explainable systems. In this paper, we present a taxonomy that structures needs for an explanation according to different reasons. We focus on explanations to improve the user interaction with the system. For each leaf node in the taxonomy, we provide a scenario that describes a concrete situation in which a software system should provide an explanation. These scenarios, called explanation cases, illustrate the different demands for explanations. Our taxonomy can guide the requirements elicitation for explanation capabilities of interactive intelligent systems and our explanation cases build the basis for a common benchmark. We are convinced that both the taxonomy and the explanation cases help the community to align future research on explainable systems.

HC Aug 3, 2021
Visualizing Event Sequence Data for User Behavior Evaluation of In-Vehicle Information Systems

Patrick Ebel, Christoph Lingenfelder, Andreas Vogelsang

With modern IVIS becoming more capable and complex than ever, their evaluation becomes increasingly difficult. The analysis of large amounts of user behavior data can help to cope with this complexity and can support UX experts in designing IVIS that serve customer needs and are safe to operate while driving. We, therefore, propose a Multi-level User Behavior Visualization Framework providing effective visualizations of user behavior data that is collected via telematics from production vehicles. Our approach visualizes user behavior data on three different levels: (1) The Task Level View aggregates event sequence data generated through touchscreen interactions to visualize user flows. (2) The Flow Level View allows comparing the individual flows based on a chosen metric. (3) The Sequence Level View provides detailed insights into touch interactions, glance, and driving behavior. Our case study proves that UX experts consider our approach a useful addition to their design process.

CL Aug 2, 2021
Transfer Learning for Mining Feature Requests and Bug Reports from Tweets and App Store Reviews

Pablo Restrepo Henao, Jannik Fischbach, Dominik Spies et al.

Identifying feature requests and bug reports in user comments holds great potential for development teams. However, automated mining of RE-related information from social media and app stores is challenging since (1) about 70% of user comments contain noisy, irrelevant information, (2) the amount of user comments grows daily making manual analysis unfeasible, and (3) user comments are written in different languages. Existing approaches build on traditional machine learning (ML) and deep learning (DL), but fail to detect feature requests and bug reports with high Recall and acceptable Precision which is necessary for this task. In this paper, we investigate the potential of transfer learning (TL) for the classification of user comments. Specifically, we train both monolingual and multilingual BERT models and compare the performance with state-of-the-art methods. We found that monolingual BERT models outperform existing baseline methods in the classification of English App Reviews as well as English and Italian Tweets. However, we also observed that the application of heavyweight TL models does not necessarily lead to better performance. In fact, our multilingual BERT models perform worse than traditional ML methods.

CL Jul 21, 2021
CATE: CAusality Tree Extractor from Natural Language Requirements

Noah Jadallah, Jannik Fischbach, Julian Frattini et al.

Causal relations (If A, then B) are prevalent in requirements artifacts. Automatically extracting causal relations from requirements holds great potential for various RE activities (e.g., automatic derivation of suitable test cases). However, we lack an approach capable of extracting causal relations from natural language with reasonable performance. In this paper, we present our tool CATE (CAusality Tree Extractor), which is able to parse the composition of a causal relation as a tree structure. CATE does not only provide an overview of causes and effects in a sentence, but also reveals their semantic coherence by translating the causal relation into a binary tree. We encourage fellow researchers and practitioners to use CATE at https://causalitytreeextractor.com/
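
As a rough picture of what "a causal relation as a binary tree" can look like, the sketch below encodes the sentence pattern "If A and B, then C" by hand. The node labels (`IfThen`, `And`, `Cause`, `Effect`), the example sentence, and the class layout are assumptions made for illustration; they are not CATE's actual label set or output format, and nothing here performs parsing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TreeNode:
    """Binary tree node: inner nodes carry a connective, leaves carry a role and text."""
    label: str
    text: Optional[str] = None
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def leaf_texts(node: Optional[TreeNode], label: str) -> List[str]:
    """Collect the text of all leaves with the given role label, left to right."""
    if node is None:
        return []
    if node.left is None and node.right is None:
        return [node.text] if node.label == label and node.text is not None else []
    return leaf_texts(node.left, label) + leaf_texts(node.right, label)


# Hand-built tree for: "If the user is logged in and the session is valid,
# then the dashboard is shown." (invented example sentence)
tree = TreeNode(
    label="IfThen",
    left=TreeNode(
        label="And",
        left=TreeNode(label="Cause", text="the user is logged in"),
        right=TreeNode(label="Cause", text="the session is valid"),
    ),
    right=TreeNode(label="Effect", text="the dashboard is shown"),
)

causes = leaf_texts(tree, "Cause")
effects = leaf_texts(tree, "Effect")
```

Keeping the connective on the inner node is what distinguishes "A and B" from "A or B" as joint versus alternative causes, which a flat list of causes and effects would lose.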

CL Jul 21, 2021
Fine-Grained Causality Extraction From Natural Language Requirements Using Recursive Neural Tensor Networks

Jannik Fischbach, Tobias Springer, Julian Frattini et al.

[Context:] Causal relations (e.g., If A, then B) are prevalent in functional requirements. For various applications of AI4RE, e.g., the automatic derivation of suitable test cases from requirements, automatically extracting such causal statements is a basic necessity. [Problem:] We lack an approach that is able to extract causal relations from natural language requirements in fine-grained form. Specifically, existing approaches do not consider the combinatorics between causes and effects. They also do not allow splitting causes and effects into more granular text fragments (e.g., variable and condition), making the extracted relations unsuitable for automatic test case derivation. [Objective & Contributions:] We address this research gap and make the following contributions: First, we present the Causality Treebank, which is the first corpus of fully labeled binary parse trees representing the composition of 1,571 causal requirements. Second, we propose a fine-grained causality extractor based on Recursive Neural Tensor Networks. Our approach is capable of recovering the composition of causal statements written in natural language and achieves an F1 score of 74 % in the evaluation on the Causality Treebank. Third, we disclose our open data sets as well as our code to foster the discourse on the automatic extraction of causality in the RE community.

SE Jul 12, 2021
Integrated and Iterative Requirements Analysis and Test Specification: A Case Study at Kostal

Carsten Wiecher, Jannik Fischbach, Joel Greenyer et al.

Currently, practitioners follow a top-down approach in automotive development projects. However, recent studies have shown that this top-down approach is not suitable for the implementation and testing of modern automotive systems. Specifically, practitioners increasingly fail to specify requirements and tests for systems with complex component interactions (e.g., e-mobility systems). In this paper, we address this research gap and propose an integrated and iterative scenario-based technique for the specification of requirements and test scenarios. Our idea is to combine both a top-down and a bottom-up integration strategy. For the top-down approach, we use a behavior-driven development (BDD) technique to drive the modeling of high-level system interactions from the user's perspective. For the bottom-up approach, we discovered that natural language processing (NLP) techniques are suited to make textual specifications of existing components accessible to our technique. To integrate both directions, we support the joint execution and automated analysis of system-level interactions and component-level behavior. We demonstrate the feasibility of our approach by conducting a case study at Kostal (Tier1 supplier). The case study corroborates, among other things, that our approach supports practitioners in improving requirements and test specifications for integrated system behavior.

SE Mar 11, 2021
CiRA: A Tool for the Automatic Detection of Causal Relationships in Requirements Artifacts

Jannik Fischbach, Julian Frattini, Andreas Vogelsang

Requirements often specify the expected system behavior by using causal relations (e.g., If A, then B). Automatically extracting these relations supports, among others, two prominent RE use cases: automatic test case derivation and dependency detection between requirements. However, existing tools fail to extract causality from natural language with reasonable performance. In this paper, we present our tool CiRA (Causality detection in Requirements Artifacts), which represents a first step towards automatic causality extraction from requirements. We evaluate CiRA on a publicly available data set of 61 acceptance criteria (causal: 32; non-causal: 29) describing the functionality of the German Corona-Warn-App. We achieve a macro-F1 score of 83%, which corroborates the feasibility of our approach.
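
For readers unfamiliar with the metric, the following sketch shows how a macro-averaged F1 score is computed for a two-class causal/non-causal task: F1 is calculated per class and then averaged with equal weight. The five labels and predictions are toy data invented for this example; they are not the paper's data set or results.

```python
from typing import List


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Toy labels (invented): three causal and two non-causal sentences.
y_true = ["causal", "causal", "causal", "non-causal", "non-causal"]
y_pred = ["causal", "causal", "non-causal", "non-causal", "non-causal"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally, macro-F1 does not reward a classifier for simply favoring the larger class, which matters when causal and non-causal items are not perfectly balanced.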

SE Jan 26, 2021
Automatic Detection of Causality in Requirement Artifacts: the CiRA Approach

Jannik Fischbach, Julian Frattini, Arjen Spaans et al.

System behavior is often expressed by causal relations in requirements (e.g., If event 1, then event 2). Automatically extracting this embedded causal knowledge supports not only reasoning about requirements dependencies, but also various automated engineering tasks such as seamless derivation of test cases. However, causality extraction from natural language is still an open research challenge as existing approaches fail to extract causality with reasonable performance. We understand causality extraction from requirements as a two-step problem: First, we need to detect if requirements have causal properties or not. Second, we need to understand and extract their causal relations. At present, though, we lack knowledge about the form and complexity of causality in requirements, which is necessary to develop a suitable approach addressing these two problems. We conduct an exploratory case study with 14,983 sentences from 53 requirements documents originating from 18 different domains and shed light on the form and complexity of causality in requirements. Based on our findings, we develop a tool-supported approach for causality detection (CiRA). This constitutes a first step towards causality extraction from NL requirements. We report on a case study and the resulting tool-supported approach for causality detection in requirements. Our case study corroborates, among other things, that causality is, in fact, a widely used linguistic pattern to describe system behavior, as about a third of the analyzed sentences are causal. We further demonstrate that our tool CiRA achieves a macro-F1 score of 82 % on real-world data and that it outperforms related approaches with an average gain of 11.06 % in macro-Recall and 11.43 % in macro-Precision. Finally, we disclose our open data sets as well as our tool to foster the discourse on the automatic detection of causality in the RE community.

SE Nov 10, 2020
How do Practitioners Perceive the Relevance of Requirements Engineering Research?

Xavier Franch, Daniel Mendez, Andreas Vogelsang et al.

The relevance of Requirements Engineering (RE) research to practitioners is vital for a long-term dissemination of research results to everyday practice. Some authors have speculated about a mismatch between research and practice in the RE discipline. However, there is not much evidence to support or refute this perception. This paper presents the results of a study aimed at gathering evidence from practitioners about their perception of the relevance of RE research and at understanding the factors that influence that perception. We conducted a questionnaire-based survey of industry practitioners with expertise in RE. The participants rated the perceived relevance of 435 scientific papers presented at five top RE-related conferences. The 153 participants provided a total of 2,164 ratings. The practitioners rated RE research as essential or worthwhile in a majority of cases. However, the percentage of non-positive ratings is still higher than we would like. Among the factors that affect the perception of relevance are the research's links to industry, the research method used, and respondents' roles. The reasons for positive perceptions were primarily related to the relevance of the problem and the soundness of the solution, while the causes for negative perceptions were more varied. The respondents also provided suggestions for future research, including topics researchers have studied for decades, like elicitation or requirement quality criteria.

SE Sep 3, 2020
What Makes Agile Test Artifacts Useful? An Activity-Based Quality Model from a Practitioners' Perspective

Jannik Fischbach, Henning Femmer, Daniel Mendez et al.

Background: The artifacts used in Agile software testing and the reasons why these artifacts are used are fairly well-understood. However, empirical research on how Agile test artifacts are eventually designed in practice and which quality factors make them useful for software testing remains sparse. Aims: Our objective is two-fold. First, we identify current challenges in using test artifacts to understand why certain quality factors are considered good or bad. Second, we build an Activity-Based Artifact Quality Model that describes what Agile test artifacts should look like. Method: We conduct an industrial survey with 18 practitioners from 12 companies operating in seven different domains. Results: Our analysis reveals nine challenges and 16 factors describing the quality of six test artifacts from the perspective of Agile testers. Interestingly, we observed mostly challenges regarding language and traceability, which are well-known to occur in non-Agile projects. Conclusions: Although Agile software testing is becoming the norm, we still have little confidence about general do's and don'ts going beyond conventional wisdom. This study is the first to distill a list of quality factors deemed important to what can be considered as useful test artifacts.

HC Jul 21, 2020
The Role and Potentials of Field User Interaction Data in the Automotive UX Development Lifecycle: An Industry Perspective

Patrick Ebel, Florian Brokhausen, Andreas Vogelsang

We are interested in the role of field user interaction data in the development of IVIS, the potentials practitioners see in analyzing this data, the concerns they share, and how this compares to companies with digital products. We conducted interviews with 14 UX professionals, 8 from automotive and 6 from digital companies, and analyzed the results by emergent thematic coding. Our key findings indicate that implicit feedback through field user interaction data is currently not evident in the automotive UX development process. Most decisions regarding the design of IVIS are made based on personal preferences and the intuitions of stakeholders. However, the interviewees also indicated that user interaction data has the potential to lower the influence of guesswork and assumptions in the UX design process and can help to make the UX development lifecycle more evidence-based and user-centered.

CL Jul 10, 2020
Topic Modeling on User Stories using Word Mover's Distance

Kim Julian Gülle, Nicholas Ford, Patrick Ebel et al.

Requirements elicitation has recently been complemented with crowd-based techniques, which continuously involve large, heterogeneous groups of users who express their feedback through a variety of media. Crowd-based elicitation has great potential for engaging with (potential) users early on but also results in large sets of raw and unstructured feedback. Consolidating and analyzing this feedback is a key challenge for turning it into sensible user requirements. In this paper, we focus on topic modeling as a means to identify topics within a large set of crowd-generated user stories and compare three approaches: (1) a traditional approach based on Latent Dirichlet Allocation, (2) a combination of word embeddings and principal component analysis, and (3) a combination of word embeddings and Word Mover's Distance. We evaluate the approaches on a publicly available set of 2,966 user stories written and categorized by crowd workers. We found that a combination of word embeddings and Word Mover's Distance is most promising. Depending on the word embeddings we use in our approaches, we manage to cluster the user stories in two ways: one that is closer to the original categorization and another that allows new insights into the dataset, e.g. to find potentially new categories. Unfortunately, no measure exists to rate the quality of our results objectively. Still, our findings provide a basis for future work towards analyzing crowd-sourced user stories.

SE Jul 7, 2020
Data-driven Risk Management for Requirements Engineering: An Automated Approach based on Bayesian Networks

Florian Wiesweg, Andreas Vogelsang, Daniel Mendez

Requirements Engineering (RE) is a means to reduce the risk of delivering a product that does not fulfill the stakeholders' needs. Therefore, a major challenge in RE is to decide how much RE is needed and what RE methods to apply. The quality of such decisions is strongly based on the RE expert's experience and expertise in carefully analyzing the context and current state of a project. Recent work, however, shows that lack of experience and qualification are common causes for problems in RE. We trained a series of Bayesian Networks on data from the NaPiRE survey to model relationships between RE problems, their causes, and effects in projects with different contextual characteristics. These models were used to conduct (1) a postmortem (diagnostic) analysis, deriving probable causes of suboptimal RE performance, and (2) to conduct a preventive analysis, predicting probable issues a young project might encounter. The method was subject to a rigorous cross-validation procedure for both use cases before assessing

SE Jun 29, 2020
Towards Causality Extraction from Requirements

Jannik Fischbach, Benedikt Hauptmann, Lukas Konwitschny et al.

System behavior is often based on causal relations between certain events (e.g. If event1, then event2). Consequently, those causal relations are also textually embedded in requirements. We want to extract this causal knowledge and utilize it to derive test cases automatically and to reason about dependencies between requirements. Existing NLP approaches fail to extract causality from natural language (NL) with reasonable performance. In this paper, we describe first steps towards building a new approach for causality extraction and contribute: (1) an NLP architecture based on Tree Recursive Neural Networks (TRNN) that we will train to identify causal relations in NL requirements and (2) an annotation scheme and a dataset that is suitable for training TRNNs. Our dataset contains 212,186 sentences from 463 publicly available requirement documents and is a first step towards a gold standard corpus for causality extraction. We encourage fellow researchers to contribute to our dataset and help us in finalizing the causality annotation process. Additionally, the dataset can also be annotated further to serve as a benchmark for other RE-relevant NLP tasks such as requirements classification.

LGApr 16, 2020
Destination Prediction Based on Partial Trajectory Data

Patrick Ebel, Ibrahim Emre Göl, Christoph Lingenfelder et al.

Two-thirds of the people who buy a new car prefer to use a substitute instead of the built-in navigation system. However, for many applications, knowledge about a user's intended destination and route is crucial. For example, it allows suggesting available parking spots close to the destination or facilitating ride-sharing opportunities along the route. Our approach predicts probable destinations and routes of a vehicle, based on the most recent partial trajectory and additional contextual data. The approach follows a three-step procedure: First, a $k$-d tree-based space discretization is performed, mapping GPS locations to discrete regions. Second, a recurrent neural network is trained to predict the destination based on partial sequences of trajectories. The neural network produces destination scores, signifying the probability of each region being the destination. Finally, the routes to the most probable destinations are calculated. To evaluate the method, we compare multiple neural architectures and present the experimental results of the destination prediction. The experiments are based on two public datasets of non-personalized, timestamped GPS locations of taxi trips. The best performing models were able to predict the destination of a vehicle with a mean error of 1.3 km and 1.43 km, respectively.
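The first step of the procedure, mapping GPS locations to discrete regions with a $k$-d tree, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_kd_regions` and `region_of`, the median split rule, and the `leaf_size` parameter are assumptions made for illustration only.

```python
from itertools import count
from typing import Dict, Iterator, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y), e.g. projected longitude/latitude


def build_kd_regions(points: List[Point], leaf_size: int, depth: int = 0,
                     ids: Optional[Iterator[int]] = None) -> Dict:
    """Split the points at the median, alternating axes, until each cell
    holds at most `leaf_size` points (leaf_size must be >= 1).
    Every leaf cell becomes one discrete region with its own id."""
    ids = count() if ids is None else ids
    if len(points) <= leaf_size:
        return {"region": next(ids)}
    axis = depth % 2
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "axis": axis,
        "split": ordered[mid][axis],
        "low": build_kd_regions(ordered[:mid], leaf_size, depth + 1, ids),
        "high": build_kd_regions(ordered[mid:], leaf_size, depth + 1, ids),
    }


def region_of(tree: Dict, point: Point) -> int:
    """Descend the split planes to find the region id of a location."""
    while "region" not in tree:
        side = "low" if point[tree["axis"]] < tree["split"] else "high"
        tree = tree[side]
    return tree["region"]


# Hypothetical training locations: two spatial clusters.
training = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
tree = build_kd_regions(training, leaf_size=2)

# A partial trajectory becomes a sequence of region ids, which is the kind
# of discrete input a sequence model could then consume.
trajectory = [(0.2, 0.3), (0.9, 0.1), (5.5, 6.5)]
region_sequence = [region_of(tree, p) for p in trajectory]
```

Because cells are split at the median, dense areas end up with smaller regions than sparse ones, which is the usual motivation for a $k$-d tree over a uniform grid.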

SEFeb 7, 2020
Views on Quality Requirements in Academia and Practice: Commonalities, Differences, and Context-Dependent Grey Areas

Andreas Vogelsang, Jonas Eckhardt, Daniel Mendez et al.

Context: Quality requirements (QRs) are a topic of constant discussions both in industry and academia. Debates revolve around the definition of quality requirements, how to handle them, or their importance for project success. While many academic endeavors contribute to the body of knowledge about QRs, practitioners may have different views. In fact, we still lack a consistent body of knowledge on QRs since much of the discussion around this topic is still dominated by observations that are strongly context-dependent. This holds for both academics' and practitioners' views. Our assumption is that, in consequence, those views may differ. Objective: We report on a study to better understand the extent to which available research statements on quality requirements, as found in exemplary peer-reviewed and frequently cited publications, are reflected in the perception of practitioners. Our goal is to analyze differences, commonalities, and context-dependent grey areas in the views of academics and practitioners to allow a discussion on potential misconceptions (on either side) and opportunities for future research. Method: We conducted a survey with 109 practitioners to assess whether they agree with research statements about QRs reflected in the literature. Based on a statistical model, we evaluate the impact of a set of context factors on the perception of research statements. Results: Our results show that a majority of the statements are well respected by practitioners; however, not all of them. When examining the different groups and backgrounds of respondents, we noticed interesting deviations of perceptions within different groups that may lead to new research questions. Conclusions: Our results help identify prevalent context-dependent differences in how academics and practitioners view QRs and pinpoint statements where further research might be useful.

SEFeb 7, 2020
How do Quantifiers Affect the Quality of Requirements?

Katharina Winter, Henning Femmer, Andreas Vogelsang

Context: Requirements quality can have a substantial impact on the effectiveness and efficiency of using requirements artifacts in a development process. Quantifiers such as "at least", "all", or "exactly" are common language constructs used to express requirements. Quantifiers can be formulated by affirmative phrases ("At least") or negative phrases ("Not less than"). Problem: It has long been assumed that negation in quantification negatively affects the readability of requirements; however, empirical research on these topics remains sparse. Principal Idea: In a web-based experiment with 51 participants, we compare the impact of negations and quantifiers on readability in terms of reading effort, reading error rate, and perceived reading difficulty of requirements. Results: For 5 out of 9 quantifiers, our participants performed better on the affirmative phrase compared to the negative phrase. For only one quantifier was the negative phrase more effective. Contribution: This research focuses on creating an empirical understanding of the effect of language in Requirements Engineering. It furthermore provides concrete advice on how to phrase requirements.

SEAug 22, 2019
Automated Generation of Test Models from Semi-Structured Requirements

Jannik Fischbach, Maximilian Junker, Andreas Vogelsang et al.

[Context:] Model-based testing is an instrument for automated generation of test cases. It requires identifying requirements in documents, understanding them syntactically and semantically, and then translating them into a test model. Cause-Effect-Graphs (CEG) are one light-weight language for these test models and can be used to derive test cases. [Problem:] The creation of test models is laborious and we lack an automated solution that covers the entire process from requirement detection to test model creation. In addition, the majority of requirements are expressed in natural language (NL), which is hard to translate to test models automatically. [Principal Idea:] We build on the fact that not all NL requirements are equally unstructured. We found that 14 % of the lines in requirements documents of our industry partner contain "pseudo-code"-like descriptions of business rules. We apply Machine Learning to identify such semi-structured requirements descriptions and propose a rule-based approach for their translation into CEGs. [Contribution:] We make three contributions: (1) an algorithm for the automatic detection of semi-structured requirements descriptions in documents, (2) an algorithm for the automatic translation of the identified requirements into a CEG, and (3) a study demonstrating that our proposed solution leads to 86 % time savings for test model creation without loss of quality.
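To make the idea of translating "pseudo-code"-like business rules into a Cause-Effect-Graph more concrete, here is a heavily simplified sketch. It is not the paper's algorithm: the abstract describes Machine Learning for detection, whereas this sketch substitutes a plain regular expression, and the `IF ... THEN ...` rule syntax, the `CauseEffectGraph` class, and the function names are all hypothetical. Conditions that mix AND and OR are not handled correctly.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CauseEffectGraph:
    causes: List[str]   # atomic conditions
    operator: str       # "AND", "OR", or "NONE" for a single cause
    effect: str


# Assumed rule shape: "IF <conditions> THEN <effect>" (case-insensitive).
RULE = re.compile(r"^\s*IF\s+(?P<cond>.+?)\s+THEN\s+(?P<effect>.+?)\s*$",
                  re.IGNORECASE)


def is_semi_structured(line: str) -> bool:
    """Regex stand-in for the detection step (the paper uses ML here)."""
    return RULE.match(line) is not None


def to_ceg(line: str) -> Optional[CauseEffectGraph]:
    """Translate one rule line into a flat CEG, or None if it does not match."""
    match = RULE.match(line)
    if match is None:
        return None
    cond, effect = match.group("cond"), match.group("effect")
    for op in ("AND", "OR"):
        parts = re.split(rf"\s+{op}\s+", cond, flags=re.IGNORECASE)
        if len(parts) > 1:
            return CauseEffectGraph([p.strip() for p in parts], op, effect)
    return CauseEffectGraph([cond.strip()], "NONE", effect)


ceg = to_ceg("IF age > 18 AND country = DE THEN grant access")
```

Each combination of true and false causes in such a graph corresponds to a candidate test case, which is why a CEG is a convenient intermediate model.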

AIAug 13, 2019
Towards Self-Explainable Cyber-Physical Systems

Mathias Blumreiter, Joel Greenyer, Francisco Javier Chiyah Garcia et al.

With the increasing complexity of CPSs, their behavior and decisions become increasingly difficult to understand and comprehend for users and other stakeholders. Our vision is to build self-explainable systems that can, at run-time, answer questions about the system's past, current, and future behavior. As hitherto no design methodology or reference framework exists for building such systems, we propose the MAB-EX framework for building self-explainable systems that leverage requirements- and explainability models at run-time. The basic idea of MAB-EX is to first Monitor and Analyze a certain behavior of a system, then Build an explanation from explanation models and convey this EXplanation in a suitable way to a stakeholder. We also take into account that new explanations can be learned, by updating the explanation models, should new and yet un-explainable behavior be detected by the system.
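The Monitor, Analyze, Build, EXplain cycle can be sketched as a small loop. This is only an illustration of the control flow named in the abstract, not the MAB-EX reference framework itself: the `ExplanationModel` class, the `analyze` threshold, the event fields, and the template format are all assumptions.

```python
from typing import Dict, Optional


class ExplanationModel:
    """Hypothetical explanation model: one text template per behavior."""

    def __init__(self) -> None:
        self._templates: Dict[str, str] = {}

    def learn(self, behavior: str, template: str) -> None:
        # New explanations can be added when unexplainable behavior shows up.
        self._templates[behavior] = template

    def build(self, behavior: str, event: Dict[str, float]) -> Optional[str]:
        template = self._templates.get(behavior)
        return None if template is None else template.format(**event)


def analyze(event: Dict[str, float]) -> Optional[str]:
    """Classify a monitored event; the 6.0 m/s^2 threshold is made up."""
    if event["deceleration"] > 6.0:
        return "emergency_brake"
    return None


def mab_ex_step(event: Dict[str, float], model: ExplanationModel,
                stakeholder: str) -> Optional[str]:
    behavior = analyze(event)            # Monitor + Analyze
    if behavior is None:
        return None                      # nothing noteworthy happened
    text = model.build(behavior, event)  # Build
    if text is None:
        return None                      # behavior is not yet explainable
    return f"[{stakeholder}] {text}"     # EXplain (here: a formatted message)


model = ExplanationModel()
event = {"deceleration": 7.5, "distance": 4.0}
before = mab_ex_step(event, model, "driver")
model.learn("emergency_brake",
            "Braked hard because an obstacle was {distance} m ahead.")
after = mab_ex_step(event, model, "driver")
```

In this sketch the first call yields no explanation because the model has no entry for the detected behavior; after `learn` is called, the same event can be explained.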

LGAug 13, 2019
Requirements Engineering for Machine Learning: Perspectives from Data Scientists

Andreas Vogelsang, Markus Borg

Machine learning (ML) is used increasingly in real-world applications. In this paper, we describe our ongoing endeavor to define characteristics and challenges unique to Requirements Engineering (RE) for ML-based systems. As a first step, we interviewed four data scientists to understand how ML experts approach elicitation, specification, and assurance of requirements and expectations. The results show that changes in the development paradigm, i.e., from coding to training, also demand changes in RE. We conclude that development of ML systems requires requirements engineers to: (1) understand ML performance measures to state good functional requirements, (2) be aware of new quality requirements such as explainability, freedom from discrimination, or specific legal requirements, and (3) integrate ML specifics in the RE process. Our study provides a first contribution towards an RE methodology for ML systems.

SEFeb 25, 2019
Microservice Architectures for Advanced Driver Assistance Systems: A Case-Study

Jannik Lotz, Andreas Vogelsang, Ola Benderius et al.

The technological advancements of recent years have steadily increased the complexity of vehicle-internal software systems, and the ongoing development towards autonomous driving will further aggravate this situation. This is leading to a level of complexity that is pushing the limits of existing vehicle software architectures and system designs. By changing the software structure to a service-based architecture, companies in other domains successfully managed the rising complexity and created a more agile and future-oriented development process. This paper presents a case-study investigating the feasibility and possible effects of changing the software architecture for a complex driver assistance function to a microservice architecture. The complete procedure is described, starting with the description of the software environment and the corresponding requirements, followed by the implementation and the final testing. In addition, this paper provides a high-level evaluation of the microservice architecture for the automotive use-case. The results show that microservice architectures can reduce complexity and time-consuming process steps and prepare automotive software systems for upcoming challenges, as long as the principles of microservice architectures are carefully followed.

SESep 1, 2017
Should I Stay or Should I Go? On Forces that Drive and Prevent MBSE Adoption in the Embedded Systems Industry

Andreas Vogelsang, Tiago Amorim, Florian Pudlitz et al.

[Context] Model-based Systems Engineering (MBSE) comprises a set of models and techniques that is often suggested as a solution to cope with the challenges of engineering complex systems. Although many practitioners agree with the arguments on the potential benefits of the techniques, companies struggle with the adoption of MBSE. [Goal] In this paper, we investigate the forces that prevent or impede the adoption of MBSE in companies that develop embedded software systems. We contrast the hindering forces with issues and challenges that drive these companies towards introducing MBSE. [Method] Our results are based on 20 interviews with experts from 10 companies. Through exploratory research, we analyze the results by means of thematic coding. [Results] Forces that prevent MBSE adoption mainly relate to immature tooling, uncertainty about the return-on-investment, and fears about migrating existing data and processes. On the other hand, MBSE adoption also has strong drivers, and participants have high expectations mainly with respect to managing complexity, adhering to new regulations, and reducing costs. [Conclusions] We conclude that bad experiences and frustration about MBSE adoption originate from false or too high expectations. Nevertheless, companies should not underestimate the necessary efforts for convincing employees and addressing their anxiety.

SEAug 29, 2017
Why feature dependencies challenge the requirements engineering of automotive systems: An empirical study

Andreas Vogelsang, Steffen Fuhrmann

Functional dependencies and feature interactions in automotive software systems are a major source of erroneous and deficient behavior. To overcome these problems, many approaches exist that focus on modeling these functional dependencies in early stages of system design. However, there are only a few empirical studies that report on the extent of such dependencies in industrial software systems and how they are considered in an industrial development context. In this paper, we analyze the functional architecture of a real automotive software system with the aim of assessing the extent, awareness, and importance of interactions between features of a future vehicle. Our results show that within the functional architecture at least 85% of the analyzed vehicle features depend on each other. They furthermore show that the developers are not aware of a large number of these dependencies when they are modeled solely on an architectural level. Therefore, the developers mention the need for a more precise specification of feature interactions, e.g., for the execution of comprehensive impact analyses. These results challenge the current development methods and emphasize the need for an extensive modeling of features and their dependencies in requirements engineering.

SEFeb 24, 2017
How to specify Non-functional Requirements to support seamless modeling? A Study Design and Preliminary Results

Jonas Eckhardt, Daniel Méndez Fernández, Andreas Vogelsang

Context: Seamless model-based development provides integrated chains of models, covering all software engineering phases. Non-functional requirements (NFRs), like reusability, further play a vital role in software and systems engineering, but are often neglected in research and practice. It is still unclear how to integrate NFRs in a seamless model-based development. Goal: Our long-term goal is to develop a theory on the specification of NFRs such that they can be integrated in seamless model-based development. Method: Our overall study design includes a multi-staged procedure to infer an empirically founded theory on specifying NFRs to support seamless modeling. In this short paper, we present the study design and provide a discussion of (i) preliminary results obtained from a sample, and (ii) current issues related to the design. Results: Our study already shows significant areas for improvement, e.g., the low agreement during the classification. However, the results point to interesting findings; for example, many of the commonly used NFR classes concern system modeling concepts in a way that shows how blurry the borders between functional requirements and NFRs are. Conclusions: We conclude so far that our overall study design seems suitable to obtain the envisioned theory in the long run, but we could also show current issues that are worth discussing within the empirical software engineering community. The main goal of this contribution is not only to present and discuss current results, but also to foster discussions on the issues related to the integration of NFRs in seamless modeling in general and, in particular, discussions on open methodological issues.