Daniel Mendez

SE
h-index47
28papers
182citations
Novelty24%
AI Score45

28 Papers

SEApr 22
Quo Vadis, Code Review? Exploring the Future of Code Review

Michael Dorner, Andreas Bauer, Darja Šmite et al.

Context: Code review has long been a core practice in collaborative software engineering. As automation becomes increasingly embedded in development workflows, the role and functioning of code review are subject to change. Objective: This study explores how professional developers anticipate the evolution of code review and identifies emerging tensions reflected in these expectations. Method: We conducted a cross-sectional survey with 100 developers across five software-driven companies. The survey captured estimates of current review time and reviewed artifacts, as well as anticipated changes over a five-year horizon. Open-ended questions invited reflections on the future of code review. Quantitative responses were analyzed descriptively, and open-ended responses were independently coded by multiple researchers using thematic analysis to identify recurring patterns in participant responses. Results: Practitioners expect code review to remain essential, anticipating stable or increased time investment and a broader range of reviewed artifacts over the next five years. In open-ended responses, many participants explicitly referenced AI and large language models (LLMs), describing increasing automation in both code authoring and reviewing, including scenarios in which automated systems operate in both roles. Conclusion: Our analysis suggests emerging tensions concerning understanding, accountability, and trust in automation-mediated code review. These tensions provide early empirical signals of socio-technical challenges and position code review as a concrete setting for examining the implications of LLM integration in collaborative software engineering.

SEApr 1
An Empirical Study of Generative AI Adoption in Software Engineering

Görkem Giray, Onur Demirörs, Marcos Kalinowski et al.

Context. GenAI tools are being increasingly adopted by practitioners in SE, promising support for several SE activities. Despite increasing adoption, we still lack empirical evidence on how GenAI is used in practice, the benefits it provides, the challenges it introduces, and its broader organizational and societal implications. Objective. This study aims to provide an overview of the status of GenAI adoption in SE. It investigates the status of GenAI adoption, associated benefits and challenges, institutionalization of tools and techniques, and anticipated long term impacts on SE professionals and the community. Results. The results indicate a wide adoption of GenAI tools and how they are deeply integrated into daily SE work, particularly for implementation, verification and validation, personal assistance, and maintenance-related tasks. Practitioners report substantial benefits, most notably reduction in cycle time, quality improvements, enhanced support in knowledge work, and productivity gains. However, objective measurement of productivity and quality remains limited in practice. Significant challenges persist, including incorrect or unreliable outputs, prompt engineering difficulties, validation overhead, security and privacy concerns, and risks of overreliance. Institutionalization of tools and techniques seems to be common, but it varies considerably, with a strong focus on tool access and less emphasis on training and governance. Practitioners expect GenAI to redefine rather than replace their roles, while expressing moderate concern about job market contraction and skill shifts.

SEMay 12
A Research Agenda on Agents and Software Engineering: Outcomes from the Rio A2SE Seminar

Davide Taibi, Henry Muccini, Karthik Vaidhyanathan et al.

The rise of agentic AI is reshaping software engineering in two intertwined directions: agents are increasingly applied to support software engineering tasks, and Agentic AI systems themselves are complex systems that require re-thinking currently established software engineering practices. To chart a coherent research agenda covering the two directions, we organized the A2SE seminar in Rio de Janeiro, bringing together 18 experts from academia and industry. Through structured presentations, collaborative topic clustering, and focused group discussions, participants identified six thematic areas: Governance, Software Engineering for Agents, Agents for Software Architecture, Quality and Evaluation, Sustainability, and Code, and they prioritized short-term and long-term research directions for each. This paper presents the resulting community-driven, opinionated research agenda, offering the SE community a structured foundation for coordinating efforts at this critical juncture.

SEFeb 10, 2021Code
Enterprise-Driven Open Source Software: A Case Study on Security Automation

Florian Angermeir, Markus Voggenreiter, Fabiola Moyón et al.

Agile and DevOps are widely adopted by the industry. Hence, integrating security activities with industrial practices, such as continuous integration (CI) pipelines, is necessary to detect security flaws and adhere to regulators' demands early. In this paper, we analyze automated security activities in CI pipelines of enterprise-driven open source software (OSS). This shall allow us, in the long-run, to better understand the extent to which security activities are (or should be) part of automated pipelines. In particular, we mine publicly available OSS repositories and survey a sample of project maintainers to better understand the role that security activities and their related tools play in their CI pipelines. To increase transparency and allow other researchers to replicate our study (and to take different perspectives), we further disclose our research artefacts. Our results indicate that security activities in enterprise-driven OSS projects are scarce and protection coverage is rather low. Only 6.83% of the analyzed 8,243 projects apply security automation in their CI pipelines, even though maintainers consider security to be rather important. This alerts industry to keep the focus on vulnerabilities of 3rd Party software and it opens space for other improvements of practice which we outline in this manuscript.

SEMay 20, 2024
Naming the Pain in Machine Learning-Enabled Systems Engineering

Marcos Kalinowski, Daniel Mendez, Görkem Giray et al.

Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an international survey to collect insights from practitioners on the current practices and problems in engineering ML-enabled systems. We received 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems using open and axial coding procedures. Results: Our survey results reinforce and extend existing empirical evidence on engineering ML-enabled systems, providing additional insights into typical ML-enabled systems project contexts, the perceived relevance and complexity of ML life cycle phases, and current practices related to problem understanding, model deployment, and model monitoring. Furthermore, the qualitative analysis provides a detailed map of the problems practitioners face within each ML life cycle phase and the problems causing overall project failure. Conclusions: The results contribute to a better understanding of the status quo and problems in practical environments. We advocate for the further adaptation and dissemination of software engineering practices to enhance the engineering of ML-enabled systems.

SEOct 29, 2025
Reflections on the Reproducibility of Commercial LLM Performance in Empirical Software Engineering Studies

Florian Angermeir, Maximilian Amougou, Mark Kreitz et al.

Large Language Models have gained remarkable interest in industry and academia. The increasing interest in LLMs in academia is also reflected in the number of publications on this topic over the last years. For instance, alone 78 of the around 425 publications at ICSE 2024 performed experiments with LLMs. Conducting empirical studies with LLMs remains challenging and raises questions on how to achieve reproducible results, for both researchers and practitioners. One important step towards excelling in empirical research on LLM and their application is to first understand to what extent current research results are eventually reproducible and what factors may impede reproducibility. This investigation is within the scope of our work. We contribute an analysis of the reproducibility of LLM-centric studies, provide insights into the factors impeding reproducibility, and discuss suggestions on how to improve the current state. In particular, we studied the 85 articles describing LLM-centric studies, published at ICSE 2024 and ASE 2024. Of the 85 articles, 18 provided research artefacts and used OpenAI models. We attempted to replicate those 18 studies. Of the 18 studies, only five were sufficiently complete and executable. For none of the five studies, we were able to fully reproduce the results. Two studies seemed to be partially reproducible, and three studies did not seem to be reproducible. Our results highlight not only the need for stricter research artefact evaluations but also for more robust study designs to ensure the reproducible value of future publications.

SEFeb 2, 2022
Automatic Creation of Acceptance Tests by Extracting Conditionals from Requirements: NLP Approach and Case Study

Jannik Fischbach, Julian Frattini, Andreas Vogelsang et al.

Acceptance testing is crucial to determine whether a system fulfills end-user requirements. However, the creation of acceptance tests is a laborious task entailing two major challenges: (1) practitioners need to determine the right set of test cases that fully covers a requirement, and (2) they need to create test cases manually due to insufficient tool support. Existing approaches for automatically deriving test cases require semi-formal or even formal notations of requirements, though unrestricted natural language is prevalent in practice. In this paper, we present our tool-supported approach CiRA (Conditionals in Requirements Artifacts) capable of creating the minimal set of required test cases from conditional statements in informal requirements. We demonstrate the feasibility of CiRA in a case study with three industry partners. In our study, out of 578 manually created test cases, 71.8 % can be generated automatically. Additionally, CiRA discovered 80 relevant test cases that were missed in manual test case design. CiRA is publicly available at www.cira.bth.se/demo/.

SEDec 15, 2021
Causality in Requirements Artifacts: Prevalence, Detection, and Impact

Julian Frattini, Jannik Fischbach, Daniel Mendez et al.

Background: Causal relations in natural language (NL) requirements convey strong, semantic information. Automatically extracting such causal information enables multiple use cases, such as test case generation, but it also requires to reliably detect causal relations in the first place. Currently, this is still a cumbersome task as causality in NL requirements is still barely understood and, thus, barely detectable. Objective: In our empirically informed research, we aim at better understanding the notion of causality and supporting the automatic extraction of causal relations in NL requirements. Method: In a first case study, we investigate 14.983 sentences from 53 requirements documents to understand the extent and form in which causality occurs. Second, we present and evaluate a tool-supported approach, called CiRA, for causality detection. We conclude with a second case study where we demonstrate the applicability of our tool and investigate the impact of causality on NL requirements. Results: The first case study shows that causality constitutes around 28% of all NL requirements sentences. We then demonstrate that our detection tool achieves a macro-F1 score of 82% on real-world data and that it outperforms related approaches with an average gain of 11.06% in macro-Recall and 11.43% in macro-Precision. Finally, our second case study corroborates the positive correlations of causality with features of NL requirements. Conclusion: The results strengthen our confidence in the eligibility of causal relations for downstream reuse, while our tool and publicly available data constitute a first step in the ongoing endeavors of utilizing causality in RE and beyond.

SEDec 10, 2021
Combining Design Thinking and Software Requirements Engineering to create Human-centered Software-intensive Systems

Jennifer Hehn, Daniel Mendez

Effective Requirements Engineering is a crucial activity in softwareintensive development projects. The human-centric working mode of Design Thinking is considered a powerful way to complement such activities when designing innovative systems. Research has already made great strides to illustrate the benefits of using Design Thinking for Requirements Engineering. However, it has remained mostly unclear how to actually realize a combination of both. In this chapter, we contribute an artifact-based model that integrates Design Thinking and Requirements Engineering for innovative software-intensive systems. Drawing from our research and project experiences, we suggest three strategies for tailoring and integrating Design Thinking and Requirements Engineering with complementary synergies.

SEOct 14, 2021
Only Time Will Tell: Modelling Information Diffusion in Code Review with Time-Varying Hypergraphs

Michael Dorner, Darja Šmite, Daniel Mendez et al.

Background: Modern code review is expected to facilitate knowledge sharing: All relevant information, the collective expertise, and meta-information around the code change and its context become evident, transparent, and explicit in the corresponding code review discussion. The discussion participants can leverage this information in the following code reviews; the information diffuses through the communication network that emerges from code review. Traditional time-aggregated graphs fall short in rendering information diffusion as those models ignore the temporal order of the information exchange: Information can only be passed on if it is available in the first place. Aim: This manuscript presents a novel model based on time-varying hypergraphs for rendering information diffusion that overcomes the inherent limitations of traditional, time-aggregated graph-based models. Method: In an in-silico experiment, we simulate an information diffusion within the internal code review at Microsoft and show the empirical impact of time on a key characteristic of information diffusion: the number of reachable participants. Results: Time-aggregation significantly overestimates the paths of information diffusion available in communication networks and, thus, is neither precise nor accurate for modelling and measuring the spread of information within communication networks that emerge from code review. Conclusion: Our model overcomes the inherent limitations of traditional, static or time-aggregated, graph-based communication models and sheds the first light on information diffusion through code review. We believe that our model can serve as a foundation for understanding, measuring, managing, and improving knowledge sharing in code review in particular and information diffusion in software engineering in general.

SESep 5, 2021
How Do Practitioners Interpret Conditionals in Requirements?

Jannik Fischbach, Julian Frattini, Daniel Mendez et al.

Context: Conditional statements like "If A and B then C" are core elements for describing software requirements. However, there are many ways to express such conditionals in natural language and also many ways how they can be interpreted. We hypothesize that conditional statements in requirements are a source of ambiguity, potentially affecting downstream activities such as test case generation negatively. Objective: Our goal is to understand how specific conditionals are interpreted by readers who work with requirements. Method: We conduct a descriptive survey with 104 RE practitioners and ask how they interpret 12 different conditional clauses. We map their interpretations to logical formulas written in Propositional (Temporal) Logic and discuss the implications. Results: The conditionals in our tested requirements were interpreted ambiguously. We found that practitioners disagree on whether an antecedent is only sufficient or also necessary for the consequent. Interestingly, the disagreement persists even when the system behavior is known to the practitioners. We also found that certain cue phrases are associated with specific interpretations. Conclusion: Conditionals in requirements are a source of ambiguity and there is not just one way to interpret them formally. This affects any analysis that builds upon formalized requirements (e.g., inconsistency checking, test-case generation). Our results may also influence guidelines for writing requirements.

SEAug 30, 2021
Vision for an Artefact-based Approach to Regulatory Requirements Engineering

Oleksandr Kosenkov, Michael Unterkalmsteiner, Daniel Mendez et al.

Background: Nowadays, regulatory requirements engineering (regulatory RE) faces challenges of interdisciplinary nature that cannot be tackled due to existing research gaps. Aims: We envision an approach to solve some of the challenges related to the nature and complexity of regulatory requirements, the necessity for domain knowledge, and the involvement of legal experts in regulatory RE. Method: We suggest the qualitative analysis of regulatory texts combined with the further case study to develop an empirical foundation for our research. Results: We outline our vision for the application of extended artefact-based modeling for regulatory RE. Conclusions: Empirical methodology is an essential instrument to address interdisciplinarity and complexity in regulatory RE. Artefact-based modeling supported by empirical results can solve a particular set of problems while not limiting the application of other methods and tools and facilitating the interaction between different fields of practice and research.

CLJul 21, 2021
Fine-Grained Causality Extraction From Natural Language Requirements Using Recursive Neural Tensor Networks

Jannik Fischbach, Tobias Springer, Julian Frattini et al.

[Context:] Causal relations (e.g., If A, then B) are prevalent in functional requirements. For various applications of AI4RE, e.g., the automatic derivation of suitable test cases from requirements, automatically extracting such causal statements are a basic necessity. [Problem:] We lack an approach that is able to extract causal relations from natural language requirements in fine-grained form. Specifically, existing approaches do not consider the combinatorics between causes and effects. They also do not allow to split causes and effects into more granular text fragments (e.g., variable and condition), making the extracted relations unsuitable for automatic test case derivation. [Objective & Contributions:] We address this research gap and make the following contributions: First, we present the Causality Treebank, which is the first corpus of fully labeled binary parse trees representing the composition of 1,571 causal requirements. Second, we propose a fine-grained causality extractor based on Recursive Neural Tensor Networks. Our approach is capable of recovering the composition of causal statements written in natural language and achieves a F1 score of 74 % in the evaluation on the Causality Treebank. Third, we disclose our open data sets as well as our code to foster the discourse on the automatic extraction of causality in the RE community.

SEMay 28, 2021
A Study about the Knowledge and Use of Requirements Engineering Standards in Industry

Xavier Franch, Martin Glinz, Daniel Mendez et al.

Context: The use of standards is considered a vital part of any engineering discipline. So one could expect that standards play an important role in Requirements Engineering (RE) as well. However, little is known about the actual knowledge and use of RE-related standards in industry. Objective: In this article, we investigate to which extent standards and related artifacts such as templates or guidelines are known and used by RE practitioners. Method: To this end, we have conducted a questionnaire-based online survey. We could analyze the replies from 90 RE practitioners using a combination of closed and open-text questions. Results: Our results indicate that the knowledge and use of standards and related artifacts in RE is less widespread than one might expect from an engineering perspective. For example, about 47% of the respondents working as requirements engineers or business analysts do not know the core standard in RE, ISO/IEC/IEEE 29148. Participants in our study mostly use standards by personal decision rather than being imposed by their respective company, customer, or regulator. Beyond insufficient knowledge, we also found cultural and organizational factors impeding the widespread adoption of standards in RE. Conclusions: Overall, our results provide empirically informed insights into the actual use of standards and related artifacts in RE practice and - indirectly - about the value that the current standards create for RE practitioners.

SEApr 8, 2021
The human side of Software Engineering Teams: an investigation of contemporary challenges

Marco Hoffmann, Daniel Mendez, Fabian Fagerholm et al.

There have been recent calls for research on the human side of software engineering and its impact on various factors such as productivity, developer happiness and project success. An analysis of which challenges in software engineering teams are most frequent is still missing. We aim to provide a starting point for a theory about relevant human challenges and their causes in software engineering. We establish a reusable set of challenges and start out by investigating the effect of team virtualization. Virtual teams often use digital communication and consist of members with different nationalities. We designed a survey instrument and asked respondents to assess the frequency and criticality of a set of challenges, separated in context "within teams" as well as "between teams and clients", compiled from previous empiric work, blog posts and pilot survey feedback. For the team challenges, we asked if mitigation measures were already in place. Respondents were also asked to provide information about their team setup. The survey also measured Schwartz human values. Finally, respondents were asked if there were additional challenges at their workplace. We report on the results obtained from 192 respondents. We present a set of challenges that takes the survey feedback into account and introduce two categories of challenges; "interpersonal" and "intrapersonal". We found no evidence for links between human values and challenges. We found some significant links between the number of distinct nationalities in a team and certain challenges, with less frequent and critical challenges occurring if 2-3 different nationalities were present compared to a team having members of just one nationality or more than three. A higher degree of virtualization seems to increase the frequency of some human challenges.

SEMar 9, 2021
Towards Artefact-based Requirements Engineering for Data-Centric Systems

Tatiana Chuprina, Daniel Mendez, Krzysztof Wnuk

Many modern software-intensive systems employ artificial intelligence / machine-learning (AI/ML) components and are, thus, inherently data-centric. The behaviour of such systems depends on typically large amounts of data processed at run-time rendering such non-deterministic systems as complex. This complexity growth affects our understanding on needs and practices in Requirements Engineering (RE). There is, however, still little guidance on how to handle requirements for such systems effectively: What are, for example, typical quality requirements classes? What modelling concepts do we rely on or which levels of abstraction do we need to consider? In fact, how to integrate such concepts into approaches for a more traditional RE still needs profound investigations. In this research preview paper, we report on ongoing efforts to establish an artefact-based RE approach for the development of datacentric systems (DCSs). To this end, we sketch a DCS development process with the newly proposed requirements categories and data-centric artefacts and briefly report on an ongoing investigation of current RE challenges in industry developing data-centric systems.

SEMar 3, 2021
On Understanding the Relation of Knowledge and Confidence to Requirements Quality

Razieh Dehghani, Krzysztof Wnuk, Daniel Mendez et al.

Context and Motivation: Software requirements are affected by the knowledge and confidence of software engineers. Analyzing the interrelated impact of these factors is difficult because of the challenges of assessing knowledge and confidence. Question/Problem: This research aims to draw attention to the need for considering the interrelated effects of confidence and knowledge on requirements quality, which has not been addressed by previous publications. Principal ideas/results: For this purpose, the following steps have been taken: 1) requirements quality was defined based on the instructions provided by the ISO29148:2011 standard, 2) we selected the symptoms of low qualified requirements based on ISO29148:2011, 3) we analyzed five Software Requirements Specification (SRS) documents to find these symptoms, 3) people who have prepared the documents were categorized in four classes to specify the more/less knowledge and confidence they have regarding the symptoms, and 4) finally, the relation of lack of enough knowledge and confidence to symptoms of low quality was investigated. The results revealed that the simultaneous deficiency of confidence and knowledge has more negative effects in comparison with a deficiency of knowledge or confidence. Contribution: In brief, this study has achieved these results: 1) the realization that a combined lack of knowledge and confidence has a larger effect on requirements quality than only one of the two factors, 2) the relation between low qualified requirements and requirements engineers' needs for knowledge and confidence, and 3) variety of requirements engineers' needs for knowledge based on their abilities to make discriminative and consistent decisions.

SEMar 2, 2021
Compliance Requirements in Large-Scale Software Development: An Industrial Case Study

Muhammad Usman, Michael Felderer, Michael Unterkalmsteiner et al.

Regulatory compliance is a well-studied area, including research on how to model, check, analyse, enact, and verify compliance of software. However, while the theoretical body of knowledge is vast, empirical evidence on challenges with regulatory compliance, as faced by industrial practitioners particularly in the Software Engineering domain, is still lacking. In this paper, we report on an industrial case study which aims at providing insights into common practices and challenges with checking and analysing regulatory compliance, and we discuss our insights in direct relation to the state of reported evidence. Our study is performed at Ericsson AB, a large telecommunications company, which must comply to both locally and internationally governing regulatory entities and standards such as GDPR. The main contributions of this work are empirical evidence on challenges experienced by Ericsson that complement the existing body of knowledge on regulatory compliance.

SEFeb 10, 2021
Is Secure Coding Education in the Industry Needed? An Investigation Through a Large Scale Survey

Tiago Espinha Gasiba, Ulrike Lechner, Maria Pinto-Albuquerque et al.

The Department of Homeland Security in the United States estimates that 90% of software vulnerabilities can be traced back to defects in design and software coding. The financial impact of these vulnerabilities has been shown to exceed 380 million USD in industrial control systems alone. Since software developers write software, they also introduce these vulnerabilities into the source code. However, secure coding guidelines exist to prevent software developers from writing vulnerable code. This study focuses on the human factor, the software developer, and secure coding, in particular secure coding guidelines. We want to understand the software developers' awareness and compliance to secure coding guidelines and why, if at all, they aren't compliant or aware. We base our results on a large-scale survey on secure coding guidelines, with more than 190 industrial software developers. Our work's main contribution motivates the need to educate industrial software developers on secure coding guidelines, and it gives a list of fifteen actionable items to be used by practitioners in the industry. We also make our raw data openly available for further research.

SEJan 26, 2021
Automatic Detection of Causality in Requirement Artifacts: the CiRA Approach

Jannik Fischbach, Julian Frattini, Arjen Spaans et al.

System behavior is often expressed by causal relations in requirements (e.g., If event 1, then event 2). Automatically extracting this embedded causal knowledge supports not only reasoning about requirements dependencies, but also various automated engineering tasks such as seamless derivation of test cases. However, causality extraction from natural language is still an open research challenge as existing approaches fail to extract causality with reasonable performance. We understand causality extraction from requirements as a two-step problem: First, we need to detect if requirements have causal properties or not. Second, we need to understand and extract their causal relations. At present, though, we lack knowledge about the form and complexity of causality in requirements, which is necessary to develop a suitable approach addressing these two problems. We conduct an exploratory case study with 14,983 sentences from 53 requirements documents originating from 18 different domains and shed light on the form and complexity of causality in requirements. Based on our findings, we develop a tool-supported approach for causality detection (CiRA). This constitutes a first step towards causality extraction from NL requirements. We report on a case study and the resulting tool-supported approach for causality detection in requirements. Our case study corroborates, among other things, that causality is, in fact, a widely used linguistic pattern to describe system behavior, as about a third of the analyzed sentences are causal. We further demonstrate that our tool CiRA achieves a macro-F1 score of 82 % on real word data and that it outperforms related approaches with an average gain of 11.06 % in macro-Recall and 11.43 % in macro-Precision. Finally, we disclose our open data sets as well as our tool to foster the discourse on the automatic detection of causality in the RE community.

SEJan 19, 2021
Assets in Software Engineering: What are they after all?

Ehsan Zabardast, Julian Frattini, Javier Gonzalez-Huerta et al.

During the development and maintenance of software-intensive products or services, we depend on various artefacts. Some of those artefacts, we deem central to the feasibility of a project and the product's final quality. Typically, these central artefacts are referred to as assets. However, despite their central role in the software development process, little thought is yet invested into what eventually characterises as an asset, often resulting in many terms and underlying concepts being mixed and used inconsistently. A precise terminology of assets and related concepts, such as asset degradation, are crucial for setting up a new generation of cost-effective software engineering practices. In this position paper, we critically reflect upon the notion of assets in software engineering. As a starting point, we define the terminology and concepts of assets and extend the reasoning behind them. We explore assets' characteristics and discuss what asset degradation is as well as its various types and the implications that asset degradation might bring for the planning, realisation, and evolution of software-intensive products and services over time. We aspire to contribute to a more standardised definition of assets in software engineering and foster research endeavours and their practical dissemination in a common, more unified direction.

SENov 10, 2020
How do Practitioners Perceive the Relevance of Requirements Engineering Research?

Xavier Franch, Daniel Mendez, Andreas Vogelsang et al.

The relevance of Requirements Engineering (RE) research to practitioners is vital for a long-term dissemination of research results to everyday practice. Some authors have speculated about a mismatch between research and practice in the RE discipline. However, there is not much evidence to support or refute this perception. This paper presents the results of a study aimed at gathering evidence from practitioners about their perception of the relevance of RE research and at understanding the factors that influence that perception. We conducted a questionnaire-based survey of industry practitioners with expertise in RE. The participants rated the perceived relevance of 435 scientific papers presented at five top RE-related conferences. The 153 participants provided a total of 2,164 ratings. The practitioners rated RE research as essential or worthwhile in a majority of cases. However, the percentage of non-positive ratings is still higher than we would like. Among the factors that affect the perception of relevance are the research's links to industry, the research method used, and respondents' roles. The reasons for positive perceptions were primarily related to the relevance of the problem and the soundness of the solution, while the causes for negative perceptions were more varied. The respondents also provided suggestions for future research, including topics researchers have studied for decades, like elicitation or requirement quality criteria.

SEOct 7, 2020
Empirical Standards for Software Engineering Research

Paul Ralph, Nauman bin Ali, Sebastian Baltes et al.

Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around research best practices, will improve research quality and make peer review more effective, reliable, transparent and fair.

SESep 3, 2020
What Makes Agile Test Artifacts Useful? An Activity-Based Quality Model from a Practitioners' Perspective

Jannik Fischbach, Henning Femmer, Daniel Mendez et al.

Background: The artifacts used in Agile software testing and the reasons why these artifacts are used are fairly well-understood. However, empirical research on how Agile test artifacts are eventually designed in practice and which quality factors make them useful for software testing remains sparse. Aims: Our objective is two-fold. First, we identify current challenges in using test artifacts to understand why certain quality factors are considered good or bad. Second, we build an Activity-Based Artifact Quality Model that describes what Agile test artifacts should look like. Method: We conduct an industrial survey with 18 practitioners from 12 companies operating in seven different domains. Results: Our analysis reveals nine challenges and 16 factors describing the quality of six test artifacts from the perspective of Agile testers. Interestingly, we observed mostly challenges regarding language and traceability, which are well-known to occur in non-Agile projects. Conclusions: Although Agile software testing is becoming the norm, we still have little confidence about general do's and don'ts going beyond conventional wisdom. This study is the first to distill a list of quality factors deemed important to what can be considered as useful test artifacts.

SESep 2, 2020
Understanding Peer Review of Software Engineering Papers

Neil A. Ernst, Jeffrey C. Carver, Daniel Mendez et al.

Peer review is a key activity intended to preserve the quality and integrity of scientific publications. However, in practice it is far from perfect. We aim at understanding how reviewers, including those who have won awards for reviewing, perform their reviews of software engineering papers to identify both what makes a good reviewing approach and what makes a good paper. We first conducted a series of in-person interviews with well-respected reviewers in the software engineering field. Then, we used the results of those interviews to develop a questionnaire used in an online survey and sent out to reviewers from well-respected venues covering a number of software engineering disciplines, some of whom had won awards for their reviewing efforts. We analyzed the responses from the interviews and from 175 reviewers who completed the online survey (including both reviewers who had won awards and those who had not). We report on several descriptive results, including: 45% of award-winners are reviewing 20+ conference papers a year, while 28% of non-award winners conduct that many. 88% of reviewers are taking more than two hours on journal reviews. We also report on qualitative results. To write a good review, the important criteria were it should be factual and helpful, ranked above others such as being detailed or kind. The most important features of papers that result in positive reviews are clear and supported validation, an interesting problem, and novelty. Conversely, negative reviews tend to result from papers that have a mismatch between the method and the claims and from those with overly grandiose claims. The main recommendation for authors is to make the contribution of the work very clear in their paper. In addition, reviewers viewed data availability and its consistency as being important.

SEJul 7, 2020
Data-driven Risk Management for Requirements Engineering: An Automated Approach based on Bayesian Networks

Florian Wiesweg, Andreas Vogelsang, Daniel Mendez

Requirements Engineering (RE) is a means to reduce the risk of delivering a product that does not fulfill the stakeholders' needs. Therefore, a major challenge in RE is to decide how much RE is needed and what RE methods to apply. The quality of such decisions is strongly based on the RE expert's experience and expertise in carefully analyzing the context and current state of a project. Recent work, however, shows that lack of experience and qualification are common causes for problems in RE. We trained a series of Bayesian Networks on data from the NaPiRE survey to model relationships between RE problems, their causes, and effects in projects with different contextual characteristics. These models were used to conduct (1) a postmortem (diagnostic) analysis, deriving probable causes of suboptimal RE performance, and (2) to conduct a preventive analysis, predicting probable issues a young project might encounter. The method was subject to a rigorous cross-validation procedure for both use cases before assessing

SEFeb 7, 2020
Views on Quality Requirements in Academia and Practice: Commonalities, Differences, and Context-Dependent Grey Areas

Andreas Vogelsang, Jonas Eckhardt, Daniel Mendez et al.

Context: Quality requirements (QRs) are a topic of constant discussions both in industry and academia. Debates entwine around the definition of quality requirements, the way how to handle them, or their importance for project success. While many academic endeavors contribute to the body of knowledge about QRs, practitioners may have different views. In fact, we still lack a consistent body of knowledge on QRs since much of the discussion around this topic is still dominated by observations that are strongly context-dependent. This holds for both academic and practitioners' views. Our assumption is that, in consequence, those views may differ. Objective: We report on a study to better understand the extent to which available research statements on quality requirements, as found in exemplary peer-reviewed and frequently cited publications, are reflected in the perception of practitioners. Our goal is to analyze differences, commonalities, and context-dependent grey areas in the views of academics and practitioners to allow a discussion on potential misconceptions (on either sides) and opportunities for future research. Method: We conducted a survey with 109 practitioners to assess whether they agree with research statements about QRs reflected in the literature. Based on a statistical model, we evaluate the impact of a set of context factors to the perception of research statements. Results: Our results show that a majority of the statements is well respected by practitioners; however, not all of them. When examining the different groups and backgrounds of respondents, we noticed interesting deviations of perceptions within different groups that may lead to new research questions. Conclusions: Our results help identifying prevalent context-dependent differences about how academics and practitioners view QRs and pinpointing statements where further research might be useful.

SEAug 20, 2019
On Integrating Design Thinking for a Human-centered Requirements Engineering

Jennifer Hehn, Daniel Mendez, Falk Uebernickel et al.

In this position paper, we elaborate on the possibilities and needs to integrate Design Thinking into Requirements Engineering. We draw from our research and project experiences to compare what is understood as Design Thinking and Requirements Engineering considering their involved artifacts. We suggest three approaches for tailoring and integrating Design Thinking and Requirements Engineering with complementary synergies and point at open challenges for research and practice.