CL Sep 21, 2023
Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian, Elahe Khatibi, Iman Azimi et al.
Generative Artificial Intelligence is set to revolutionize healthcare delivery by transforming traditional patient care into a more personalized, efficient, and proactive process. Chatbots, serving as interactive conversational models, will probably drive this patient-centered transformation in healthcare. Through the provision of various services, including diagnosis, personalized lifestyle recommendations, and mental health support, the objective is to substantially augment patient health outcomes, all the while mitigating the workload burden on healthcare providers. The life-critical nature of healthcare applications necessitates establishing a unified and comprehensive set of evaluation metrics for conversational models. Existing evaluation metrics proposed for various generic large language models (LLMs) demonstrate a lack of comprehension regarding medical and health concepts and their significance in promoting patients' well-being. Moreover, these metrics neglect pivotal user-centered aspects, including trust-building, ethics, personalization, empathy, user comprehension, and emotional support. The purpose of this paper is to explore state-of-the-art LLM-based evaluation metrics that are specifically applicable to the assessment of interactive conversational models in healthcare. Subsequently, we present a comprehensive set of evaluation metrics designed to thoroughly assess the performance of healthcare chatbots from an end-user perspective. These metrics encompass an evaluation of language processing abilities, impact on real-world clinical tasks, and effectiveness in user-interactive conversations. Finally, we engage in a discussion concerning the challenges associated with defining and implementing these metrics, with particular emphasis on confounding factors such as the target audience, evaluation methods, and prompt techniques involved in the evaluation process.
SE Jul 21, 2018
ELICA: An Automated Tool for Dynamic Extraction of Requirements Relevant Information
Zahra Shakeri Hossein Abad, Vincenzo Gervasi, Didar Zowghi et al.
Requirements elicitation requires extensive knowledge and deep understanding of the problem domain where the final system will be situated. However, in many software development projects, analysts are required to elicit the requirements from an unfamiliar domain, which often causes communication barriers between analysts and stakeholders. In this paper, we propose a requirements ELICitation Aid tool (ELICA) to help analysts better understand the target application domain by dynamic extraction and labeling of requirements-relevant knowledge. To extract the relevant terms, we leverage the flexibility and power of Weighted Finite State Transducers (WFSTs) in dynamic modeling of natural language processing tasks. In addition to the information conveyed through text, ELICA captures and processes non-linguistic information about the intention of speakers, such as their confidence level, analytical tone, and emotions. The extracted information is made available to the analysts as a set of labeled snippets with highlighted relevant terms, which can also be exported as an artifact of the Requirements Engineering (RE) process. The application and usefulness of ELICA are demonstrated through a case study. This study shows how pre-existing relevant information about the application domain and the information captured during an elicitation meeting, such as the conversation and stakeholders' intentions, can be captured and used to support analysts in achieving their tasks.
SE Jul 19, 2018
Loud and Interactive Paper Prototyping in Requirements Elicitation: What is it Good for?
Zahra Shakeri Hossein Abad, Sania Moazzam, Christina Lo et al.
Requirements Engineering (RE) is a multidisciplinary and human-centered process; therefore, the artifacts produced from RE are always error-prone. The most significant of these errors are missing or misunderstood requirements. Information loss in RE could result in omitted logic in the software, which will be onerous to correct at the later stages of development. In this paper, we demonstrate and investigate how interactive and Loud Paper Prototyping (LPP) can be integrated to collect stakeholders' needs and expectations more effectively than interactive prototyping or face-to-face meetings alone. To this end, we conducted a case study of (1) 31 mobile application (App) development teams who applied either interactive or loud prototyping and (2) 19 mobile App development teams who applied only face-to-face meetings. From this study, we found that while using Silent Paper Prototyping (SPP) rather than No Paper Prototyping (NPP) is a more efficient technique to capture Non-Functional Requirements (NFRs), User Interface (UI) requirements, and existing requirements, LPP is more applicable for managing NFRs and UI requirements, as well as for adding new requirements and removing or modifying existing ones. We also found that, between LPP and SPP, LPP is more efficient at capturing and influencing Functional Requirements (FRs).
CY Jul 10, 2018
Dynamic Visual Analytics for Elicitation Meetings with ELICA
Zahra Shakeri Hossein Abad, Munib Rahman, Abdullah Cheema et al.
Requirements elicitation can be very challenging in projects that require deep domain knowledge about the system at hand. As analysts have full control over the elicitation process, their lack of knowledge about the system under study inhibits them from asking related questions and reduces the accuracy of requirements provided by stakeholders. We present ELICA, a generic interactive visual analytics tool to assist analysts during the requirements elicitation process. ELICA uses a novel information extraction algorithm based on a combination of Weighted Finite State Transducers (WFSTs) (generative model) and SVMs (discriminative model). ELICA presents the extracted relevant information in an interactive GUI (including zooming, panning, and pinching) that allows analysts to explore which parts of the ongoing conversation (or specification document) match the extracted information. In this demonstration, we show that ELICA is usable and effective in practice, and is able to extract the related information in real time. We also demonstrate how carefully designed features in ELICA facilitate the interactive and dynamic process of information extraction.
SE May 15, 2018
Task Interruption in Software Development Projects: What Makes some Interruptions More Disruptive than Others?
Zahra Shakeri Hossein Abad, Oliver Karras, Kurt Schneider et al.
Multitasking has always been an inherent part of software development and is known as the primary source of interruptions due to task switching in software development teams. Developing software involves a mix of analytical and creative work, and places a significant load on brain functions, such as working memory and decision making. Thus, task switching in the context of software development imposes a cognitive load that causes software developers to lose focus and concentration while working, thereby taking a toll on productivity. To investigate the disruptiveness of task switching and interruptions in software development projects, and to understand the reasons for and perceptions of the disruptiveness of task switching, we used a mixed-methods approach including a longitudinal data analysis on 4,910 recorded tasks of 17 professional software developers, and a survey of 132 software developers. We found that, compared to task-specific factors (e.g. priority, level, and temporal stage), contextual factors such as interruption type (e.g. self/external), time of day, and task type and context are more potent determinants of task switching disruptiveness in software development tasks. Furthermore, while most survey respondents believe external interruptions are more disruptive than self-interruptions, the results of our retrospective analysis reveal otherwise. We found that self-interruptions (i.e. voluntary task switches) are more disruptive than external interruptions and have a negative effect on the performance of the interrupted tasks. Finally, we use the results of both studies to provide a set of comparative vulnerability and interaction patterns which can be used as a means to guide decision-making and forecast the consequences of task switching in software development teams.
SE May 15, 2018
Two Sides of the Same Coin: Software Developers' Perceptions of Task Switching and Task Interruption
Zahra Shakeri Hossein Abad, Mohammad Noaeen, Didar Zowghi et al.
In the constantly evolving world of software development, switching back and forth between tasks has become the norm. While task switching often allows developers to perform tasks effectively and may increase creativity via the flexible pathway, there are also consequences to frequent task switching. For high-momentum tasks like software development, "flow", the highly productive state of concentration, is paramount. Each switch disrupts the developers' flow, requiring them to switch mental state and spend an additional immersion period getting back into the flow. However, the wasted time due to time fragmentation caused by task switching is largely invisible and unnoticed by developers and managers. We conducted a survey with 141 software developers to investigate their perceptions of differences between task switching and task interruption and to explore whether they perceive task switching to be as disruptive as interruptions. We found that practitioners perceive considerable similarities between the disruptiveness of task switching (either planned or unplanned) and random interruptions. The high level of cognitive cost and low performance are the main consequences of task switching articulated by our respondents. Our findings broaden the understanding of flow change among software practitioners in terms of the characteristics and categories of disruptive switches as well as the consequences of interruptions caused by daily stand-up meetings.
SE Jul 17, 2017
Learn More, Pay Less! Lessons Learned from Applying the Wizard-of-Oz Technique for Exploring Mobile App Requirements
Zahra Shakeri Hossein Abad, Shane D. V. Sims, Abdullah Cheema et al.
Mobile apps have exploded in popularity, encouraging developers to provide content to the massive user base of the main app stores. Although there exist automated techniques that can classify user comments into various topics with high levels of precision, recent studies have shown that the top apps in the app stores do not have customer ratings that directly correlate with the app's success. This implies that no single requirements elicitation technique can cover the full depth required to produce a successful product and that applying alternative requirements gathering techniques can lead to success when these two are combined. Since user involvement has been found to be the most impactful contribution to project success, in this paper we explore how the Wizard of Oz (WOz) technique and user reviews available in Google Play can be integrated to produce a product that meets the demand of more stakeholders than either method alone. To compare the role of early interactive requirements specification and app reviews, we conducted two studies: (i) a case study analysis on 13 mobile app development teams who practiced very early-stage Requirements Engineering (RE) by applying WOz, and (ii) a study analyzing 40 similar mobile apps (70,592 reviews) on Google Play. The results of both studies show that while each of the WOz and app review analysis techniques can be applied to capture specific types of requirements, an integrated process including both methods would eliminate the communication gap between users and developers at early stages of the development process and mitigate the risk of requirements change in later stages.
SE Jul 10, 2017
Choosing Requirements for Experimentation with User Interfaces of Requirements Modeling Tools
Parisa Ghazi, Zahra Shakeri Hossein Abad, Martin Glinz
When designing a new presentation front-end called FlexiView for requirements modeling tools, we encountered a general problem: designing such an interface requires a lot of experimentation, which is costly when the code of the tool needs to be adapted for every experiment. On the other hand, when using simplified user interface (UI) tools, the results are difficult to generalize. To improve this situation, we are developing a UI experimentation tool which is based on so-called ImitGraphs. ImitGraphs can act as a simple but accurate substitute for a modeling tool. In this paper, we define requirements for such a UI experimentation tool based on an analysis of the features of existing requirements modeling tools.
SE Jul 7, 2017
What Works Better? A Study of Classifying Requirements
Zahra Shakeri Hossein Abad, Oliver Karras, Parisa Ghazi et al.
Classifying requirements into functional requirements (FR) and non-functional ones (NFR) is an important task in requirements engineering. However, automated classification of requirements written in natural language is not straightforward, due to the variability of natural language and the absence of a controlled vocabulary. This paper investigates how automated classification of requirements into FR and NFR can be improved and how well several machine learning approaches work in this context. We contribute an approach for preprocessing requirements that standardizes and normalizes requirements before applying classification algorithms. Further, we report on how well several existing machine learning methods perform for automated classification of NFRs into sub-categories such as usability, availability, or performance. Our study is performed on 625 requirements provided by the OpenScience tera-PROMISE repository. We found that our preprocessing improved the performance of an existing classification method. We further found significant differences in the performance of approaches such as Latent Dirichlet Allocation, Biterm Topic Modeling, or Naive Bayes for the sub-classification of NFRs.
SE Jul 6, 2017
Let's hear it from RETTA: A Requirements Elicitation Tool for TrAffic management systems
Mohammad Noaeen, Zahra Shakeri Hossein Abad, Behrouz Homayoun Far
The area of Traffic Management (TM) is characterized by uncertainty, complexity, and imprecision. The complexity of software systems in the TM domain, which contributes to a more challenging Requirements Engineering (RE) job, mainly stems from the diversity of stakeholders and the complexity of requirements elicitation in this domain. This work brings an interactive solution for exploring functional and non-functional requirements of software-reliant systems in the area of traffic management. We prototyped the RETTA tool, which leverages the wisdom of the crowd and combines it with machine learning approaches such as Natural Language Processing and Naive Bayes to help with the requirements elicitation and classification task in the TM domain. This bridges the gap among stakeholders from both areas of software development and transportation engineering. The RETTA prototype is mainly designed for requirements engineers and software developers in the area of TM and can be used on Android-based devices.
SE Jul 6, 2017
A Visual Narrative Path from Switching to Resuming a Requirements Engineering Task
Zahra Shakeri Hossein Abad, Alex Shymka, Jenny Le et al.
Requirements Engineering (RE) is closely tied to other development activities and is at the heart and foundation of every software development process. This makes RE the most data- and communication-intensive activity compared to other development tasks. The highly demanding communication makes task switching and interruptions inevitable in RE activities. While task switching often allows us to perform tasks effectively, it imposes a cognitive load and can be detrimental to the primary task, particularly in complex tasks such as those typical of RE activities. Visualization mechanisms enhanced with analytical methods and interaction techniques help software developers obtain a better cognitive understanding of the complexity of RE decisions, leading to timelier and higher quality decisions. In this paper, we propose to apply interactive visual analytics techniques for managing requirements decisions from various perspectives, including stakeholder communication, RE task switching, and interruptions. We propose a new layered visualization framework that supports the analytical reasoning process of task switching. This framework consists of both data analysis and visualization layers. The visual layers offer interactive knowledge visualization components for managing task interruption decisions at different stages of an interruption (i.e. before, during, and after). The analytical layers provide narrative knowledge about the consequences of task switching decisions and help requirements engineers to recall their reasoning process and decisions upon resuming a task. Moreover, we surveyed 53 software developers to test our visual prototype and to explore further required features for the visual and analytical layers of our framework.
SE Jul 4, 2017
Task Interruptions in Requirements Engineering: Reality versus Perceptions!
Zahra Shakeri Hossein Abad, Guenther Ruhe, Mike Bauer
Task switching and interruptions are a daily reality in software development projects: developers switch between Requirements Engineering (RE), coding, testing, daily meetings, and other tasks. Task switching may increase productivity through increased information flow and effective time management. However, it might also impose a cognitive load when reorienting to the primary task, which accounts for decreases in developers' productivity and increases in errors. This cognitive load is even greater in cases of cognitively demanding tasks such as those typical of RE activities. In this paper, to compare the reality of task switching in RE with the perception of developers, we conducted two studies: (i) a case study analysis on 5,076 recorded tasks of 19 developers and (ii) a survey of 25 developers. The results of our retrospective analysis show that in all of the cases where the disruptiveness of RE interruptions is statistically different from other software development tasks, RE-related tasks are more vulnerable to interruptions compared to other task types. Moreover, we found that context switching, the priority of the interrupting task, and the interruption source and timing are key factors that impact RE interruptions. We also provide a set of RE task switching patterns along with recommendations for both practitioners and researchers. While the results of our retrospective analysis show that self-interruptions are more disruptive than external interruptions, developers have different perceptions about the disruptiveness of various sources of interruptions.