Alexander Felfernig

AI
h-index: 16
24 papers
358 citations
Novelty: 27%
AI Score: 49

24 Papers

54.0 SE May 28
Usability Analysis of Configurator User Interfaces with Multimodal Large Language Models

Sebastian Lubos, Alexander Felfernig, Damian Garber et al.

Configuration is a key technology for tailoring complex software systems, services, and products. A successful application of configurators not only depends on technical correctness, performance, and domain modeling but also on their usability. While general usability heuristics are widely used, configurator-specific criteria and tool support for systematic user interface (UI) analysis are limited. This paper explores the use of multimodal large language models (MLLMs) for scalable and semi-automated usability analysis of configurator UIs. We synthesize 18 configurator-specific usability criteria from the literature and apply these criteria in an MLLM-based analysis of 16 real-world configurators. Each criterion is assessed individually to generate severity ratings for usability issues and actionable improvement suggestions. A review of the results confirms that MLLMs can reliably identify configurator-specific usability issues and provide domain-aware improvement recommendations. Although human validation remains necessary, this approach has the potential to significantly reduce the required effort to analyze configurator usability.

AI Apr 26, 2023
Conjunctive Query Based Constraint Solving For Feature Model Configuration

Alexander Felfernig, Viet-Man Le, Sebastian Lubos

Feature model configuration can be supported on the basis of various types of reasoning approaches. Examples thereof are SAT solving, constraint solving, and answer set programming (ASP). Using these approaches requires technical expertise in how to define and solve the underlying configuration problem. In this paper, we show how to apply conjunctive queries typically supported by today's relational database systems to solve constraint satisfaction problems (CSP) and -- more specifically -- feature model configuration tasks. This approach allows the application of a widespread database technology to solve configuration tasks and also allows for new algorithmic approaches when it comes to the identification and resolution of inconsistencies.
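The idea of answering a configuration task with a database query can be illustrated with a minimal sketch. It is not the encoding used in the paper: the feature names (car, engine, gps, usb), the single domain table `dom`, and the use of `<=` comparisons to express implications are all assumptions made for this example. Each feature is a copy of a 0/1 domain table, and the WHERE clause is a conjunction of the feature model constraints.

```python
import sqlite3

def configurations():
    """Enumerate valid configurations of a hypothetical four-feature model."""
    con = sqlite3.connect(":memory:")
    # One shared domain table; every feature is an alias of it (0 = deselected, 1 = selected).
    con.execute("CREATE TABLE dom(v INTEGER)")
    con.executemany("INSERT INTO dom VALUES (?)", [(0,), (1,)])
    query = """
        SELECT car.v, engine.v, gps.v, usb.v
        FROM dom AS car, dom AS engine, dom AS gps, dom AS usb
        WHERE car.v = 1            -- root feature is always selected
          AND engine.v = car.v     -- mandatory child: engine <-> car
          AND gps.v <= car.v       -- optional child: gps -> car
          AND usb.v <= car.v       -- optional child: usb -> car
          AND gps.v <= usb.v       -- cross-tree constraint: gps requires usb
        ORDER BY gps.v, usb.v
    """
    rows = con.execute(query).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for car, engine, gps, usb in configurations():
        print(dict(car=car, engine=engine, gps=gps, usb=usb))
```

Every row returned by the query is one consistent configuration. The cross product of domain tables grows exponentially with the number of features, so this sketch only shows the principle on a toy model.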

AI Oct 4, 2023
Solving Multi-Configuration Problems: A Performance Analysis with Choco Solver

Benjamin Ritz, Alexander Felfernig, Viet-Man Le et al.

In many scenarios, configurators support the configuration of a solution that satisfies the preferences of a single user. The concept of multi-configuration is based on the idea of configuring a set of configurations. Such a functionality is relevant in scenarios such as the configuration of personalized exams, the configuration of project teams, and the configuration of different trips for individual members of a tourist group (e.g., when visiting a specific city). In this paper, we exemplify the application of multi-configuration for generating individualized exams. We also provide a constraint solver performance analysis which helps to gain some insights into corresponding performance issues.

16.4 CL Mar 31
LLM Agents Predict Social Media Reactions but Do Not Outperform Text Classifiers: Benchmarking Simulation Accuracy Using 120K+ Personas of 1511 Humans

Ljubisa Bojic, Alexander Felfernig, Bojana Dinic et al.

Social media platforms mediate how billions form opinions and engage with public discourse. As autonomous AI agents increasingly participate in these spaces, understanding their behavioral fidelity becomes critical for platform governance and democratic resilience. Previous work demonstrates that LLM-powered agents can replicate aggregate survey responses, yet few studies test whether agents can predict specific individuals' reactions to specific content. This study benchmarks LLM-based agents' accuracy in predicting human social media reactions (like, dislike, comment, share, no reaction) across 120,000+ unique agent-persona combinations derived from 1,511 Serbian participants and 27 large language models. In Study 1, agents achieved 70.7% overall accuracy, with LLM choice producing a 13 percentage-point performance spread. Study 2 employed binary forced-choice (like/dislike) evaluation with chance-corrected metrics. Agents achieved Matthews Correlation Coefficient (MCC) of 0.29, indicating genuine predictive signal beyond chance. However, conventional text-based supervised classifiers using TF-IDF representations outperformed LLM agents (MCC of 0.36), suggesting predictive gains reflect semantic access rather than uniquely agentic reasoning. The genuine predictive validity of zero-shot persona-prompted agents warns against potential manipulation through easily deploying swarms of behaviorally distinct AI agents on social media, while simultaneously offering opportunities to use such agents in simulations for predicting polarization dynamics and informing AI policy. The advantage of using zero-shot agents is that they require no task-specific training, making their large-scale deployment easy across diverse contexts. Limitations include single-country sampling. Future research should explore multilingual testing and fine-tuning approaches.
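For readers unfamiliar with the chance-corrected metric mentioned above, the following sketch computes the Matthews Correlation Coefficient from a binary confusion matrix using its standard definition. The function name and the convention of returning 0.0 when the denominator is zero are choices made for this example, and the counts in the usage lines are invented, not taken from the study.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention assumed here: an undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator

if __name__ == "__main__":
    # Always predicting "like" on a 90/10 split: 90% accuracy, but no signal beyond chance.
    print(mcc(tp=90, tn=0, fp=10, fn=0))
    # Perfect prediction.
    print(mcc(tp=50, tn=50, fp=0, fn=0))
```

The first usage line shows why accuracy alone can mislead on imbalanced reaction data: a constant prediction scores high accuracy while its MCC is zero.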

49.2 SE Apr 28
Recommending Usability Improvements with Multimodal Large Language Models

Sebastian Lubos, Alexander Felfernig, Damian Garber et al.

Usability describes quality attributes of application user interfaces that determine how effectively users can interact with them. Traditional usability evaluation methods require considerable expertise and resources, which can be challenging, especially for small teams and organizations. Automating usability evaluation could make it more accessible and help to improve the user experience. The recent emergence of powerful multimodal large language models (MLLMs) has opened new opportunities for automating usability evaluation and recommendation of improvements. These models can process visual inputs such as images and videos alongside textual context, which enables the identification of usability issues and the generation of actionable suggestions to resolve these issues. In this paper, we present a novel automated approach that uses limited application context and screen recordings of user interactions as input to an MLLM. The model automatically identifies and describes usability issues based on Nielsen's usability heuristics, and provides corresponding explanations and improvement recommendations. To reduce the developer effort of manual prioritization, the recommendations are ranked by severity. The quality and practical usefulness of the generated recommendations were evaluated based on a user study that involved software engineers as participants. The evaluation focused on the highest-ranked suggestions provided by the model. The results demonstrate the potential of our approach to provide low-effort usability improvement recommendations. This makes it a promising complement to traditional evaluation methods, especially in settings with limited access to usability experts. In this sense, the approach serves as a basis for future integration into development tools to enable automated usability evaluation within software engineering workflows.

SE Feb 17, 2021 Code
Towards Utility-based Prioritization of Requirements in Open Source Environments

Alexander Felfernig, Martin Stettinger, Müslüm Atas et al.

Requirements Engineering in open source projects such as Eclipse faces the challenge of having to prioritize requirements for individual contributors in a more or less unobtrusive fashion. In contrast to conventional industrial software development projects, contributors in open source platforms can decide on their own which requirements to implement next. In this context, the main role of prioritization is to support contributors in figuring out the most relevant and interesting requirements to be implemented next and thus avoid time-consuming and inefficient search processes. In this paper, we show how utility-based prioritization approaches can be used to support contributors in conventional as well as in open source Requirements Engineering scenarios. As an example of an open source environment, we use Bugzilla. In this context, we also show how dependencies can be taken into account in utility-based prioritization processes.
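A utility-based ranking with dependencies can be sketched as a weighted sum per requirement plus a rule that prerequisites are ranked first. This is an illustrative sketch, not the paper's scheme: the interest dimensions (`relevance`, `effort`), the weights, the requirement identifiers, and the greedy dependency handling are all assumptions.

```python
def utility(scores, weights):
    """Weighted sum over the interest dimensions of one requirement."""
    return sum(weights[dim] * scores[dim] for dim in weights)

def prioritize(requirements, weights, depends_on=None):
    """Rank requirements by utility; a requirement is ranked only after its prerequisites."""
    depends_on = depends_on or {}
    remaining = set(requirements)
    ranking = []
    while remaining:
        # Requirements whose prerequisites have all been ranked already.
        ready = [r for r in remaining
                 if all(p in ranking for p in depends_on.get(r, []))]
        if not ready:
            raise ValueError("dependency cycle or unknown prerequisite")
        # Highest utility first; the identifier breaks ties deterministically.
        best = max(ready, key=lambda r: (utility(requirements[r], weights), r))
        ranking.append(best)
        remaining.remove(best)
    return ranking

if __name__ == "__main__":
    # Hypothetical data: higher relevance is better, higher effort is worse.
    reqs = {
        "R1": {"relevance": 5, "effort": 2},
        "R2": {"relevance": 9, "effort": 1},
        "R3": {"relevance": 7, "effort": 4},
    }
    weights = {"relevance": 7, "effort": -3}
    print(prioritize(reqs, weights))                    # pure utility order
    print(prioritize(reqs, weights, {"R3": ["R1"]}))    # R3 must follow R1
```

In the second call, R3 has a higher utility than R1 but is ranked after it because of the declared dependency.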

IR Dec 4, 2024
Recommender Systems for Sustainability: Overview and Research Issues

Alexander Felfernig, Manfred Wundara, Thi Ngoc Trang Tran et al.

Sustainability development goals (SDGs) are regarded as a universal call to action with the overall objectives of planet protection, ending of poverty, and ensuring peace and prosperity for all people. In order to achieve these objectives, different AI technologies play a major role. Specifically, recommender systems can provide support for organizations and individuals to achieve the defined goals. Recommender systems integrate AI technologies such as machine learning, explainable AI (XAI), case-based reasoning, and constraint solving in order to find and explain user-relevant alternatives from a potentially large set of options. In this article, we summarize the state of the art in applying recommender systems to support the achievement of sustainability development goals. In this context, we discuss open issues for future research.

48.9 SE Apr 22
Early-Stage Product Line Validation Using LLMs: A Study on Semi-Formal Blueprint Analysis

Viet-Man Le, Thi Ngoc Trang Tran, Sebastian Lubos et al.

We study whether Large Language Models (LLMs) can perform feature model analysis operations (AOs) directly on semi-formal textual blueprints, i.e., concise constrained-language descriptions of feature hierarchies and constraints, enabling early validation in Software Product Line scoping. Using 12 state-of-the-art LLMs and 16 standard AOs, we compare their outputs against the solver-based oracle FLAMA. Results show that reasoning-optimized models (e.g., Grok 4 Fast Reasoning, Gemini 2.5 Pro) achieve 88-89% average accuracy across all evaluated blueprints and operations, approaching solver correctness. We identify systematic errors in structural parsing and constraint reasoning, and highlight accuracy-cost trade-offs that inform model selection. These findings position LLMs as lightweight assistants for early variability validation.

IR Dec 6, 2023
Sports Recommender Systems: Overview and Research Issues

Alexander Felfernig, Manfred Wundara, Thi Ngoc Trang Tran et al.

Sports recommender systems receive increasing attention due to their potential for fostering healthy living, improving personal well-being, and increasing performance in sport. These systems support people in sports, for example, by the recommendation of healthy and performance boosting food items, the recommendation of training practices, talent and team recommendation, and the recommendation of specific tactics in competitions. With applications in the virtual world, for example, the recommendation of maps or opponents in e-sports, these systems already transcend conventional sports scenarios where physical presence is needed. On the basis of different working examples, we present an overview of sports recommender systems applications and techniques. Overall, we analyze the related state-of-the-art and discuss open research issues.

SE Aug 22, 2025
Towards Recommending Usability Improvements with Multimodal Large Language Models

Sebastian Lubos, Alexander Felfernig, Gerhard Leitner et al.

Usability describes a set of essential quality attributes of user interfaces (UI) that influence human-computer interaction. Common evaluation methods, such as usability testing and inspection, are effective but resource-intensive and require expert involvement. This makes them less accessible for smaller organizations. Recent advances in multimodal LLMs offer promising opportunities to automate usability evaluation processes partly by analyzing textual, visual, and structural aspects of software interfaces. To investigate this possibility, we formulate usability evaluation as a recommendation task, where multimodal LLMs rank usability issues by severity. We conducted an initial proof-of-concept study to compare LLM-generated usability improvement recommendations with usability expert assessments. Our findings indicate the potential of LLMs to enable faster and more cost-effective usability evaluation, which makes it a practical alternative in contexts with limited expert resources.

IR Jul 25, 2025
Towards LLM-Enhanced Group Recommender Systems

Sebastian Lubos, Alexander Felfernig, Thi Ngoc Trang Tran et al.

In contrast to single-user recommender systems, group recommender systems are designed to generate and explain recommendations for groups. This group-oriented setting introduces additional complexities, as several factors - absent in individual contexts - must be addressed. These include understanding group dynamics (e.g., social dependencies within the group), defining effective decision-making processes, ensuring that recommendations are suitable for all group members, and providing group-level explanations as well as explanations for individual users. In this paper, we analyze in which way large language models (LLMs) can support these aspects and help to increase the overall decision support quality and applicability of group recommender systems.

AI May 11, 2023
FastDiagP: An Algorithm for Parallelized Direct Diagnosis

Viet-Man Le, Cristian Vidal Silva, Alexander Felfernig et al.

Constraint-based applications attempt to identify a solution that meets all defined user requirements. If the requirements are inconsistent with the underlying constraint set, algorithms that compute diagnoses for inconsistent constraints should be implemented to help users resolve the "no solution could be found" dilemma. FastDiag is a typical direct diagnosis algorithm that supports diagnosis calculation without predetermining conflicts. However, this approach faces runtime performance issues, especially when analyzing complex and large-scale knowledge bases. In this paper, we propose a novel algorithm, so-called FastDiagP, which is based on the idea of speculative programming. This algorithm extends FastDiag by integrating a parallelization mechanism that anticipates and pre-calculates consistency checks requested by FastDiag. This mechanism helps to provide consistency checks with fast answers and boosts the algorithm's runtime performance. The performance improvements of our proposed algorithm have been shown through empirical results using the Linux-2.6.3.33 configuration knowledge base.

AI Sep 20, 2021
Configuring Multiple Instances with Multi-Configuration

Alexander Felfernig, Andrei Popescu, Mathias Uta et al.

Configuration is a successful application area of Artificial Intelligence. In the majority of the cases, configuration systems focus on configuring one solution (configuration) that satisfies the preferences of a single user or a group of users. In this paper, we introduce a new configuration approach - multi-configuration - that focuses on scenarios where the outcome of a configuration process is a set of configurations. Example applications thereof are the configuration of personalized exams for individual students, the configuration of project teams, reviewer-to-paper assignment, and hotel room assignments including individualized city trips for tourist groups. For multi-configuration scenarios, we exemplify a constraint satisfaction problem representation in the context of configuring exams. The paper is concluded with a discussion of open issues for future work.

SE Aug 2, 2021
AI Techniques for Software Requirements Prioritization

Alexander Felfernig

Aspects such as limited resources, frequently changing market demands, and different technical restrictions regarding the implementation of software requirements (features) often demand the prioritization of requirements. The task of prioritization is the ranking and selection of requirements that should be included in future software releases. In this context, an intelligent prioritization decision support is extremely important. The prioritization approaches discussed in this paper are based on different Artificial Intelligence (AI) techniques that can help to improve the overall quality of requirements prioritization processes.

IR Feb 24, 2021
An Overview of Direct Diagnosis and Repair Techniques in the WeeVis Recommendation Environment

Alexander Felfernig, Stefan Reiterer, Martin Stettinger et al.

Constraint-based recommenders support users in the identification of items (products) fitting their wishes and needs. Example domains are financial services and electronic equipment. In this paper we show how divide-and-conquer based (direct) diagnosis algorithms (no conflict detection is needed) can be exploited in constraint-based recommendation scenarios. In this context, we provide an overview of the MediaWiki-based recommendation environment WeeVis.

AI Feb 24, 2021
CoreDiag: Eliminating Redundancy in Constraint Sets

Alexander Felfernig, Christoph Zehentner, Paul Blazek

Constraint-based environments such as configuration systems, recommender systems, and scheduling systems support users in different decision making scenarios. These environments exploit a knowledge base for determining solutions of interest for the user. The development and maintenance of such knowledge bases is an extremely time-consuming and error-prone task. Users often specify constraints which do not reflect the real world. For example, redundant constraints are specified which often increase both the effort for calculating a solution and efforts related to knowledge base development and maintenance. In this paper we present a new algorithm (CoreDiag) which can be exploited for the determination of minimal cores (minimal non-redundant constraint sets). The algorithm is especially useful for distributed knowledge engineering scenarios where the degree of redundancy can become high. In order to show the applicability of our approach, we present an empirical study conducted with commercial configuration knowledge bases.
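What "redundant" means here can be made concrete with a naive sketch. It is not the CoreDiag algorithm itself: it checks constraints one at a time with a brute-force solver, whereas the abstract describes a dedicated algorithm. A constraint is treated as redundant when every assignment satisfying the remaining constraints also satisfies it. The variables, domains, and constraints c1 to c4 are hypothetical.

```python
from itertools import product

def solutions(predicates, domains):
    """Yield every assignment over the finite domains that satisfies all predicates."""
    variables = list(domains)
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if all(p(assignment) for p in predicates):
            yield assignment

def minimal_core(constraints, domains):
    """Drop constraints that are implied by the remaining ones (naive, one at a time)."""
    core = dict(constraints)
    for name in list(core):
        candidate = core[name]
        rest = [p for n, p in core.items() if n != name]
        # Redundant if no solution of the rest violates the candidate.
        if all(candidate(a) for a in solutions(rest, domains)):
            del core[name]
    return sorted(core)

if __name__ == "__main__":
    domains = {"x": [1, 2, 3], "y": [1, 2, 3]}
    constraints = {
        "c1": lambda a: a["x"] < a["y"],
        "c2": lambda a: a["x"] <= a["y"],
        "c3": lambda a: a["y"] == 3,
        "c4": lambda a: a["x"] != 3,
    }
    print(minimal_core(constraints, domains))
```

The result depends on the order in which constraints are examined, since a constraint set can have several minimal cores. In this toy example c1 and c2 are dropped because c3 and c4 together already admit exactly the same solutions.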

AI Feb 19, 2021
Anytime Diagnosis for Reconfiguration

Alexander Felfernig, Rouven Walter, Jose A. Galindo et al.

Many domains require scalable algorithms that help to determine diagnoses efficiently and often within predefined time limits. Anytime diagnosis is able to determine solutions in such a way and thus is especially useful in real-time scenarios such as production scheduling, robot control, and communication networks management where diagnosis and corresponding reconfiguration capabilities play a major role. Anytime diagnosis in many cases comes along with a trade-off between diagnosis quality and the efficiency of diagnostic reasoning. In this paper we introduce and analyze FlexDiag which is an anytime direct diagnosis approach. We evaluate the algorithm with regard to performance and diagnosis quality using a configuration benchmark from the domain of feature models and an industrial configuration knowledge base from the automotive domain. Results show that FlexDiag helps to significantly increase the performance of direct diagnosis search with corresponding quality tradeoffs in terms of minimality and accuracy.

AI Feb 17, 2021
An Efficient Diagnosis Algorithm for Inconsistent Constraint Sets

Alexander Felfernig, Monika Schubert, Christoph Zehentner

Constraint sets can become inconsistent in different contexts. For example, during a configuration session the set of customer requirements can become inconsistent with the configuration knowledge base. Another example is the engineering phase of a configuration knowledge base where the underlying constraints can become inconsistent with a set of test cases. In such situations we need techniques that support the identification of minimal sets of faulty constraints that have to be deleted in order to restore consistency. In this paper we introduce a divide-and-conquer based diagnosis algorithm (FastDiag) which identifies minimal sets of faulty constraints in an over-constrained problem. This algorithm is specifically applicable in scenarios where the efficient identification of leading (preferred) diagnoses is crucial. We compare the performance of FastDiag with the conflict-directed calculation of hitting sets and present an in-depth performance analysis that shows the advantages of our approach.
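The divide-and-conquer idea can be sketched as follows. The recursion follows the commonly cited FastDiag pseudocode as best recalled here, so treat it as an approximation rather than the authors' reference implementation. The toy problem (variables x and y, a knowledge-base constraint `kb`, requirements `r1` to `r3`), the brute-force consistency check, and the extra early exit for an already consistent constraint set are assumptions added for this example.

```python
from itertools import product

# Hypothetical toy problem: one knowledge-base constraint and three user requirements.
DOMAINS = {"x": [1, 2, 3], "y": [1, 2, 3]}
PREDICATES = {
    "kb": lambda a: a["x"] < a["y"],
    "r1": lambda a: a["x"] >= 2,
    "r2": lambda a: a["y"] <= 2,
    "r3": lambda a: a["x"] != a["y"],
}

def is_consistent(names):
    """Brute-force check: does some assignment satisfy all named constraints?"""
    variables = list(DOMAINS)
    for values in product(*(DOMAINS[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if all(PREDICATES[n](assignment) for n in names):
            return True
    return False

def without(constraints, removed):
    """Return constraints minus removed, keeping the original order."""
    return [c for c in constraints if c not in removed]

def fd(d, c, ac):
    """Recursive step: find the part of a diagnosis that lies inside c."""
    if d and is_consistent(ac):
        return []            # removing d already restored consistency
    if len(c) == 1:
        return list(c)       # a single remaining candidate must be faulty
    k = len(c) // 2
    c1, c2 = c[:k], c[k:]
    d1 = fd(c1, c2, without(ac, c1))
    d2 = fd(d1, c1, without(ac, d1))
    return d1 + d2

def fast_diag(c, ac):
    """Return a minimal subset of c whose removal makes ac consistent ([] if none is needed or possible)."""
    if not c or is_consistent(ac) or not is_consistent(without(ac, c)):
        return []
    return fd([], c, ac)

if __name__ == "__main__":
    all_constraints = ["kb", "r1", "r2", "r3"]
    print(fast_diag(["r1", "r2", "r3"], all_constraints))
    print(fast_diag(["r3", "r2", "r1"], all_constraints))
```

In this example r1 and r2 conflict with each other given `kb`, so either one alone is a minimal diagnosis. Which one is returned depends on the order of the candidate list, which is how a preference among diagnoses can be expressed.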

IR Feb 16, 2021
Recommender Systems for Configuration Knowledge Engineering

Alexander Felfernig, Stefan Reiterer, Martin Stettinger et al.

The knowledge engineering bottleneck is still a major challenge in configurator projects. In this paper we show how recommender systems can support knowledge base development and maintenance processes. We discuss a couple of scenarios for the application of recommender systems in knowledge engineering and report the results of empirical studies which show the importance of user-centered configuration knowledge organization.

IR Feb 15, 2021
KnowledgeCheckR: Intelligent Techniques for Counteracting Forgetting

Martin Stettinger, Trang Tran, Ingo Pribik et al.

Existing e-learning environments primarily focus on providing intuitive learning content and recommending learning units in a personalized fashion. The major focus of the KnowledgeCheckR environment is to take into account forgetting processes which immediately start after a learning unit has been completed. In this context, techniques are needed that are able to predict which learning units are the most relevant ones to be repeated in future learning sessions. In this paper, we provide an overview of the recommendation approaches integrated in KnowledgeCheckR. Examples thereof are utility-based recommendation that helps to identify learning contents to be repeated in the future, collaborative filtering approaches that help to implement session-based recommendation, and content-based recommendation that supports intelligent question answering. In order to show the applicability of the presented techniques, we provide an overview of the results of empirical studies that have been conducted in real-world scenarios.

AI Feb 15, 2021
Consistency-based Merging of Variability Models

Mathias Uta, Alexander Felfernig, Gottfried Schenner et al.

Globally operating enterprises selling large and complex products and services often have to deal with situations where variability models are locally developed to take into account the requirements of local markets. For example, cars sold on the U.S. market are represented by variability models in some or many aspects different from European ones. In order to support global variability management processes, variability models and the underlying knowledge bases often need to be integrated. This is a challenging task since an integrated knowledge base should not produce results which are different from those produced by the individual knowledge bases. In this paper, we introduce an approach to variability model integration that is based on the concepts of contextual modeling and conflict detection. We present the underlying concepts and the results of a corresponding performance analysis.

IR Feb 12, 2021
An Overview of Recommender Systems and Machine Learning in Feature Modeling and Configuration

Alexander Felfernig, Viet-Man Le, Andrei Popescu et al.

Recommender systems support decisions in various domains ranging from simple items such as books and movies to more complex items such as financial services, telecommunication equipment, and software systems. In this context, recommendations are determined, for example, on the basis of analyzing the preferences of similar users. In contrast to simple items which can be enumerated in an item catalog, complex items have to be represented on the basis of variability models (e.g., feature models) since a complete enumeration of all possible configurations is infeasible and would trigger significant performance issues. In this paper, we give an overview of a potential new line of research which is related to the application of recommender systems and machine learning techniques in feature modeling and configuration. In this context, we give examples of the application of recommender systems and machine learning and discuss future research issues.

SE Feb 11, 2021
DirectDebug: Automated Testing and Debugging of Feature Models

Viet-Man Le, Alexander Felfernig, Mathias Uta et al.

Variability models (e.g., feature models) are a common way for the representation of variabilities and commonalities of software artifacts. Such models can be translated to a logical representation and thus allow different operations for quality assurance and other types of model property analysis. Specifically, complex and often large-scale feature models can become faulty, i.e., do not represent the expected variability properties of the underlying software artifact. In this paper, we introduce DirectDebug which is a direct diagnosis approach to the automated testing and debugging of variability models. The algorithm helps software engineers by supporting an automated identification of faulty constraints responsible for an unintended behavior of a variability model. This approach can significantly decrease development and maintenance efforts for such models.

SE Aug 7, 2018
Needs and Challenges for a Platform to Support Large-scale Requirements Engineering. A Multiple Case Study

Davide Fucci, Cristina Palomares, Dolors Costal et al.

Background: Requirement engineering is often considered a critical activity in system development projects. The increasing complexity of software, as well as number and heterogeneity of stakeholders, motivate the development of methods and tools for improving large-scale requirement engineering. Aims: The empirical study presented in this paper aims to identify and understand the characteristics and challenges of a platform, as desired by experts, to support requirement engineering for individual stakeholders, based on the current pain-points of their organizations when dealing with a large number of requirements. Method: We conducted a multiple case study with three companies in different domains. We collected data through ten semi-structured interviews with experts from these companies. Results: The main pain-point for stakeholders is handling the vast amount of data from different sources. The foreseen platform should leverage such data to manage changes in requirements according to customers' and users' preferences. It should also offer stakeholders an estimation of how long a requirements engineering task will take to complete, along with an easier requirements dependency identification and requirements reuse strategy. Conclusions: The findings provide empirical evidence about how practitioners wish to improve their requirement engineering processes and tools. The insights are a starting point for in-depth investigations into the problems and solutions presented. Practitioners can use the results to improve existing or design new practices and tools.