92.6SEMay 31
Bridging Requirements and Architecture: Multi-Agent Orchestration with External Knowledge and Hierarchical MemoryRuiyin Li, Yiran Zhang, Xiyu Zhou et al.
Software architecture design is a critical yet inherently complex and knowledge-intensive phase that requires balancing competing quality attributes and adapting to evolving requirements. Traditionally, this process has been time-consuming, labor-intensive, and heavily reliant on architects, often resulting in limited exploration of alternative architectural decompositions and styles, especially under the pressures of agile development. While LLM-based agents have shown promising performance across various software engineering tasks, their application to architecture design remains relatively scarce and requires systematic exploration. To address these challenges, we proposed MAAD (Multi-Agent Architecture Design), a knowledge-driven framework that orchestrates four specialized agents (i.e., Analyst, Modeler, Designer and Evaluator) to autonomously and collaboratively transform requirements specifications into comprehensive, multi-view architectural blueprints with quality attribute assessments. MAAD incorporates RAG to inject recognized architectural standards and patterns into the workflow and leverages a hierarchical memory mechanism that captures design history for iterative refinement. We evaluated MAAD through comparative experiments against MetaGPT, using quantitative architecture-level metrics across 10 case studies and qualitative feedback from industry architects on 10 real-world specifications. Results show that MAAD generates more complete, modular, and traceable architectures than the baseline, and its dedicated Evaluator agent autonomously produces structured quality evaluation reports that significantly reduce manual validation efforts. Furthermore, we found that the quality of the generated architecture heavily depends on the underlying LLM's reasoning capacity, with GPT-5.2 and Qwen3.5 outperforming other models across most evaluation settings.
ROJan 13Code
Generalizable Geometric Prior and Recurrent Spiking Feature Learning for Humanoid Robot ManipulationXuetao Li, Wenke Huang, Mang Ye et al.
Humanoid robot manipulation is a crucial research area for executing diverse human-level tasks, involving high-level semantic reasoning and low-level action generation. However, precise scene understanding and sample-efficient learning from human demonstrations remain critical challenges, severely hindering the applicability and generalizability of existing frameworks. This paper presents a novel RGMP-S, Recurrent Geometric-prior Multimodal Policy with Spiking features, facilitating both high-level skill reasoning and data-efficient motion synthesis. To ground high-level reasoning in physical reality, we leverage lightweight 2D geometric inductive biases to enable precise 3D scene understanding within the vision-language model. Specifically, we construct a Long-horizon Geometric Prior Skill Selector that effectively aligns the semantic instructions with spatial constraints, ultimately achieving robust generalization in unseen environments. For the data efficiency issue in robotic action generation, we introduce a Recursive Adaptive Spiking Network. We parameterize robot-object interactions via recursive spiking for spatiotemporal consistency, fully distilling long-horizon dynamic features while mitigating the overfitting issue in sparse demonstration scenarios. Extensive experiments are conducted across the Maniskill simulation benchmark and three heterogeneous real-world robotic systems, encompassing a custom-developed humanoid, a desktop manipulator, and a commercial robotic platform. Empirical results substantiate the superiority of our method over state-of-the-art baselines and validate the efficacy of the proposed modules in diverse generalization scenarios. To facilitate reproducibility, the source code and video demonstrations are publicly available at https://github.com/xtli12/RGMP-S.git.
SENov 10, 2018Code
Nopol: Automatic Repair of Conditional Statement Bugs in Java ProgramsJifeng Xuan, Matias Martinez, Favio Demarco et al.
We propose NOPOL, an approach to automatic repair of buggy conditional statements (i.e., if-then-else statements). This approach takes a buggy program as well as a test suite as input and generates a patch with a conditional expression as output. The test suite is required to contain passing test cases to model the expected behavior of the program and at least one failing test case that reveals the bug to be repaired. The process of NOPOL consists of three major phases. First, NOPOL employs angelic fix localization to identify expected values of a condition during the test execution. Second, runtime trace collection is used to collect variables and their actual values, including primitive data types and objected-oriented features (e.g., nullness checks), to serve as building blocks for patch generation. Third, NOPOL encodes these collected data into an instance of a Satisfiability Modulo Theory (SMT) problem, then a feasible solution to the SMT instance is translated back into a code patch. We evaluate NOPOL on 22 real-world bugs (16 bugs with buggy IF conditions and 6 bugs with missing preconditions) on two large open-source projects, namely Apache Commons Math and Apache Commons Lang. Empirical analysis on these bugs shows that our approach can effectively fix bugs with buggy IF conditions and missing preconditions. We illustrate the capabilities and limitations of NOPOL using case studies of real bug fixes.
SEMar 19, 2018Code
Automated Localization for Unreproducible BuildsZhilei Ren, He Jiang, Jifeng Xuan et al.
Reproducibility is the ability of recreating identical binaries under pre-defined build environments. Due to the need of quality assurance and the benefit of better detecting attacks against build environments, the practice of reproducible builds has gained popularity in many open-source software repositories such as Debian and Bitcoin. However, identifying the unreproducible issues remains a labour intensive and time consuming challenge, because of the lacking of information to guide the search and the diversity of the causes that may lead to the unreproducible binaries. In this paper we propose an automated framework called RepLoc to localize the problematic files for unreproducible builds. RepLoc features a query augmentation component that utilizes the information extracted from the build logs, and a heuristic rule-based filtering component that narrows the search scope. By integrating the two components with a weighted file ranking module, RepLoc is able to automatically produce a ranked list of files that are helpful in locating the problematic files for the unreproducible builds. We have implemented a prototype and conducted extensive experiments over 671 real-world unreproducible Debian packages in four different categories. By considering the topmost ranked file only, RepLoc achieves an accuracy rate of 47.09%. If we expand our examination to the top ten ranked files in the list produced by RepLoc, the accuracy rate becomes 79.28%. Considering that there are hundreds of source code, scripts, Makefiles, etc., in a package, RepLoc significantly reduces the scope of localizing problematic files. Moreover, with the help of RepLoc, we successfully identified and fixed six new unreproducible packages from Debian and Guix.
SEApr 16, 2017Code
Towards Effective Bug Triage with Towards Effective Bug Triage with Software Data Reduction TechniquesJifeng Xuan, He Jiang, Yan Hu et al.
Software companies spend over 45 percent of cost in dealing with software bugs. An inevitable step of fixing bugs is bug triage, which aims to correctly assign a developer to a new bug. To decrease the time cost in manual work, text classification techniques are applied to conduct automatic bug triage. In this paper, we address the problem of data reduction for bug triage, i.e., how to reduce the scale and improve the quality of bug data. We combine instance selection with feature selection to simultaneously reduce data scale on the bug dimension and the word dimension. To determine the order of applying instance selection and feature selection, we extract attributes from historical bug data sets and build a predictive model for a new bug data set. We empirically investigate the performance of data reduction on totally 600,000 bug reports of two large open source projects, namely Eclipse and Mozilla. The results show that our data reduction can effectively reduce the data scale and improve the accuracy of bug triage. Our work provides an approach to leveraging techniques on data processing to form reduced and high-quality bug data in software development and maintenance.
SESep 10, 2014Code
Test Case Purification for Improving Fault LocalizationJifeng Xuan, Martin Monperrus
Finding and fixing bugs are time-consuming activities in software development. Spectrum-based fault localization aims to identify the faulty position in source code based on the execution trace of test cases. Failing test cases and their assertions form test oracles for the failing behavior of the system under analysis. In this paper, we propose a novel concept of spectrum driven test case purification for improving fault localization. The goal of test case purification is to separate existing test cases into small fractions (called purified test cases) and to enhance the test oracles to further localize faults. Combining with an original fault localization technique (e.g., Tarantula), test case purification results in better ranking the program statements. Our experiments on 1800 faults in six open-source Java programs show that test case purification can effectively improve existing fault localization techniques.
89.1SEMay 7
SiblingRepair: Sibling-Based Multi-Hunk Repair with Large Language ModelsXinyu Liu, Jiayu Ren, Yusen Wang et al.
Developers often make similar mistakes across code locations implementing related functionalities. These locations, called siblings, share similar issues and require similar fixes. Accurately identifying siblings and consistently repairing them are crucial for automated program repair. Hercules is a SOTA technique designed for sibling repair. However, it is limited by strong assumptions about sibling locations and commit-history availability, rigid AST-based sibling matching, and inflexible template-based patch generation. To address these limitations, we present SiblingRepair, a new LLM-based multi-hunk APR technique specialized for sibling repair. Starting from a suspicious location identified by spectrum-based fault localization, SiblingRepair searches for semantically related sibling candidates using token- and embedding-based code matching, without restricting discovery to failing-test coverage or commit history. It then uses an LLM to identify failure-relevant siblings and generate consistent patches through two complementary strategies: simultaneous repair, which jointly repairs siblings, and iterative repair, which progressively analyzes candidates for patch construction. SiblingRepair further preserves promising patches generated from earlier suspicious locations and combines them into generalized multi-hunk patches. We evaluate SiblingRepair on the Defects4J and GHRB benchmarks. The results show that SiblingRepair substantially outperforms SOTA multi-hunk repair techniques including Hercules. Our evaluation further demonstrates its repair efficiency, the effectiveness of its sibling detection and repair components, and limited impact of the LLM data leakage on the results. Overall, SiblingRepair advances automated sibling and general multi-hunk repair.
SEJul 28, 2025
MAAD: Automate Software Architecture Design through Knowledge-Driven Multi-Agent CollaborationRuiyin Li, Yiran Zhang, Xiyu Zhou et al.
Software architecture design is a critical, yet inherently complex and knowledge-intensive phase of software development. It requires deep domain expertise, development experience, architectural knowledge, careful trade-offs among competing quality attributes, and the ability to adapt to evolving requirements. Traditionally, this process is time-consuming and labor-intensive, and relies heavily on architects, often resulting in limited design alternatives, especially under the pressures of agile development. While Large Language Model (LLM)-based agents have shown promising performance across various SE tasks, their application to architecture design remains relatively scarce and requires more exploration, particularly in light of diverse domain knowledge and complex decision-making. To address the challenges, we proposed MAAD (Multi-Agent Architecture Design), an automated framework that employs a knowledge-driven Multi-Agent System (MAS) for architecture design. MAAD orchestrates four specialized agents (i.e., Analyst, Modeler, Designer and Evaluator) to collaboratively interpret requirements specifications and produce architectural blueprints enriched with quality attributes-based evaluation reports. We then evaluated MAAD through a case study and comparative experiments against MetaGPT, a state-of-the-art MAS baseline. Our results show that MAAD's superiority lies in generating comprehensive architectural components and delivering insightful and structured architecture evaluation reports. Feedback from industrial architects across 11 requirements specifications further reinforces MAAD's practical usability. We finally explored the performance of the MAAD framework with three LLMs (GPT-4o, DeepSeek-R1, and Llama 3.3) and found that GPT-4o exhibits better performance in producing architecture design, emphasizing the importance of LLM selection in MAS-driven architecture design.
SENov 4, 2018
Automatic Repair of Real Bugs in Java: A Large-Scale Experiment on the Defects4J DatasetMatias Martinez, Thomas Durieux, Romain Sommerard et al.
Defects4J is a large, peer-reviewed, structured dataset of real-world Java bugs. Each bug in Defects4J comes with a test suite and at least one failing test case that triggers the bug. In this paper, we report on an experiment to explore the effectiveness of automatic test-suite based repair on Defects4J. The result of our experiment shows that the considered state-of-the-art repair methods can generate patches for 47 out of 224 bugs. However, those patches are only test-suite adequate, which means that they pass the test suite and may potentially be incorrect beyond the test-suite satisfaction correctness criterion. We have manually analyzed 84 different patches to assess their real correctness. In total, 9 real Java bugs can be correctly repaired with test-suite based repair. This analysis shows that test-suite based repair suffers from under-specified bugs, for which trivial or incorrect patches still pass the test suite. With respect to practical applicability, it takes on average 14.8 minutes to find a patch. The experiment was done on a scientific grid, totaling 17.6 days of computation time. All the repair systems and experimental results are publicly available on Github in order to facilitate future research on automatic repair.
SESep 29, 2018
Towards Better Summarizing Bug Reports with Crowdsourcing Elicited AttributesHe Jiang, Xiaochen Li, Zhilei Ren et al.
Recent years have witnessed the growing demands for resolving numerous bug reports in software maintenance. Aiming to reduce the time testers/developers take in perusing bug reports, the task of bug report summarization has attracted a lot of research efforts in the literature. However, no systematic analysis has been conducted on attribute construction which heavily impacts the performance of supervised algorithms for bug report summarization. In this study, we first conduct a survey to reveal the existing methods for attribute construction in mining software repositories. Then, we propose a new method named Crowd-Attribute to infer new effective attributes from the crowdgenerated data in crowdsourcing and develop a new tool named Crowdsourcing Software Engineering Platform to facilitate this method. With Crowd-Attribute, we successfully construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). To evaluate the effectiveness of LRCA, we build a series of large scale data sets with 105,177 bug reports. Experiments over both the public data set SDS with 36 manually annotated bug reports and new large-scale data sets demonstrate that LRCA can consistently outperform the state-of-the-art algorithms for bug report summarization.
NEApr 16, 2017
A Hybrid ACO Algorithm for the Next Release ProblemHe Jiang, Jingyuan Zhang, Jifeng Xuan et al.
In this paper, we propose a Hybrid Ant Colony Optimization algorithm (HACO) for Next Release Problem (NRP). NRP, a NP-hard problem in requirement engineering, is to balance customer requests, resource constraints, and requirement dependencies by requirement selection. Inspired by the successes of Ant Colony Optimization algorithms (ACO) for solving NP-hard problems, we design our HACO to approximately solve NRP. Similar to traditional ACO algorithms, multiple artificial ants are employed to construct new solutions. During the solution construction phase, both pheromone trails and neighborhood information will be taken to determine the choices of every ant. In addition, a local search (first found hill climbing) is incorporated into HACO to improve the solution quality. Extensively wide experiments on typical NRP test instances show that HACO outperforms the existing algorithms (GRASP and simulated annealing) in terms of both solution uality and running time.
AIApr 16, 2017
Approximating the Backbone in the Weighted Maximum Satisfiability ProblemHe Jiang, Jifeng Xuan, Yan Hu
The weighted Maximum Satisfiability problem (weighted MAX-SAT) is a NP-hard problem with numerous applications arising in artificial intelligence. As an efficient tool for heuristic design, the backbone has been applied to heuristics design for many NP-hard problems. In this paper, we investigated the computational complexity for retrieving the backbone in weighted MAX-SAT and developed a new algorithm for solving this problem. We showed that it is intractable to retrieve the full backbone under the assumption that . Moreover, it is intractable to retrieve a fixed fraction of the backbone as well. And then we presented a backbone guided local search (BGLS) with Walksat operator for weighted MAX-SAT. BGLS consists of two phases: the first phase samples the backbone information from local optima and the backbone phase conducts local search under the guideline of backbone. Extensive experimental results on the benchmark showed that BGLS outperforms the existing heuristics in both solution quality and runtime.
SEApr 16, 2017
Approximate Backbone Based Multilevel Algorithm for Next Release ProblemHe Jiang, Jifeng Xuan, Zhilei Ren
The next release problem (NRP) aims to effectively select software requirements in order to acquire maximum customer profits. As an NP-hard problem in software requirement engineering, NRP lacks efficient approximate algorithms for large scale instances. The backbone is a new tool for tackling large scale NP-hard problems in recent years. In this paper, we employ the backbone to design high performance approximate algorithms for large scale NRP instances. Firstly we show that it is NP-hard to obtain the backbone of NRP. Then, we illustrate by fitness landscape analysis that the backbone can be well approximated by the shared common parts of local optimal solutions. Therefore, we propose an approximate backbone based multilevel algorithm (ABMA) to solve large scale NRP instances. This algorithm iteratively explores the search spaces by multilevel reductions and refinements. Experimental results demonstrate that ABMA outperforms existing algorithms on large instances in terms of solution quality and running time.
SEApr 16, 2017
A Random Walk Based Algorithm for Structural Test Case GenerationJifeng Xuan, He Jiang, Zhilei Ren et al.
Structural testing is a significant and expensive process in software development. By converting test data generation into an optimization problem, search-based software testing is one of the key technologies of automated test case generation. Motivated by the success of random walk in solving the satisfiability problem (SAT), we proposed a random walk based algorithm (WalkTest) to solve structural test case generation problem. WalkTest provides a framework, which iteratively calls random walk operator to search the optimal solutions. In order to improve search efficiency, we sorted the test goals with the costs of solutions completely instead of traditional dependence analysis from control flow graph. Experimental results on the condition-decision coverage demonstrated that WalkTest achieves better performance than existing algorithms (random test and tabu search) in terms of running time and coverage rate.
SEApr 16, 2017
Automatic Bug Triage using Semi-Supervised Text ClassificationJifeng Xuan, He Jiang, Zhilei Ren et al.
In this paper, we propose a semi-supervised text classification approach for bug triage to avoid the deficiency of labeled bug reports in existing supervised approaches. This new approach combines naive Bayes classifier and expectation-maximization to take advantage of both labeled and unlabeled bug reports. This approach trains a classifier with a fraction of labeled bug reports. Then the approach iteratively labels numerous unlabeled bug reports and trains a new classifier with labels of all the bug reports. We also employ a weighted recommendation list to boost the performance by imposing the weights of multiple developers in training the classifier. Experimental results on bug reports of Eclipse show that our new approach outperforms existing supervised approaches in terms of classification accuracy.
SEApr 16, 2017
Solving the Large Scale Next Release Problem with a Backbone Based Multilevel AlgorithmJifeng Xuan, He Jiang, Zhilei Ren et al.
The Next Release Problem (NRP) aims to optimize customer profits and requirements selection for the software releases. The research on the NRP is restricted by the growing scale of requirements. In this paper, we propose a Backbone based Multilevel Algorithm (BMA) to address the large scale NRP. In contrast to direct solving approaches, BMA employs multilevel reductions to downgrade the problem scale and multilevel refinements to construct the final optimal set of customers. In both reductions and refinements, the backbone is built to fix the common part of the optimal customers. Since it is intractable to extract the backbone in practice, the approximate backbone is employed for the instance reduction while the soft backbone is proposed to augment the backbone application. In the experiments, to cope with the lack of open large requirements databases, we propose a method to extract instances from open bug repositories. Experimental results on 15 classic instances and 24 realistic instances demonstrate that BMA can achieve better solutions on the large scale NRP instances than direct solving approaches. Our work provides a reduction approach for solving large scale problems in search based requirements engineering.
SEApr 16, 2017
Debt-Prone Bugs: Technical Debt in Software MaintenanceJifeng Xuan, Yan Hu, He Jiang
Fixing bugs is an important phase in software development and maintenance. In practice, the process of bug fixing may conflict with the release schedule. Such confliction leads to a trade-off between software quality and release schedule, which is known as the technical debt metaphor. In this article, we propose the concept of debt-prone bugs to model the technical debt in software maintenance. We identify three types of debt-prone bugs, namely tag bugs, reopened bugs, and duplicate bugs. A case study on Mozilla is conducted to examine the impact of debt-prone bugs in software products. We investigate the correlation between debt-prone bugs and the product quality. For a product under development, we build prediction models based on historical products to predict the time cost of fixing bugs. The result shows that identifying debt-prone bugs can assist in monitoring and improving software quality.
SEApr 16, 2017
Developer Prioritization in Bug RepositoriesJifeng Xuan, He Jiang, Zhilei Ren et al.
Developers build all the software artifacts in development. Existing work has studied the social behavior in software repositories. In one of the most important software repositories, a bug repository, developers create and update bug reports to support software development and maintenance. However, no prior work has considered the priorities of developers in bug repositories. In this paper, we address the problem of the developer prioritization, which aims to rank the contributions of developers. We mainly explore two aspects, namely modeling the developer prioritization in a bug repository and assisting predictive tasks with our model. First, we model how to assign the priorities of developers based on a social network technique. Three problems are investigated, including the developer rankings in products, the evolution over time, and the tolerance of noisy comments. Second, we consider leveraging the developer prioritization to improve three predicted tasks in bug repositories, i.e., bug triage, severity identification, and reopened bug prediction. We empirically investigate the performance of our model and its applications in bug repositories of Eclipse and Mozilla. The results indicate that the developer prioritization can provide the knowledge of developer priorities to assist software tasks, especially the task of bug triage.
SEMar 13, 2017
Towards Training Set Reduction for Bug TriageWeiqin Zou, Yan Hu, Jifeng Xuan et al.
Bug triage is an important step in the process of bug fixing. The goal of bug triage is to assign a new-coming bug to the correct potential developer. The existing bug triage approaches are based on machine learning algorithms, which build classifiers from the training sets of bug reports. In practice, these approaches suffer from the large-scale and low-quality training sets. In this paper, we propose the training set reduction with both feature selection and instance selection techniques for bug triage. We combine feature selection with instance selection to improve the accuracy of bug triage. The feature selection algorithm, instance selection algorithm Iterative Case Filter, and their combinations are studied in this paper. We evaluate the training set reduction on the bug data of Eclipse. For the training set, 70% words and 50% bug reports are removed after the training set reduction. The experimental results show that the new and small training sets can provide better accuracy than the original one.
SEMar 2, 2017
What Causes My Test Alarm? Automatic Cause Analysis for Test Alarms in System and Integration TestingHe Jiang, Xiaochen Li, Zijiang Yang et al.
Driven by new software development processes and testing in clouds, system and integration testing nowadays tends to produce enormous number of alarms. Such test alarms lay an almost unbearable burden on software testing engineers who have to manually analyze the causes of these alarms. The causes are critical because they decide which stakeholders are responsible to fix the bugs detected during the testing. In this paper, we present a novel approach that aims to relieve the burden by automating the procedure. Our approach, called Cause Analysis Model, exploits information retrieval techniques to efficiently infer test alarm causes based on test logs. We have developed a prototype and evaluated our tool on two industrial datasets with more than 14,000 test alarms. Experiments on the two datasets show that our tool achieves an accuracy of 58.3% and 65.8%, respectively, which outperforms the baseline algorithms by up to 13.3%. Our algorithm is also extremely efficient, spending about 0.1s per cause analysis. Due to the attractive experimental results, our industrial partner, a leading information and communication technology company in the world, has deployed the tool and it achieves an average accuracy of 72% after two months of running, nearly three times more accurate than a previous strategy based on regular expressions.
SEJun 5, 2015
Dynamic Analysis can be Improved with Automatic Test Suite RefactoringJifeng Xuan, Benoit Cornu, Matias Martinez et al.
Context: Developers design test suites to automatically verify that software meets its expected behaviors. Many dynamic analysis techniques are performed on the exploitation of execution traces from test cases. However, in practice, there is only one trace that results from the execution of one manually-written test case. Objective: In this paper, we propose a new technique of test suite refactoring, called B-Refactoring. The idea behind B-Refactoring is to split a test case into small test fragments, which cover a simpler part of the control flow to provide better support for dynamic analysis. Method: For a given dynamic analysis technique, our test suite refactoring approach monitors the execution of test cases and identifies small test cases without loss of the test ability. We apply B-Refactoring to assist two existing analysis tasks: automatic repair of if-statements bugs and automatic analysis of exception contracts. Results: Experimental results show that test suite refactoring can effectively simplify the execution traces of the test suite. Three real-world bugs that could previously not be fixed with the original test suite are fixed after applying B-Refactoring; meanwhile, exception contracts are better verified via applying B-Refactoring to original test suites. Conclusions: We conclude that applying B-Refactoring can effectively improve the purity of test cases. Existing dynamic analysis tasks can be enhanced by test suite refactoring.
SEMay 26, 2015
Automatic Repair of Real Bugs: An Experience Report on the Defects4J DatasetMatias Martinez, Thomas Durieux, Jifeng Xuan et al.
Defects4J is a large, peer-reviewed, structured dataset of real-world Java bugs. Each bug in Defects4J is provided with a test suite and at least one failing test case that triggers the bug. In this paper, we report on an experiment to explore the effectiveness of automatic repair on Defects4J. The result of our experiment shows that 47 bugs of the Defects4J dataset can be automatically repaired by state-of- the-art repair. This sets a baseline for future research on automatic repair for Java. We have manually analyzed 84 different patches to assess their real correctness. In total, 9 real Java bugs can be correctly fixed with test-suite based repair. This analysis shows that test-suite based repair suffers from under-specified bugs, for which trivial and incorrect patches still pass the test suite. With respect to practical applicability, it takes in average 14.8 minutes to find a patch. The experiment was done on a scientific grid, totaling 17.6 days of computation time. All their systems and experimental results are publicly available on Github in order to facilitate future research on automatic repair.
SEApr 11, 2014
Automatic Repair of Buggy If Conditions and Missing Preconditions with SMTFavio Demarco, Jifeng Xuan, Daniel Le Berre et al.
We present Nopol, an approach for automatically repairing buggy if conditions and missing preconditions. As input, it takes a program and a test suite which contains passing test cases modeling the expected behavior of the program and at least one failing test case embodying the bug to be repaired. It consists of collecting data from multiple instrumented test suite executions, transforming this data into a Satisfiability Modulo Theory (SMT) problem, and translating the SMT result -- if there exists one -- into a source code patch. Nopol repairs object oriented code and allows the patches to contain nullness checks as well as specific method calls.