TFix+: Self-configuring Hybrid Timeout Bug Fixing for Cloud Systems
This work addresses timeout bug fixing for cloud systems with a hybrid approach that builds incrementally on prior solutions, which targeted only specific types of timeout bugs.
The paper tackles timeout bugs in cloud systems, which cause availability and performance issues. It introduces TFix+, a self-configuring framework that automatically corrects misused and missing timeout bugs using dynamic timeout value predictions, and it fixes 15 of the 16 real-world bugs tested.
Timeout bugs can cause serious availability and performance issues, which are often difficult to fix due to the lack of diagnostic information. Previous work proposed solutions for fixing specific types of timeout-related performance bugs. In this paper, we present TFix+, a self-configuring timeout bug fixing framework for automatically correcting two major kinds of timeout bugs (i.e., misused timeout bugs and missing timeout bugs) with dynamic timeout value predictions. TFix+ provides two new hybrid schemes for fixing misused and missing timeout bugs, respectively. TFix+ further provides prediction-driven timeout variable configuration based on runtime function tracing. We have implemented a prototype of TFix+ and conducted experiments on 16 real-world timeout bugs. Our experimental results show that TFix+ can effectively fix 15 of the 16 tested timeout bugs.
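To make the idea of prediction-driven timeout configuration concrete, the following is a minimal sketch of deriving a timeout value from traced function execution times. It is not the TFix+ algorithm, which the abstract does not specify. The function name `predict_timeout`, the nearest-rank percentile rule, and the safety `margin` are all assumptions chosen for illustration.

```python
import math
from typing import Sequence


def predict_timeout(
    durations_ms: Sequence[float],
    percentile: float = 0.99,
    margin: float = 1.5,
) -> float:
    """Suggest a timeout (ms) from traced durations of one function.

    Hypothetical heuristic, not taken from the paper: take a high
    percentile of the observed durations (nearest-rank method) and
    scale it by a safety margin.
    """
    if not durations_ms:
        raise ValueError("at least one traced duration is required")
    if not 0.0 < percentile <= 1.0:
        raise ValueError("percentile must be in (0, 1]")
    if margin < 1.0:
        raise ValueError("margin must be >= 1.0")

    ordered = sorted(durations_ms)
    # Nearest-rank percentile: smallest value with at least
    # `percentile` of the samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1] * margin


if __name__ == "__main__":
    # Durations (ms) as they might be collected by a function tracer.
    traced = [100.0, 120.0, 110.0, 130.0, 500.0]
    print(predict_timeout(traced, percentile=0.5, margin=2.0))
```

A value derived this way could replace a hard-coded timeout that is too small or too large (a misused timeout), or could be supplied where no timeout exists at all (a missing timeout). In practice the percentile and margin would need tuning per workload.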