SE, Nov 5, 2016

Application-layer Fault-Tolerance Protocols

arXiv:1611.02273v1, 4 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for systems that are more resilient to faults such as design errors or malicious attacks, but it appears incremental, as it builds on existing software fault-tolerance concepts.

The book tackles the problem of achieving dependable computer systems by emphasizing the necessity of embedding fault-tolerance directly in the application software layer, arguing that other approaches like hardware or operating system fault-tolerance are insufficient on their own.

The central topic of this book is application-level fault-tolerance, that is, the methods, architectures, and tools that make it possible to express a fault-tolerant system in the application software of our computers. Application-level fault-tolerance is a sub-class of software fault-tolerance that focuses on how to express the problems and solutions of fault-tolerance in the top layer of the hierarchy of virtual machines that constitutes our computers. This book shows that application-level fault-tolerance is a key ingredient in crafting truly dependable computer systems: other approaches, such as hardware fault-tolerance, operating system fault-tolerance, or fault-tolerant middleware, are also important ingredients for achieving resiliency, but they are not enough. Failing to address the application layer means leaving a backdoor open to problems such as design faults, interaction faults, or malicious attacks, whose consequences for the quality of service could be as unfortunate as, e.g., those of a physical fault affecting the system platform. In other words, in most cases it is simply not possible to achieve complete coverage against a given set of faults or erroneous conditions without also embedding fault-tolerance provisions in the application layer.
