A Formal Model of Checked C
This work provides a formal foundation for Checked C, helping developers reason about memory safety while porting code incrementally. The contribution is itself incremental: it models existing language features rather than introducing new ones.
The authors formally model Checked C, a C dialect that enforces spatial memory safety. They develop a Coq model and prove that any spatial safety error can be blamed on code marked unchecked. They validate the model with an executable PLT Redex version that compiles to an untyped C-like language, using randomized testing both to check that the compiled code simulates the model and to compare the model against the Clang Checked C implementation.
We present a formal model of Checked C, a dialect of C that aims to enforce spatial memory safety. Our model pays particular attention to the semantics of dynamically sized, potentially null-terminated arrays. We formalize this model in Coq, and prove that any spatial memory safety errors can be blamed on portions of the program labeled unchecked; this is a Checked C feature that supports incremental porting and backward compatibility. While our model's operational semantics uses annotated ("fat") pointers to enforce spatial safety, we show that such annotations can be safely erased: Using PLT Redex we formalize an executable version of our model and a compilation procedure from it to an untyped C-like language, and use randomized testing to validate that generated code faithfully simulates the original. Finally, we develop a custom random generator for well-typed and almost-well-typed terms in our Redex model, and use it to search for inconsistencies between our model and the Clang Checked C implementation. We find these steps to be a useful way to co-develop a language (Checked C is still in development) and a core model of it.
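The idea of erasing fat-pointer annotations can be illustrated with a minimal sketch in plain C. It is not the paper's formalization or the Checked C compiler's output: the names `fat_ptr`, `fat_read`, and `erased_read` are hypothetical, and the sketch assumes a fat pointer can be represented as a base address, a length, and an offset. The first function reads through the annotated pointer; the second reads through a raw pointer whose bounds are passed as ordinary values, with the check written out explicitly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: a base address annotated with its bounds.
 * This representation is an assumption made for illustration only. */
typedef struct {
    const int *base; /* start of the array, or NULL */
    size_t len;      /* number of accessible elements */
    size_t off;      /* current offset from base */
} fat_ptr;

/* Read through the fat pointer; fails instead of accessing memory
 * when the pointer is null or the offset is out of bounds. */
bool fat_read(fat_ptr p, int *out) {
    if (p.base == NULL || p.off >= p.len) {
        return false;
    }
    *out = p.base[p.off];
    return true;
}

/* Erased form: the pointer carries no annotation. The bounds travel as
 * separate values and the check is an explicit part of the code. */
bool erased_read(const int *base, size_t len, size_t off, int *out) {
    if (base == NULL || off >= len) {
        return false;
    }
    *out = base[off];
    return true;
}
```

On any input, the two functions either both fail or both return the same value, which is the kind of agreement that randomized testing between an annotated semantics and its compiled form is meant to check.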