86.2DCMay 18
CausalMesh: A Formally Verified Causally Consistent Distributed Cache with Support for Client MigrationHaoran Zhang, Zihao Zhang, Shuai Mu et al.
Cloud applications often insert a caching lay\-er in front of a database in order to reduce I/O latency and improve throughput. One complication occurs when a client fetches some data from one cache node, then migrates to another (e.g., due to failures, load balancing, or client mobility), where it fetches the remaining data. If the data in the cache nodes is inconsistent, the client could observe states that undermine the application's correctness. One example of a situation where this is common is stateful serverless workflows, which consist of multiple serverless functions that access state in a remote database. In serverless, functions in the same workflow may be scheduled to different nodes with different caches, resulting in the migration pattern described above -- the same client (the workflow) reads some data from one cache and other data from another. To address this issue, this paper presents CausalMesh, a novel approach to causally consistent distributed caching in environments where computations may migrate between machines. CausalMesh is the first cache system to support coordination-free, abort-free read/write operations and read transactions when clients migrate across multiple servers. CausalMesh also supports read-write transactional causal consistency in the presence of client migration, but at the cost of abort-freedom. Our experimental evaluation shows that CausalMesh has lower latency and higher throughput than existing proposals. Finally, we have formally verified the correctness of \sys's protocol in Dafny.
89.4DBApr 22
Making TransactionIsolation Checking PracticalJian Zhang, Shuai Mu, Cheng Tan
Checking whether database transactions adhere to isolation levels is a crucial yet challenging problem. We present Boomslang, the first general-purpose checking framework capable of verifying configurations that were previously uncheckable. Boomslang advances beyond prior work in three key aspects: (1) it supports arbitrary operation types provided by modern transactional key-value stores, (2) it requires no knowledge of database internals, and (3) it offers a modular, extensible pipeline amenable to customization and optimizations. Boomslang adopts a front-/back-end separation. As the front-end, it parses a database trace into an Abstract Semantic Graph, which is then lowered -- via semantic analysis -- into a low-level intermediate representation (IR). The back-end converts this IR to a set of constraints for SMT solving. This design is enabled by a key abstraction in the IR, called superpositions, which capture the uncertainty and complexity caused by arbitrary operations and missing information. Our experiments show that with just 271--386 lines of code, the core logic of three prior checkers can be reimplemented as Boomslang modules, achieving comparable or superior performance. Using Boomslang, we also identify a new bug in TiDB, audit the metadata layer of the JuiceFS file system, check vendor-specific behavior in MariaDB, support five previously unchecked isolation levels, and confirm a theoretical result on the correctness of strict serializability.
CVMay 18, 2018
Improving Image Captioning with Conditional Generative Adversarial NetsChen Chen, Shuai Mu, Wanpeng Xiao et al.
In this paper, we propose a novel conditional-generative-adversarial-nets-based image captioning framework as an extension of traditional reinforcement-learning (RL)-based encoder-decoder architecture. To deal with the inconsistent evaluation problem among different objective language metrics, we are motivated to design some "discriminator" networks to automatically and progressively determine whether generated caption is human described or machine generated. Two kinds of discriminator architectures (CNN and RNN-based structures) are introduced since each has its own advantages. The proposed algorithm is generic so that it can enhance any existing RL-based image captioning framework and we show that the conventional RL training method is just a special case of our approach. Empirically, we show consistent improvements over all language evaluation metrics for different state-of-the-art image captioning models. In addition, the well-trained discriminators can also be viewed as objective image captioning evaluators