SE Jul 25, 2018
Fast & Flexible IO: A Compositional Approach to Storage Construction for High-Performance Devices
Daniel G. Waddington
Building storage systems has remained the domain of systems experts for many years. They are complex and difficult to implement. Extreme care is needed to ensure necessary guarantees of performance and operational correctness. Furthermore, because of restrictions imposed by kernel-based designs, many legacy implementations have traded software flexibility for performance. Their implementation is restricted to compiled languages such as C and assembler, and reuse tends to be difficult or constrained. Nevertheless, storage systems are implicitly well-suited to software reuse and compositional software construction. There are many logical functions, such as block allocation, caching, partitioning, metadata management and so forth, that are common across most variants of storage. In this paper, we present Comanche, an open-source project that considers, as first-class concerns, both compositional design and reuse, and the need for high performance.
LG Feb 28, 2022
Fast Feature Selection with Fairness Constraints
Francesco Quinzan, Rajiv Khanna, Moshik Hershcovitch et al.
We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.
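To make the abstract's setting concrete, the following is a minimal sketch of sequential Orthogonal Matching Pursuit for feature selection under a simple downward-closed constraint. It is not the authors' parallel algorithm in the adaptive query model. The function name `omp_select`, the `groups` labels, and the per-group limit `cap` are assumptions introduced for illustration. The constraint "at most `cap` features from any group" is downward-closed because every subset of a feasible set is also feasible.

```python
import numpy as np


def omp_select(X, y, k, groups, cap):
    """Greedy OMP-style selection of at most k columns of X for a
    least-squares fit to y, taking at most `cap` columns per group.

    `groups[j]` is a hypothetical label (e.g. a feature category) for column j.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero columns

    selected = []
    counts = {}  # number of selected columns per group
    residual = y.copy()

    for _ in range(k):
        # Score each column by its normalized correlation with the residual.
        scores = np.abs(X.T @ residual) / norms
        # Rule out columns that are already chosen or would violate the cap.
        for j in range(X.shape[1]):
            if j in selected or counts.get(groups[j], 0) >= cap:
                scores[j] = -np.inf
        best = int(np.argmax(scores))
        if scores[best] <= 1e-12:
            break  # no feasible column still explains the residual

        selected.append(best)
        counts[groups[best]] = counts.get(groups[best], 0) + 1

        # Refit on the selected columns and recompute the residual.
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef

    return selected
```

Each iteration here depends on the previous one, so the number of sequential rounds grows with `k`. The abstract's contribution is reducing that sequential depth through the adaptive query model, which this sketch does not attempt.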