19.6SEApr 21
Revisiting Code Debloating with Ground Truth-based EvaluationMuhammad Bilal, Moiz Ali, Mohit Kumar et al.
Program debloating aims to remove unused code to reduce performance overhead, attack surfaces, and maintenance costs. Over time, debloating has evolved across multiple layers (container, library, and application), each building on the principles of application-level debloating. Despite its central role, application-level debloating continues to rely on imperfect proxies for measuring performance, such as test-case-driven evaluation for correctness, code size for runtime efficiency, and gadget count reduction for estimating security posture. While there is widespread skepticism about using such imperfect proxies, the community still lacks standardized methodologies or benchmarks to assess the true performance of application-level software debloating. This experience paper aims to address the gap. We revisit the foundations of application-level debloating through a ground-truth-based evaluation paradigm. Our analysis of eight state-of-the-art debloaters - Blade, Chisel, Cov, CovA, Lmcas, Trimmer, Occam, and Razor - uncovers insights previously unattainable through traditional evaluations. These tools collectively span the spectrum of source-to-source, IR-to-IR, and binary-to-binary transformation paradigms, characterizing a holistic reassessment across abstraction levels. Our analysis reveals that while dynamic analysis-based tools often remove up to 94% of code that should be retained, static analysis-based approaches exhibit the opposite behavior, showing high false retention rates due to coarse-grained dependency over-approximation. Additionally, static analyses may add code by introducing specialized variants of functions. False retentions and removals not only cause functional incorrectness but may also lead to systematic inconsistency, robustness failures, and exploitable vulnerabilities.
CRJun 24, 2021
The Queen's Guard: A Secure Enforcement of Fine-grained Access Control In Distributed Data Analytics PlatformsFahad Shaon, Sazzadur Rahaman, Murat Kantarcioglu
Distributed data analytics platforms (i.e., Apache Spark, Hadoop) provide high-level APIs to programmatically write analytics tasks that are run distributedly in multiple computing nodes. The design of these frameworks was primarily motivated by performance and usability. Thus, the security takes a back seat. Consequently, they do not inherently support fine-grained access control or offer any plugin mechanism to enable it, making them risky to be used in multi-tier organizational settings. There have been attempts to build "add-on" solutions to enable fine-grained access control for distributed data analytics platforms. In this paper, first, we show that straightforward enforcement of ``add-on'' access control is insecure under adversarial code execution. Specifically, we show that an attacker can abuse platform-provided APIs to evade access controls without leaving any traces. Second, we designed a two-layered (i.e., proactive and reactive) defense system to protect against API abuses. On submission of a user code, our proactive security layer statically screens it to find potential attack signatures prior to its execution. The reactive security layer employs code instrumentation-based runtime checks and sandboxed execution to throttle any exploits at runtime. Next, we propose a new fine-grained access control framework with an enhanced policy language that supports map and filter primitives. Finally, we build a system named SecureDL with our new access control framework and defense system on top of Apache Spark, which ensures secure access control policy enforcement under adversaries capable of executing code. To the best of our knowledge, this is the first fine-grained attribute-based access control framework for distributed data analytics platforms that is secure against platform API abuse attacks. Performance evaluation showed that the overhead due to added security is low.
CRJun 18, 2018
CryptoGuard: High Precision Detection of Cryptographic Vulnerabilities in Massive-sized Java ProjectsSazzadur Rahaman, Ya Xiao, Sharmin Afrose et al.
Cryptographic API misuses, such as exposed secrets, predictable random numbers, and vulnerable certificate verification, seriously threaten software security. The vision of automatically screening cryptographic API calls in massive-sized (e.g., millions of LoC) Java programs is not new. However, hindered by the practical difficulty of reducing false positives without compromising analysis quality, this goal has not been accomplished. State-of-the-art crypto API screening solutions are not designed to operate on a large scale. Our technical innovation is a set of fast and highly accurate slicing algorithms. Our algorithms refine program slices by identifying language-specific irrelevant elements. The refinements reduce false alerts by 76% to 80% in our experiments. Running our tool, CrytoGuard, on 46 high-impact large-scale Apache projects and 6,181 Android apps generate many security insights. Our findings helped multiple popular Apache projects to harden their code, including Spark, Ranger, and Ofbiz. We also have made substantial progress towards the science of analysis in this space, including: i) manually analyzing 1,295 Apache alerts and confirming 1,277 true positives (98.61% precision), ii) creating a benchmark with 38-unit basic cases and 74-unit advanced cases, iii) performing an in-depth comparison with leading solutions including CrySL, SpotBugs, and Coverity. We are in the process of integrating CryptoGuard with the Software Assurance Marketplace (SWAMP).