SE, Mar 21, 2021
An Empirical Study of OSS-Fuzz Bugs
Zhen Yu Ding, Claire Le Goues
Continuous fuzzing is an increasingly popular technique for automated quality and security assurance. Google maintains OSS-Fuzz, a continuous fuzzing service for open source software. We conduct the first empirical study of OSS-Fuzz, analyzing 23,907 bugs found in 316 projects. We examine the characteristics of fuzzer-found faults, the lifecycles of such faults, and the evolution of fuzzing campaigns over time. We find that OSS-Fuzz is often effective at quickly finding bugs, and developers are often quick to patch them. However, flaky bugs, timeouts, and out-of-memory errors are problematic, people rarely file CVEs for security vulnerabilities, and fuzzing campaigns often exhibit punctuated equilibria, where developers might be surprised by large spikes in bugs found. Our findings have implications for future fuzzing research and practice.
CR, Aug 13, 2020
Sniffing for Codebase Secret Leaks with Known Production Secrets in Industry
Zhen Yu Ding, Benjamin Khakshoor, Justin Paglierani et al.
Secrets leaked in codebases, such as passwords and API keys, have been responsible for numerous security breaches. Heuristic techniques, such as pattern matching, entropy analysis, and machine learning, exist to detect such leaks and alert developers to them. Heuristics, however, naturally exhibit false positives, which require triaging and can lead to developer frustration. We propose to use known production secrets as a source of ground truth for sniffing secret leaks in codebases. We develop techniques for using known secrets to sniff whole codebases and continuously sniff differential code revisions. We uncover different performance and security needs when sniffing for known secrets in these two situations in an industrial environment.
SE, Mar 25, 2020
Patch Quality and Diversity of Invariant-Guided Search-Based Program Repair
Zhen Yu Ding
Most automatic program repair techniques rely on test cases to specify correct program behavior. Due to test cases' frequently incomplete coverage of desired behavior, however, patches often overfit and fail to generalize to broader requirements. Moreover, in the absence of perfectly correct outputs, methods to ensure higher patch quality, such as merging together several patches or a human evaluating patch recommendations, benefit from having access to a diverse set of patches, making patch diversity a potentially useful trait. We evaluate the correctness and diversity of patches generated by GenProg and an invariant-based diversity-enhancing extension described in our prior work. We find no evidence that promoting diversity changes the correctness of patches in a positive or negative direction. Using invariant- and test case generation-driven metrics for measuring semantic diversity, we find no observed semantic differences between patches for most bugs, regardless of the repair technique used.
NE, Jun 27, 2019
The State and Future of Genetic Improvement
William B. Langdon, Westley Weimer, Christopher Timperley et al.
We report the discussion session at the sixth international Genetic Improvement workshop, GI-2019 @ ICSE, which was held as part of the 41st ACM/IEEE International Conference on Software Engineering on Tuesday 28th May 2019. Topics included GI representations, the maintainability of evolved code, automated software testing, future areas of GI research, such as co-evolution, and existing GI tools and benchmarks.