Björn Mathis

SE
3papers
32citations
Novelty58%
AI Score25

3 Papers

SEDec 25, 2020
Fuzzing with Fast Failure Feedback

Rahul Gopinath, Bachir Bendrissou, Björn Mathis et al.

Fuzzing -- testing programs with random inputs -- has become the prime technique to detect bugs and vulnerabilities in programs. To generate inputs that cover new functionality, fuzzers require execution feedback from the program -- for instance, the coverage obtained by previous inputs, or the conditions that need to be resolved to cover new branches. If such execution feedback is not available, though, fuzzing can only rely on chance, which is ineffective. In this paper, we introduce a novel fuzzing technique that relies on failure feedback only -- that is, information on whether an input is valid or not, and if not, where the error occurred. Our bFuzzer tool enumerates byte after byte of the input space and tests the program until it finds valid prefixes, and continues exploration from these prefixes. Since no instrumentation or execution feedback is required, bFuzzer is language agnostic and the required tests execute very quickly. We evaluate our technique on five subjects, and show that bFuzzer is effective and efficient even in comparison to its white-box counterpart.

SEDec 12, 2019
Inferring Input Grammars from Dynamic Control Flow

Rahul Gopinath, Björn Mathis, Andreas Zeller

A program is characterized by its input model, and a formal input model can be of use in diverse areas including vulnerability analysis, reverse engineering, fuzzing and software testing, clone detection and refactoring. Unfortunately, input models for typical programs are often unavailable or out of date. While there exist algorithms that can mine the syntactical structure of program inputs, they either produce unwieldy and incomprehensible grammars, or require heuristics that target specific parsing patterns. In this paper, we present a general algorithm that takes a program and a small set of sample inputs and automatically infers a readable context-free grammar capturing the input language of the program. We infer the syntactic input structure only by observing access of input characters at different locations of the input parser. This works on all program stack based recursive descent input parsers, including PEG and parser combinators, and can do entirely without program specific heuristics. Our Mimid prototype produced accurate and readable grammars for a variety of evaluation subjects, including expr, URLparse, and microJSON.

SEOct 18, 2018
Sample-Free Learning of Input Grammars for Comprehensive Software Fuzzing

Rahul Gopinath, Björn Mathis, Mathias Höschele et al.

Generating valid test inputs for a program is much easier if one knows the input language. We present first successes for a technique that, given a program P without any input samples or models, learns an input grammar that represents the syntactically valid inputs for P -- a grammar which can then be used for highly effective test generation for P . To this end, we introduce a test generator targeted at input parsers that systematically explores parsing alternatives based on dynamic tracking of constraints; the resulting inputs go into a grammar learner producing a grammar that can then be used for fuzzing. In our evaluation on subjects such as JSON, URL, or Mathexpr, our PYGMALION prototype took only a few minutes to infer grammars and generate thousands of valid high-quality inputs.