SE · Dec 18, 2018

Inputs from Hell: Generating Uncommon Inputs from Common Samples

arXiv:1812.07525v2 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for robust software testing by generating inputs that are either similar to or dissimilar from normal usage, supporting reliability testing and robustness testing respectively. The advance is incremental because it builds on existing grammar-based methods.

The paper tackles the problem of generating both common and uncommon structured inputs for software testing. It uses a context-free grammar to parse sample inputs and assign probabilities to grammar productions, then generates similar inputs by replicating these probabilities and dissimilar inputs by inverting them. The evaluation on JSON, JavaScript, and CSS demonstrates the effectiveness of this approach in producing inputs from both sets.

Generating structured input files to test programs can be performed by techniques that produce them from a grammar that serves as the specification for syntactically correct input files. Two interesting scenarios then arise for effective testing. In the first scenario, software engineers would like to generate inputs that are as similar as possible to the inputs in common usage of the program, to test the reliability of the program. More interesting is the second scenario, where inputs should be as dissimilar as possible from normal usage. This is useful for robustness testing and for exploring behavior not yet covered. To provide test cases for both scenarios, we leverage a context-free grammar to parse a set of sample input files that represent the program's common usage, and determine probabilities for individual grammar productions as they occur while parsing the inputs. Replicating these probabilities during grammar-based test input generation, we obtain inputs that are close to the samples. Inverting these probabilities yields inputs that are strongly dissimilar to common inputs, yet still valid with respect to the grammar. Our evaluation on three common input formats (JSON, JavaScript, CSS) shows the effectiveness of these approaches in obtaining instances from both sets of inputs.
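
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy grammar, the production counts (standing in for what parsing real samples would yield), the function names, and in particular the inversion rule (weights proportional to 1/p, renormalized) are all assumptions made for illustration; the paper may invert probabilities differently.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
# and each alternative is a list of symbols (terminals or nonterminals).
GRAMMAR = {
    "<value>": [["<number>"], ["<list>"]],
    "<list>": [["[", "]"],
               ["[", "<value>", "]"],
               ["[", "<value>", ",", "<value>", "]"]],
    "<number>": [["0"], ["1"], ["2"]],
}

# Assumed counts of how often each alternative was used, as if obtained by
# parsing a set of sample inputs (the parsing step itself is omitted here).
COUNTS = {
    "<value>": [90, 10],
    "<list>": [1, 6, 3],
    "<number>": [70, 25, 5],
}

def learn_probabilities(counts):
    """Turn per-alternative counts into per-nonterminal probabilities."""
    return {nt: [c / sum(cs) for c in cs] for nt, cs in counts.items()}

def invert_probabilities(probs, eps=1e-6):
    """Assumed inversion: weight each alternative by 1/p, then renormalize."""
    inverted = {}
    for nt, ps in probs.items():
        weights = [1.0 / (p + eps) for p in ps]
        total = sum(weights)
        inverted[nt] = [w / total for w in weights]
    return inverted

def generate(symbol, probs, rng, depth=0, max_depth=8):
    """Expand a symbol into a string, choosing alternatives by probability."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest alternative so the
        # expansion is guaranteed to terminate.
        choice = min(alternatives, key=len)
    else:
        choice = rng.choices(alternatives, weights=probs[symbol])[0]
    return "".join(generate(s, probs, rng, depth + 1, max_depth) for s in choice)

if __name__ == "__main__":
    rng = random.Random(0)
    common = learn_probabilities(COUNTS)
    uncommon = invert_probabilities(common)
    print("common:  ", [generate("<value>", common, rng) for _ in range(5)])
    print("uncommon:", [generate("<value>", uncommon, rng) for _ in range(5)])
```

Under these assumptions, sampling with the learned probabilities favors the alternatives seen most often in the samples, while the inverted probabilities favor the rarely used ones. Both stay within the grammar, so every generated string is syntactically valid.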

