https://github.com/renatahodovan/grammarinator
ANTLR v4 grammar-based test generator
- Host: GitHub
- URL: https://github.com/renatahodovan/grammarinator
- Owner: renatahodovan
- License: other
- Created: 2017-05-15T14:02:45.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2025-02-28T09:02:03.000Z (2 months ago)
- Last Synced: 2025-04-13T00:49:28.734Z (about 1 month ago)
- Topics: antlr4, bughunting, fuzzer, fuzzing, grammar-based-testing, hacktoberfest, random-testing, security, test-automation
- Language: Python
- Homepage:
- Size: 755 KB
- Stars: 363
- Watchers: 11
- Forks: 62
- Open Issues: 10
Metadata Files:
- Readme: README.rst
- License: LICENSE.rst
=============
Grammarinator
=============
*ANTLRv4 grammar-based test generator*

.. image:: https://img.shields.io/pypi/v/grammarinator?logo=python&logoColor=white
   :target: https://pypi.org/project/grammarinator/
.. image:: https://img.shields.io/pypi/l/grammarinator?logo=open-source-initiative&logoColor=white
   :target: https://pypi.org/project/grammarinator/
.. image:: https://img.shields.io/github/actions/workflow/status/renatahodovan/grammarinator/main.yml?branch=master&logo=github&logoColor=white
   :target: https://github.com/renatahodovan/grammarinator/actions
.. image:: https://img.shields.io/coveralls/github/renatahodovan/grammarinator/master?logo=coveralls&logoColor=white
   :target: https://coveralls.io/github/renatahodovan/grammarinator
.. image:: https://img.shields.io/readthedocs/grammarinator?logo=read-the-docs&logoColor=white
   :target: http://grammarinator.readthedocs.io/en/latest/

.. start included documentation

*Grammarinator* is a random test generator / fuzzer that creates test cases
according to an input ANTLR_ v4 grammar. The motivation behind this
grammar-based approach is to leverage the large variety of publicly
available `ANTLR v4 grammars`_.

The `trophy page`_ of the found issues is available from the wiki.

.. _ANTLR: http://www.antlr.org
.. _`ANTLR v4 grammars`: https://github.com/antlr/grammars-v4
.. _`trophy page`: https://github.com/renatahodovan/grammarinator/wiki


Requirements
============

* Python_ >= 3.8
* Java_ SE >= 11 JRE or JDK (the latter is optional)

.. _Python: https://www.python.org
.. _Java: https://www.oracle.com/java/


Install
=======

To use *Grammarinator* in another project, it can be added to ``setup.cfg`` as
an install requirement (if using setuptools_ with declarative config):

.. code-block:: ini

    [options]
    install_requires =
        grammarinator

To install *Grammarinator* manually, e.g., into a virtual environment, use
pip_::

    pip install grammarinator

The above approaches install the latest release of *Grammarinator* from PyPI_.
Alternatively, for the development version, clone the project and perform a
local install::

    pip install .

.. _setuptools: https://github.com/pypa/setuptools
.. _pip: https://pip.pypa.io
.. _PyPI: https://pypi.org/


Usage
=====

As a first step, *Grammarinator* takes an `ANTLR v4 grammar`_ and creates a test
generator script in Python 3. *Grammarinator* supports a subset of the ANTLR
grammar features, which is described in the Grammar overview section of the
documentation. The produced generator can be subclassed later to customize it
further if needed.

Basic command-line syntax of test generator creation::

    grammarinator-process <grammar-file(s)> -o <output-dir> --no-actions

..

    **Notes**

    *Grammarinator* uses the `ANTLR v4 grammar`_ format as its input, which
    makes existing grammars (lexer and parser rules) easily reusable. However,
    because of the inherently different goals of a fuzzer and a parser, inlined
    code (actions and conditions, header and members blocks) is most probably
    not reusable and may even prevent proper execution. For first experiments
    with existing grammar files, ``grammarinator-process`` supports the
    command-line option ``--no-actions``, which skips all such code blocks
    during fuzzer generation. Once the inlined code is tuned for fuzzing, that
    option may be omitted.

.. _`ANTLR v4 grammar`: https://github.com/antlr/grammars-v4

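The customization by subclassing mentioned above can be illustrated with a
small sketch. ``ToyGenerator`` below is a hypothetical stand-in for a generated
class, not actual output of ``grammarinator-process``: real generators build
trees rather than strings, and their exact interface is not reproduced here.
The only point shown is that rules map to methods, and that a subclass may
override a single rule while inheriting the rest.

.. code-block:: python

    import random


    class ToyGenerator:
        """Hypothetical stand-in for a generated class: one method per rule."""

        def __init__(self, seed=None):
            self.rng = random.Random(seed)

        def greeting(self):
            # Toy parser rule:  greeting : 'hello' NAME ;
            return 'hello ' + self.NAME()

        def NAME(self):
            # Toy lexer rule:  NAME : [a-z]+ ;
            length = self.rng.randint(1, 8)
            letters = 'abcdefghijklmnopqrstuvwxyz'
            return ''.join(self.rng.choice(letters) for _ in range(length))


    class ToyCustomGenerator(ToyGenerator):
        """Customized variant: NAME is overridden, greeting is inherited."""

        KNOWN_NAMES = ('root', 'admin', 'guest')

        def NAME(self):
            # Pick from names that are meaningful to the (assumed) target.
            return self.rng.choice(self.KNOWN_NAMES)

Calling ``ToyCustomGenerator(seed=1).greeting()`` yields ``hello`` followed by
one of the names listed in ``KNOWN_NAMES``, while the base class keeps
producing arbitrary lowercase words.
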
After having generated and optionally customized a fuzzer, it can be executed
by the ``grammarinator-generate`` script (or by manually instantiating it in a
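custom-written driver, of course).

Basic command-line syntax of ``grammarinator-generate``::

    grammarinator-generate <generator> -r <start-rule> -d <max-depth> \
      -o <output-pattern> -n <number-of-tests> \
      -t <transformer1> -t <transformer2>

Besides generating test cases from scratch based on the ANTLR grammar,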
Grammarinator is also able to recombine existing inputs or mutate only a small
portion of them. To use these additional generation approaches, a population of
selected test cases has to be prepared. The preparation happens with the
``grammarinator-parse`` tool, which processes the input files with an ANTLR
grammar (possibly with the same one as the generator grammar) and builds
grammarinator tree representations from them (with .grt extension). Having a
population of such .grt files, ``grammarinator-generate`` can make use of them
with the ``--population`` cli option. If the ``--population`` option is set,
then Grammarinator will choose a strategy (generation, mutation, or
recombination) randomly at the creation of every new test case. If any of the
strategies is unwanted, they can be disabled with the ``--no-generate``,
``--no-mutate`` or ``--no-recombine`` options.

Basic command-line syntax of ``grammarinator-parse``::

    grammarinator-parse <grammar-file(s)> -r <start-rule> \
      -i <input-file(s)> -o <output-dir>

..

    **Notes**

    Real-life grammars often use recursive rules to express certain patterns.
    However, when such rules are used for generation, we can easily end up in
    an unexpectedly deep call stack. With the ``--max-depth`` or ``-d`` option,
    this depth, and with it the size of the generated test cases, can be
    controlled.

    Another specialty of ANTLR grammars is that they support so-called hidden
    tokens. These rules typically describe elements of the target language
    that can be placed basically anywhere without breaking the syntax. The
    most common examples are comments and whitespace. However, when using
    such grammars, which do not define explicitly where whitespace may or may
    not appear in rules, to generate test cases, we have to insert the missing
    spaces manually. This can be done by applying a serializer (with the
    ``-s`` option) to the tree representation of the output tests. A simple
    serializer, which inserts a space after every unparser rule, is provided
    by *Grammarinator* (``grammarinator.runtime.simple_space_serializer``).

    In some cases, we may want to postprocess the output tree itself (without
    serializing it), for example, to enforce some logic that cannot be
    expressed by a context-free grammar. For this purpose, the transformer
    mechanism can be used (with the ``-t`` option). Similarly to serializers,
    a transformer takes a tree as input, but instead of creating a string
    representation, it is expected to return the modified (transformed) tree
    object.

    As a final thought, one must not forget that the original purpose of
    grammars is the syntax-wise validation of various inputs. As a
    consequence, these grammars encode syntactic expectations only and not
    semantic rules. If we still want to add semantic knowledge to the
    generated tests, we can inherit custom fuzzers from the generated ones
    and redefine the methods corresponding to lexer or parser rules in ways
    that encode the required knowledge (e.g., HTMLCustomGenerator_).

.. _HTMLCustomGenerator: examples/fuzzer/HTMLCustomGenerator.py

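The depth limit, the transformer, and the serializer described in the notes
above can be sketched on a toy tree. Everything below is an illustrative
assumption: ``Node``, ``generate_expr``, ``tree_depth``,
``zero_to_one_transformer``, and ``space_serializer`` are hypothetical names,
not part of the *Grammarinator* API, and the serializer shown here puts a space
between all tokens, which may differ from what the shipped
``simple_space_serializer`` does.

.. code-block:: python

    import random
    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class Node:
        """Hypothetical tree node; leaves carry their text in ``src``."""
        name: str
        src: str = ''
        children: List['Node'] = field(default_factory=list)


    def generate_expr(rng, depth, max_depth):
        """Toy recursive rule:  expr : NUMBER | '(' expr '+' expr ')' ;"""
        # Once the depth budget is used up, only the non-recursive
        # alternative may be chosen, which bounds the recursion.
        if depth >= max_depth or rng.random() < 0.3:
            number = Node('NUMBER', src=str(rng.randint(0, 9)))
            return Node('expr', children=[number])
        return Node('expr', children=[
            Node('LPAREN', src='('),
            generate_expr(rng, depth + 1, max_depth),
            Node('PLUS', src='+'),
            generate_expr(rng, depth + 1, max_depth),
            Node('RPAREN', src=')'),
        ])


    def tree_depth(node):
        """Number of nodes on the longest root-to-leaf path."""
        return 1 + max((tree_depth(child) for child in node.children), default=0)


    def zero_to_one_transformer(root):
        """Transformer: takes a tree and returns the modified tree."""
        stack = [root]
        while stack:
            node = stack.pop()
            if node.name == 'NUMBER' and node.src == '0':
                node.src = '1'
            stack.extend(node.children)
        return root


    def space_serializer(root):
        """Serializer: takes a tree and returns a string, tokens joined by spaces."""
        tokens = []
        stack = [root]
        while stack:
            node = stack.pop()
            if node.children:
                # Reverse so that the leftmost child is popped first.
                stack.extend(reversed(node.children))
            else:
                tokens.append(node.src)
        return ' '.join(tokens)

With these helpers, a test case could be produced as
``space_serializer(zero_to_one_transformer(generate_expr(random.Random(), 0, 3)))``.
The transformer runs first because it needs the tree, and the serializer turns
the final tree into text.
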
Working Example
===============

The repository contains a minimal example_ to generate HTML files. To give it
a try, run the processor first::

    grammarinator-process examples/grammars/HTMLLexer.g4 examples/grammars/HTMLParser.g4 \
      -o examples/fuzzer/

Then, use the generator to produce test cases::

    grammarinator-generate HTMLCustomGenerator.HTMLCustomGenerator -r htmlDocument -d 20 \
      -o examples/tests/test_%d.html -n 100 \
      -s HTMLGenerator.html_space_serializer \
      --sys-path examples/fuzzer/

.. _example: examples/

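The two steps above can also be driven from a script. The following is a
minimal sketch that only assembles the same command lines as shown above and
hands them to ``subprocess.run``; it assumes that the ``grammarinator-process``
and ``grammarinator-generate`` scripts are on the ``PATH`` and that it is run
from the repository root. The helper names are hypothetical.

.. code-block:: python

    import subprocess


    def process_command(grammars, out_dir):
        """Build the grammarinator-process invocation of the example."""
        return ['grammarinator-process', *grammars, '-o', out_dir]


    def generate_command(generator, rule, depth, out_pattern, count,
                         serializer, sys_path):
        """Build the grammarinator-generate invocation of the example."""
        return [
            'grammarinator-generate', generator,
            '-r', rule, '-d', str(depth),
            '-o', out_pattern, '-n', str(count),
            '-s', serializer,
            '--sys-path', sys_path,
        ]


    def run_example():
        """Run both steps in order; raises CalledProcessError on failure."""
        commands = [
            process_command(
                ['examples/grammars/HTMLLexer.g4',
                 'examples/grammars/HTMLParser.g4'],
                'examples/fuzzer/'),
            generate_command(
                'HTMLCustomGenerator.HTMLCustomGenerator', 'htmlDocument', 20,
                'examples/tests/test_%d.html', 100,
                'HTMLGenerator.html_space_serializer', 'examples/fuzzer/'),
        ]
        for command in commands:
            subprocess.run(command, check=True)

Passing the arguments as a list avoids shell quoting issues, e.g., with the
``%d`` in the output pattern.
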
Compatibility
=============

*Grammarinator* was tested on:

* Linux (Ubuntu 16.04 / 18.04 / 20.04)
* OS X / macOS (10.12 / 10.13 / 10.14 / 10.15 / 11)
* Windows (Server 2012 R2 / Server version 1809 / Windows 10)


Citations
=========

Background on *Grammarinator* is published in:

* Renata Hodovan, Akos Kiss, and Tibor Gyimothy. Grammarinator: A Grammar-Based
  Open Source Fuzzer.
  In Proceedings of the 9th ACM SIGSOFT International Workshop on Automating
  Test Case Design, Selection, and Evaluation (A-TEST 2018), pages 45-48, Lake
  Buena Vista, Florida, USA, November 2018. ACM.
  https://doi.org/10.1145/3278186.3278193

.. end included documentation

Copyright and Licensing
=======================

Licensed under the BSD 3-Clause License_.

.. _License: LICENSE.rst