{"id":13474359,"url":"https://github.com/google/atheris","last_synced_at":"2025-05-14T13:07:50.811Z","repository":{"id":39406636,"uuid":"313445559","full_name":"google/atheris","owner":"google","description":null,"archived":false,"fork":false,"pushed_at":"2024-06-17T13:53:21.000Z","size":501,"stargazers_count":1467,"open_issues_count":31,"forks_count":117,"subscribers_count":38,"default_branch":"master","last_synced_at":"2025-05-13T14:09:30.733Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-16T22:43:28.000Z","updated_at":"2025-05-11T16:35:34.000Z","dependencies_parsed_at":"2024-01-13T00:38:29.918Z","dependency_job_id":"5fb1160f-6584-412c-acec-4479de0c1330","html_url":"https://github.com/google/atheris","commit_stats":{"total_commits":243,"total_committers":29,"mean_commits":8.379310344827585,"dds":0.6460905349794239,"last_synced_commit":"cbf4ad989dcb4d3ef42152990ed89cfceb50e059"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fatheris","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fatheris/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fatheris/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fatheris/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google","download_url":"https://codeload.github.com/google/atheris/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253958330,"owners_count":21990548,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T16:01:11.784Z","updated_at":"2025-05-14T13:07:45.800Z","avatar_url":"https://github.com/google.png","language":"Python","readme":"# Atheris: A Coverage-Guided, Native Python Fuzzer\n\nAtheris is a coverage-guided Python fuzzing engine. It supports fuzzing of Python code, but also native extensions written for CPython. Atheris is based off of libFuzzer. When fuzzing native code, Atheris can be used in combination with Address Sanitizer or Undefined Behavior Sanitizer to catch extra bugs.\n\n## Installation Instructions\n\nAtheris supports Linux (32- and 64-bit) and Mac OS X, Python versions 3.6-3.11.\n\nYou can install prebuilt versions of Atheris with pip:\n\n```bash\npip3 install atheris\n```\n\nThese wheels come with a built-in libFuzzer, which is fine for fuzzing Python\ncode. If you plan to fuzz native extensions, you may need to build from source\nto ensure the libFuzzer version in Atheris matches your Clang version.\n\n### Building from Source\n\nAtheris relies on libFuzzer, which is distributed with Clang. 
If you have a sufficiently new version of `clang` on your path, installation from source is as simple as:\n\n```bash\n# Build latest release from source\npip3 install --no-binary atheris atheris\n# Build development code from source\ngit clone https://github.com/google/atheris.git\ncd atheris\npip3 install .\n```\n\nIf you don't have `clang` installed or it's too old, you'll need to download and build the latest version of LLVM. Follow the instructions in Installing Against New LLVM below.\n\n#### Mac\n\nApple Clang doesn't come with libFuzzer, so you'll need to install a new version of LLVM from head. Follow the instructions in Installing Against New LLVM below.\n\n#### Installing Against New LLVM\n\n```bash\n# Building LLVM\ngit clone https://github.com/llvm/llvm-project.git\ncd llvm-project\nmkdir build\ncd build\ncmake -DLLVM_ENABLE_PROJECTS='clang;compiler-rt' -G \"Unix Makefiles\" ../llvm\nmake -j 10  # This step is very slow\n\n# Installing Atheris\nCLANG_BIN=\"$(pwd)/bin/clang\" pip3 install \u003cwhatever\u003e\n```\n\n## Using Atheris\n\n### Example\n\n```python\n#!/usr/bin/python3\n\nimport atheris\n\nwith atheris.instrument_imports():\n  import some_library\n  import sys\n\ndef TestOneInput(data):\n  some_library.parse(data)\n\natheris.Setup(sys.argv, TestOneInput)\natheris.Fuzz()\n```\n\nWhen fuzzing Python, Atheris will report a failure if the Python code under test throws an uncaught exception.\n\n### Python coverage\n\nAtheris collects Python coverage information by instrumenting bytecode.\nThere are 3 options for adding this instrumentation to the bytecode:\n\n - You can instrument the libraries you import:\n\n   ```python\n   with atheris.instrument_imports():\n     import foo\n     from bar import baz\n   ```\n   This will cause instrumentation to be added to `foo` and `bar`, as well as\n   any libraries they import.\n - Or, you can instrument individual functions:\n\n   ```python\n   @atheris.instrument_func\n   def my_function(foo, bar):\n     
print(\"instrumented\")\n   ```\n - Or finally, you can instrument everything:\n\n   ```python\n   atheris.instrument_all()\n   ```\n   Put this right before `atheris.Setup()`. This will find every Python function\n   currently loaded in the interpreter, and instrument it.\n   This might take a while.\n\nAtheris can additionally instrument regular expression checks, e.g. `re.search`.\nTo enable this feature, add\n`atheris.enabled_hooks.add(\"RegEx\")`\nto your script before your code calls `re.compile`.\nInternally this will import the `re` module and instrument the necessary functions.\nThis is currently an experimental feature.\n\nSimilarly, Atheris can instrument str methods; currently only `str.startswith`\nand `str.endswith` are supported. To enable this feature, add\n`atheris.enabled_hooks.add(\"str\")`. This is currently an experimental feature.\n\n#### Why am I getting \"No interesting inputs were found\"?\n\nYou might see this error:\n\n```\nERROR: no interesting inputs were found. Is the code instrumented for coverage? Exiting.\n```\n\nYou'll get this error if the first 2 calls to `TestOneInput` didn't produce any\ncoverage events. Even if you have instrumented some Python code,\nthis can happen if the instrumentation isn't reached in those first 2 calls\n(for example, because you have a nontrivial `TestOneInput`). You can resolve\nthis by adding an `atheris.instrument_func` decorator to `TestOneInput`,\nusing `atheris.instrument_all()`, or moving your `TestOneInput` function into an\ninstrumented module.\n\n\n### Visualizing Python code coverage\nExamining which lines are executed is helpful for understanding the\neffectiveness of your fuzzer. Atheris is compatible with\n[`coverage.py`](https://coverage.readthedocs.io/): you can run your fuzzer using\nthe `coverage.py` module as you would for any other Python program. 
Here's an\nexample:\n\n```bash\npython3 -m coverage run your_fuzzer.py -atheris_runs=10000  # Times to run\npython3 -m coverage html\n(cd htmlcov \u0026\u0026 python3 -m http.server 8000)\n```\n\nCoverage reports are only generated when your fuzzer exits gracefully. This\nhappens if:\n\n - you specify `-atheris_runs=\u003cnumber\u003e`, and that many runs have elapsed.\n - your fuzzer exits by Python exception.\n - your fuzzer exits by `sys.exit()`.\n\nNo coverage report will be generated if your fuzzer exits due to a\ncrash in native code, or due to libFuzzer's `-runs` flag (use `-atheris_runs`).\nIf your fuzzer exits via other methods, such as SIGINT (Ctrl+C), Atheris will\nattempt to generate a report but may be unable to (depending on your code).\nFor consistent reports, we recommend always using\n`-atheris_runs=\u003cnumber\u003e`.\n\nIf you'd like to examine coverage when running with your corpus, you can do\nthat with the following command:\n\n```\npython3 -m coverage run your_fuzzer.py corpus_dir/* -atheris_runs=$(( 1 + $(ls corpus_dir | wc -l) ))\n```\n\nThis will cause Atheris to run on each file in `corpus_dir`, then exit.\nNote: Atheris uses an empty input as the first test case even if there is no empty file in `corpus_dir`, hence the `1 +` in the run count.\nImportantly, if you leave off the `-atheris_runs` argument, no\ncoverage report will be generated.\n\nUsing coverage.py will significantly slow down your fuzzer, so only use it for\nvisualizing coverage; don't use it all the time.\n\n### Fuzzing Native Extensions\n\nIn order for fuzzing native extensions to be effective, your native extensions\nmust be instrumented. See [Native Extension Fuzzing](https://github.com/google/atheris/blob/master/native_extension_fuzzing.md)\nfor instructions.\n\n### Structure-aware Fuzzing\n\nAtheris is based on a coverage-guided mutation-based fuzzer (LibFuzzer). 
This\nhas the advantage of not requiring any grammar definition for generating inputs,\nmaking its setup easier. The disadvantage is that it will be harder for the\nfuzzer to generate inputs for code that parses complex data types. Often the\ninputs will be rejected early, resulting in low coverage.\n\nAtheris supports custom mutators\n[(as offered by LibFuzzer)](https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md)\nto produce grammar-aware inputs.\n\nExample (Atheris-equivalent of the\n[example in the LibFuzzer docs](https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md#example-compression)):\n\n```python\n@atheris.instrument_func\ndef TestOneInput(data):\n  try:\n    decompressed = zlib.decompress(data)\n  except zlib.error:\n    return\n\n  if len(decompressed) \u003c 2:\n    return\n\n  try:\n    if decompressed.decode() == 'FU':\n      raise RuntimeError('Boom')\n  except UnicodeDecodeError:\n    pass\n```\n\nTo reach the `RuntimeError` crash, the fuzzer needs to be able to produce inputs\nthat are valid compressed data and satisfy the checks after decompression.\nIt is very unlikely that Atheris will be able to produce such inputs: mutations\non the input data will most probably result in invalid data that will fail at\ndecompression-time.\n\nTo overcome this issue, you can define a custom mutator function (equivalent to\n`LLVMFuzzerCustomMutator`).\nThis example produces valid compressed data. 
To enable Atheris to make use of\nit, pass the custom mutator function to the invocation of `atheris.Setup`.\n\n```python\ndef CustomMutator(data, max_size, seed):\n  try:\n    decompressed = zlib.decompress(data)\n  except zlib.error:\n    decompressed = b'Hi'\n  else:\n    decompressed = atheris.Mutate(decompressed, len(decompressed))\n  return zlib.compress(decompressed)\n\natheris.Setup(sys.argv, TestOneInput, custom_mutator=CustomMutator)\natheris.Fuzz()\n```\n\nAs seen in the example, the custom mutator may request Atheris to mutate data\nusing `atheris.Mutate()` (this is equivalent to `LLVMFuzzerMutate`).\n\nYou can experiment with [custom_mutator_example.py](example_fuzzers/custom_mutator_example.py)\nand see that without the mutator Atheris would not be able to find the crash,\nwhile with the mutator this is achieved in a matter of seconds.\n\n```shell\n$ python3 example_fuzzers/custom_mutator_example.py --no_mutator\n[...]\n#2      INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 37Mb\n#524288 pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 262144 rss: 37Mb\n#1048576        pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 349525 rss: 37Mb\n#2097152        pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 299593 rss: 37Mb\n#4194304        pulse  cov: 2 ft: 2 corp: 1/1b lim: 4096 exec/s: 279620 rss: 37Mb\n[...]\n\n$ python3 example_fuzzers/custom_mutator_example.py\n[...]\nINFO: found LLVMFuzzerCustomMutator (0x7f9c989fb0d0). 
Disabling -len_control by default.\n[...]\n#2      INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 37Mb\n#3      NEW    cov: 4 ft: 4 corp: 2/11b lim: 4096 exec/s: 0 rss: 37Mb L: 10/10 MS: 1 Custom-\n#12     NEW    cov: 5 ft: 5 corp: 3/21b lim: 4096 exec/s: 0 rss: 37Mb L: 10/10 MS: 7 Custom-CrossOver-Custom-CrossOver-Custom-ChangeBit-Custom-\n === Uncaught Python exception: ===\nRuntimeError: Boom\nTraceback (most recent call last):\n  File \"example_fuzzers/custom_mutator_example.py\", line 62, in TestOneInput\n    raise RuntimeError('Boom')\n[...]\n```\n\nCustom crossover functions (equivalent to `LLVMFuzzerCustomCrossOver`) are also\nsupported. You can pass the custom crossover function to the invocation of\n`atheris.Setup`. See its usage in [custom_crossover_fuzz_test.py](src/custom_crossover_fuzz_test.py).\n\n#### Structure-aware Fuzzing with Protocol Buffers\n\n[libprotobuf-mutator](https://github.com/google/libprotobuf-mutator) has\nbindings to use it together with Atheris to perform structure-aware fuzzing\nusing protocol buffers.\n\nSee the documentation for\n[atheris_libprotobuf_mutator](contrib/libprotobuf_mutator/README.md).\n\n## Integration with OSS-Fuzz\n\nAtheris is fully supported by [OSS-Fuzz](https://github.com/google/oss-fuzz), Google's continuous fuzzing service for open source projects. 
For integrating with OSS-Fuzz, please see [https://google.github.io/oss-fuzz/getting-started/new-project-guide/python-lang](https://google.github.io/oss-fuzz/getting-started/new-project-guide/python-lang).\n\n## API\n\nThe `atheris` module provides three key functions: `instrument_imports()`, `Setup()` and `Fuzz()`.\n\nIn your source file, import all libraries you wish to fuzz inside a `with atheris.instrument_imports():`-block, like this:\n\n```python\n# library_a will not get instrumented\nimport library_a\n\nwith atheris.instrument_imports():\n    # library_b will get instrumented\n    import library_b\n```\n\nGenerally, it's best to import `atheris` first and then import all other libraries inside of a `with atheris.instrument_imports()` block.\n\nNext, define a fuzzer entry point function and pass it to `atheris.Setup()` along with the fuzzer's arguments (typically `sys.argv`). Finally, call `atheris.Fuzz()` to start fuzzing. You must call `atheris.Setup()` before `atheris.Fuzz()`.\n\n#### `instrument_imports(include=[], exclude=[])`\n- `include`: A list of fully-qualified module names that shall be instrumented.\n- `exclude`: A list of fully-qualified module names that shall NOT be instrumented.\n\nThis should be used together with a `with`-statement. All modules imported in\nsaid statement will be instrumented. However, because Python imports all modules\nonly once, this cannot be used to instrument any previously imported module,\nincluding modules required by Atheris. To add coverage to those modules, use\n`instrument_all()` instead.\n\nA full list of unsupported modules can be retrieved as follows:\n\n```python\nimport sys\nimport atheris\nprint(sys.modules.keys())\n```\n\n\n#### `instrument_func(func)`\n - `func`: The function to instrument.\n\nThis will instrument the specified Python function and then return `func`. This\nis typically used as a decorator, but can be used to instrument individual\nfunctions too. 
Note that the `func` is instrumented in-place, so this will\naffect all call points of the function.\n\nThis cannot be called on a bound method - call it on the unbound version.\n\n#### `instrument_all()`\n\nThis will scan over all objects in the interpreter and call `instrument_func` on\nevery Python function. This works even on core Python interpreter functions,\nsomething which `instrument_imports` cannot do.\n\nThis function is experimental.\n\n\n#### `Setup(args, test_one_input, internal_libfuzzer=None)`\n - `args`: A list of strings: the process arguments to pass to the fuzzer, typically `sys.argv`. This argument list may be modified in-place, to remove arguments consumed by the fuzzer.\n   See [the LibFuzzer docs](https://llvm.org/docs/LibFuzzer.html#options) for a list of such options.\n - `test_one_input`: your fuzzer's entry point. Must take a single `bytes` argument. This will be repeatedly invoked with a single bytes container.\n - `internal_libfuzzer`: Indicates whether libfuzzer will be provided by atheris or by an external library (see [native_extension_fuzzing.md](./native_extension_fuzzing.md)). If unspecified, Atheris will determine this\n   automatically. If fuzzing pure Python, leave this as `True`.\n\n#### `Fuzz()`\n\nThis starts the fuzzer. You must have called `Setup()` before calling this function. This function does not return.\n\nIn many cases `Setup()` and `Fuzz()` could be combined into a single function, but they are\nseparated because you may want the fuzzer to consume the command-line arguments it handles\nbefore passing any remaining arguments to another setup function.\n\n#### `FuzzedDataProvider`\n\nOften, a `bytes` object is not convenient input to your code being fuzzed. 
Similar to libFuzzer, we provide a FuzzedDataProvider to translate these bytes into other input forms.\n\nYou can construct the FuzzedDataProvider with:\n\n```python\nfdp = atheris.FuzzedDataProvider(input_bytes)\n```\n\nThe FuzzedDataProvider then supports the following functions:\n\n```python\ndef ConsumeBytes(count: int)\n```\nConsume `count` bytes.\n\n\n```python\ndef ConsumeUnicode(count: int)\n```\n\nConsume unicode characters. Might contain surrogate pair characters, which according to the specification are invalid in this situation. However, many core software tools (e.g. Windows file paths) support them, so other software often needs to handle them as well.\n\n```python\ndef ConsumeUnicodeNoSurrogates(count: int)\n```\n\nConsume unicode characters, but never generate surrogate pair characters.\n\n```python\ndef ConsumeString(count: int)\n```\n\nAlias for `ConsumeBytes` in Python 2, or `ConsumeUnicode` in Python 3.\n\n```python\ndef ConsumeInt(bytes: int)\n```\n\nConsume a signed integer of the specified size in bytes (when written in two's complement notation).\n\n```python\ndef ConsumeUInt(bytes: int)\n```\n\nConsume an unsigned integer of the specified size in bytes.\n\n```python\ndef ConsumeIntInRange(min: int, max: int)\n```\n\nConsume an integer in the range [`min`, `max`].\n\n```python\ndef ConsumeIntList(count: int, bytes: int)\n```\n\nConsume a list of `count` integers of `bytes` bytes each.\n\n```python\ndef ConsumeIntListInRange(count: int, min: int, max: int)\n```\n\nConsume a list of `count` integers in the range [`min`, `max`].\n\n```python\ndef ConsumeFloat()\n```\n\nConsume an arbitrary floating-point value. 
Might produce weird values like `NaN` and `Inf`.\n\n```python\ndef ConsumeRegularFloat()\n```\n\nConsume an arbitrary numeric floating-point value; never produces a special type like `NaN` or `Inf`.\n\n```python\ndef ConsumeProbability()\n```\n\nConsume a floating-point value in the range [0, 1].\n\n```python\ndef ConsumeFloatInRange(min: float, max: float)\n```\n\nConsume a floating-point value in the range [`min`, `max`].\n\n```python\ndef ConsumeFloatList(count: int)\n```\n\nConsume a list of `count` arbitrary floating-point values. Might produce weird values like `NaN` and `Inf`.\n\n```python\ndef ConsumeRegularFloatList(count: int)\n```\n\nConsume a list of `count` arbitrary numeric floating-point values; never produces special types like `NaN` or `Inf`.\n\n```python\ndef ConsumeProbabilityList(count: int)\n```\n\nConsume a list of `count` floats in the range [0, 1].\n\n```python\ndef ConsumeFloatListInRange(count: int, min: float, max: float)\n```\n\nConsume a list of `count` floats in the range [`min`, `max`]\n\n```python\ndef PickValueInList(l: list)\n```\n\nGiven a list, pick a random value\n\n```python\ndef ConsumeBool()\n```\n\nConsume either `True` or `False`.\n","funding_links":[],"categories":["Python","Property Based Testing"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle%2Fatheris","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle%2Fatheris","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle%2Fatheris/lists"}