{"id":28334660,"url":"https://github.com/permutationlock/aven-cfmt","last_synced_at":"2025-06-17T08:31:36.386Z","repository":{"id":294318850,"uuid":"986597070","full_name":"permutationlock/aven-cfmt","owner":"permutationlock","description":"A C source code formatter","archived":false,"fork":false,"pushed_at":"2025-05-28T11:19:43.000Z","size":230,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-03T19:54:47.844Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit-0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/permutationlock.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.MIT","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-19T21:08:06.000Z","updated_at":"2025-05-28T11:19:46.000Z","dependencies_parsed_at":"2025-05-27T19:32:30.735Z","dependency_job_id":null,"html_url":"https://github.com/permutationlock/aven-cfmt","commit_stats":null,"previous_names":["permutationlock/aven-cfmt"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/permutationlock/aven-cfmt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/permutationlock%2Faven-cfmt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/permutationlock%2Faven-cfmt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/permutationlock%2Faven-cfmt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/permutationlock%2Faven-cfmt/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/permutationlock","download_url":"https://codeload.github.com/permutationlock/aven-cfmt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/permutationlock%2Faven-cfmt/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260321906,"owners_count":22991720,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-26T21:28:04.665Z","updated_at":"2025-06-17T08:31:36.378Z","avatar_url":"https://github.com/permutationlock.png","language":"C","readme":"# The Aven C source code formatter\n\nThis repository contains a C source code lexer, parser, and AST\nrenderer in `aven/c.h`. All three are combined into a source code formatting\napplication in `src/aven-cfmt.c`.\n\n## Limitations\n\nThe formatter is designed to parse code that follows the C99 or C11 standard\nwith some GNU extensions. It will parse GNU inline assembly and attribute\nspecifiers. It will also parse C++ `extern \"C\"` blocks to allow for \"universal\" header files.\n\nThe formatter does not perform semantic analysis or try to follow include directives. Therefore\nit is impossible to handle all possible uses of the\nC preprocessor. Instead, common uses of macros are parsed directly into the AST.\n\n### Macro definitions\n\nMacro `#define A` and `#define A(...)` directives are each parsed into a\nseparate AST. 
Each macro AST is rendered when the corresponding portion of the\nprimary AST is rendered.\nA macro definition may be followed by a single token, a type name, an\ninitializer list, a list of declaration specifiers, or by an expression statement,\na `do` statement, or a declaration, omitting the terminating `;`.\nThe special `#` and `##` operators are allowed in preprocessor code.\n\n### Macro invocations\n\nA macro invocation is an identifier followed by an optional parenthesised parameter\nlist of assignment expressions and type names. Macro invocations may appear in\ntype names, declaration specifiers, and compound string literals. In addition, a\npostfix parenthesis expression is allowed to contain type names in its parameter\nlist to allow type-parameterised macro invocations within expressions.\n\n### Conditional directives\n\nConditional directives (`#if`, `#else`, `#endif`, `#ifdef`, etc.)\nare parsed and rendered in exactly the same way as macro definitions.\nIn the primary translation unit AST, conditional directives are simply ignored.\nTherefore a source file must still be parseable when all\npreprocessor directive lines are removed. E.g. the following is invalid\naccording to `aven-cfmt`.\n```C\n#ifdef A\nint foo(int n) {\n#else\nint bar(int n){\n#endif\n    // body\n}\n```\n\n### Other directives\n\nInclude directives (`#include`) are parsed and pretty printed, but\nall other directives (`#error`, `#warning`, `#pragma`, etc.) are treated as\ncomments and rendered unmodified from source. The `_Pragma(\"...\")` variety\nof pragma directive is not allowed.\n\n### Character sets and white space\n\nSource files are assumed to be ascii or utf8 encoded. Each source file is\nverified to be valid utf8 prior to tokenization.\nThe tokenizer only allows non-ascii codepoints in comments, character\nconstants, and string literals. 
Each utf8 codepoint is counted as one column\nduring rendering.\n\nThe only whitespace characters that will be rendered are spaces, newlines, and tabs.\nWindows `\\r\\n` line endings will be parsed the same as `\\n`, and the formatter\nwill render all line endings as `\\n`.\nCarriage return characters `\\r` are illegal outside of line endings. All indents\nare rendered with spaces; tabs will only be\nrendered if they appear within comments or string literals.\n\n### Parse depth\n\nA combinatorial explosion of backtracking can\noccur with some pathological inputs. Fixing the underlying issue proved\ntoo tricky for the time being, so to solve this (and satisfy the almighty fuzzer)\nI simply placed a limit on the depth of the parse tree.\nIf valid source code contains extremely long expressions or\nif-else chains, then the `--depth N` command line\nargument or the `// aven cfmt depth: N` control comment may be used to expand the limit.\n\n### Is this too restrictive?\n\nIt works for my code, but a surprisingly large portion of the repos I keep cloned on my machine is\nformattable out of the box as well. For example, `aven-cfmt --columns 0`\nwill accept most [musl][4], [Raylib][5], and [GLFW][6] source files. The files it refuses to format\ncontain either C++ code, unsupported macros (not using `do` statements, including\nterminating `;`), or extensions (MSVC\n`(__stdcall *fn)` declarators). 
It would be simple to modify such files\nto comply, but, of course, re-formatting code from well established projects is\nnot a goal of `aven-cfmt`.\n\n```Shell\n$ # aven-cfmt all Raylib .c files, only show errors\n$ for i in raylib/src/*.c; do echo $i \u0026\u0026 aven-cfmt --columns 128 $i \u003e /dev/null; done\nraylib/src/raudio.c\nraylib/src/rcore.c\nraylib/src/rglfw.c\nraylib/src/rmodels.c\nerror: expected punctuator '(':\n5247:9:         int n = 0; \\\n                ^\nraylib/src/rshapes.c\nraylib/src/rtext.c\nraylib/src/rtextures.c\nraylib/src/utils.c\n```\n```Shell\n$ # count number of files in musl-1.2.5/src/stdio\n$ ls -1 musl-1.2.5/src/stdio/ | wc -l\n     118\n$ # aven-cfmt all stdio files, only show errors\n$ for i in musl-1.2.5/src/stdio/*; do \\\n    echo $i \u0026\u0026 aven-cfmt $i; \\\ndone 2\u003e\u00261 | grep \"error:\" -B1 -A2\nmusl-1.2.5/src/stdio/vfprintf.c\nerror: expected identifier:\n47:14: #define S(x) [(x)-'A']\n                    ^\n--\nmusl-1.2.5/src/stdio/vfwprintf.c\nerror: expected identifier:\n40:14: #define S(x) [(x)-'A']\n                    ^\n```\n\n## Building\n\nEnsure that you have pulled the `libaven` submodule dependency.\n```Shell\n$ git submodule init\n$ git submodule update\n```\nTo build the project with your favorite C compiler `cc`, run\n```Shell\n$ cc -I deps/libaven/include -I include/ -o aven-cfmt src/aven-cfmt.c\n```\nThe project also provides a build system written in C.\nTo build the build system you can either use `make` or\n```Shell\n$ cc -o build build.c\n```\nTo build `aven-cfmt`, run `./build`.\nThe resulting binary will be located in the `build_out` directory.\nFlags may be specified with `--ccflags` and `--ldflags`\n```Shell\n$ ./build --ccflags \"-O3 -march=native -g0\" --ldflags \"-O3 -g0\"\n```\nTo run the test suite, run `./build test`.\nTo clean up all build artifacts, run `./build clean`.\nTo see a full list of available build system flags, run\n`./build help`.\n\n## Usage\n\nThe default 
behavior is to read from the specified `src_file` and write to `stdout`.\n```Shell\n$ aven-cfmt unformatted.c \u003e formatted.c\n```\nA full list of command line options is available in the help message.\n```Shell\n$ aven-cfmt --help\noverview: Aven C Formatter\nusage: aven-cfmt [src_file] [options]\nconfigure:\n    comments at the top of files can configure options\n        // aven cfmt columns: 128\n        // aven cfmt indent: 8\n        // aven cfmt depth: 0\n    or disable formatting\n        // aven cfmt disable\noptions:\n    --out \"str\"           output file (optional)\n    --stdin [false]       read from stdin (default=false)\n    --in-place [false]    format src_file in-place (default=false)\n    --columns N           column width, 0 for no limit (default=80)\n    --indent N            indent width (default=4)\n    --depth N             parse depth, 0 for no limit (default=40)\n    --help                show this  message\n```\n\n### In-editor formatting\n\nAs `aven-cfmt --stdin` works in the same way as `clang-format`, `astyle`, and\n`zig fmt --stdin`, it should be simple to configure it for use with any editor\nthat supports those formatters.\n\nIf you use [helix][1] like I do, you can format on save with `aven-cfmt` by adding the following\nlines to your `languages.toml` file.\n```TOML\n# ...\n[[language]]\nname = \"c\"\nformatter = { command = \"aven-cfmt\", args = [ \"--stdin\" ] }\nauto-format = true\n# ...\n```\nIf you use a (neo)vim variant, then you can format the active buffer with\n`%!aven-cfmt --stdin`. 
With this basic command the buffer will be overwritten with the\n`stderr` error message if a parse or render error occurs, but that is simple to undo.\n\n## Performance\n\nMy benchmarks show that `aven-cfmt` formats at ~30-40MB/sec on my Intel N100 mini pc.\n```Shell\n$ lscpu | grep \"Model name\"\nModel name:                           Intel(R) N100\n$ ./build --ccflags \"-O3\" --ldflags \"\"\nclang -O3 -I deps/libaven/include -I ./include -c -o build_out/aven-cfmt.o ./src/aven-cfmt.c\nclang -o build_out/aven-cfmt build_out/aven-cfmt.o\nrm build_out/aven-cfmt.o\n$ poop \"clang-format ../raylib/src/rcore.c\" \"astyle --style=google --stdin=../raylib/src/rcore.c\" \"./build_out/aven-cfmt --columns 128 ../raylib/src/rcore.c\"\nBenchmark 1 (10 runs): clang-format ../raylib/src/rcore.c\n  measurement          mean ± σ            min … max           outliers         delta\n  wall_time           527ms ± 4.34ms     519ms …  531ms          0 ( 0%)        0%\n  peak_rss           94.3MB ±  154KB    94.1MB … 94.6MB          0 ( 0%)        0%\n  cpu_cycles         1.70G  ± 6.84M     1.69G  … 1.71G           0 ( 0%)        0%\n  instructions       3.88G  ± 1.02M     3.88G  … 3.88G           0 ( 0%)        0%\n  cache_references   39.9M  ±  395K     39.5M  … 40.7M           0 ( 0%)        0%\n  cache_misses       13.3M  ±  530K     12.6M  … 14.1M           0 ( 0%)        0%\n  branch_misses      7.76M  ± 30.6K     7.71M  … 7.81M           0 ( 0%)        0%\nBenchmark 2 (152 runs): astyle --style=google --stdin=../raylib/src/rcore.c\n  measurement          mean ± σ            min … max           outliers         delta\n  wall_time          33.0ms ±  626us    32.1ms … 38.5ms          6 ( 4%)        ⚡- 93.7% ±  0.1%\n  peak_rss           3.55MB ±  109KB    3.31MB … 3.79MB          0 ( 0%)        ⚡- 96.2% ±  0.1%\n  cpu_cycles          106M  ±  726K      105M  …  110M           6 ( 4%)        ⚡- 93.8% ±  0.1%\n  instructions        282M  ± 56.6K      282M  …  283M           1 ( 1%)  
      ⚡- 92.7% ±  0.0%\n  cache_references   31.7K  ± 6.92K     18.3K  … 61.9K           1 ( 1%)        ⚡- 99.9% ±  0.2%\n  cache_misses       7.76K  ± 2.18K     3.93K  … 19.5K           9 ( 6%)        ⚡- 99.9% ±  0.6%\n  branch_misses       515K  ± 16.9K      500K  …  665K          11 ( 7%)        ⚡- 93.4% ±  0.1%\nBenchmark 3 (1023 runs): ./build_out/aven-cfmt --columns 128 ../raylib/src/rcore.c\n  measurement          mean ± σ            min … max           outliers         delta\n  wall_time          4.86ms ±  623us    4.31ms … 8.00ms        112 (11%)        ⚡- 99.1% ±  0.1%\n  peak_rss           1.87MB ± 59.4KB    1.72MB … 1.97MB          0 ( 0%)        ⚡- 98.0% ±  0.0%\n  cpu_cycles         12.8M  ±  225K     12.3M  … 14.5M           6 ( 1%)        ⚡- 99.2% ±  0.0%\n  instructions       31.3M  ±  212      31.3M  … 31.3M           0 ( 0%)        ⚡- 99.2% ±  0.0%\n  cache_references   9.62K  ± 4.19K     5.41K  … 61.9K          17 ( 2%)        ⚡-100.0% ±  0.1%\n  cache_misses       1.45K  ±  629      1.04K  … 16.6K          14 ( 1%)        ⚡-100.0% ±  0.2%\n  branch_misses       136K  ± 5.67K      121K  …  150K           0 ( 0%)        ⚡- 98.2% ±  0.1%\n```\nI compiled a release build of `astyle` from upstream source, but the `clang-format` binary was from\nmy package manager.\nThis [poop][2] benchmark was only provided to show that, due to its simplicity, `aven-cfmt`\nseems to be very fast, even though I did very little deliberate optimization.\n\n## Fuzzing\n\nThe repo contains a basic [libfuzzer][3] fuzzing setup. 
If you have `clang` installed, then\nthe fuzzer can be compiled and run with\n```Shell\n$ ./clang_fuzz.sh\n```\nThe fuzzer runs indefinitely, halting upon encountering a crash, failed assert, sanitizer trap, or\nan input that takes longer than 2 seconds to parse and render.\n\n[1]: https://helix-editor.com/\n[2]: https://github.com/andrewrk/poop\n[3]: https://llvm.org/docs/LibFuzzer.html\n[4]: https://musl.libc.org/\n[5]: https://www.raylib.com/\n[6]: https://www.glfw.org/\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpermutationlock%2Faven-cfmt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpermutationlock%2Faven-cfmt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpermutationlock%2Faven-cfmt/lists"}