{"id":15107768,"url":"https://github.com/vkcom/nocc","last_synced_at":"2025-04-12T18:49:35.196Z","repository":{"id":62105239,"uuid":"491079626","full_name":"VKCOM/nocc","owner":"VKCOM","description":"A distributed C++ compiler: like distcc, but faster","archived":false,"fork":false,"pushed_at":"2025-02-06T16:39:00.000Z","size":1189,"stargazers_count":279,"open_issues_count":13,"forks_count":13,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-04-03T22:08:28.684Z","etag":null,"topics":["compiler","distcc","kphp"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VKCOM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-11T11:23:55.000Z","updated_at":"2025-03-28T19:23:53.000Z","dependencies_parsed_at":"2024-06-26T14:33:49.502Z","dependency_job_id":"aa10fffd-e760-4672-9dda-9816af73dff9","html_url":"https://github.com/VKCOM/nocc","commit_stats":{"total_commits":11,"total_committers":3,"mean_commits":"3.6666666666666665","dds":"0.18181818181818177","last_synced_commit":"b1f7bb3903322fab16412d2d332358264ce2a290"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VKCOM%2Fnocc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VKCOM%2Fnocc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VKCOM%2Fnocc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VKCOM%
2Fnocc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VKCOM","download_url":"https://codeload.github.com/VKCOM/nocc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248618218,"owners_count":21134199,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compiler","distcc","kphp"],"created_at":"2024-09-25T21:41:33.330Z","updated_at":"2025-04-12T18:49:35.173Z","avatar_url":"https://github.com/VKCOM.png","language":"Go","readme":"# nocc — a distributed C++ compiler\n\n`nocc` propagates a compiler invocation to a remote machine: `nocc g++ 1.cpp` calls `g++` remotely, not locally.\n\n`nocc` speeds up compilation of large C++ projects: when you have multiple remotes, tons of local jobs are parallelized between them.\n\nBut its most significant effort is greatly speeding up re-compilation across build agents in CI/CD and across developers working on the same project: \nthey use shared remote caches. \nOnce a cpp file has been compiled, the resulting obj is used by other agents without launching compilation, actually.\n\n`nocc` easily integrates into any build system, since a build system should only prefix executing commands.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## The reason why nocc was created\n\n`nocc` was created at VK.com to speed up KPHP compilation.\n[KPHP](https://github.com/VKCOM/kphp) is a PHP compiler: it converts PHP sources to C++. 
\nThe VK.com codebase is huge: for now, we have about **150 000** autogenerated cpp files.\n\nOur goal was to greatly improve the performance of the *\"C++ → binary\"* step.\n\nSince 2014, we had been using [distcc](https://github.com/distcc/distcc).   \nIn 2019, we patched distcc to support precompiled headers. That gave us a 5x performance gain.   \nIn 2021, we decided to implement a distcc replacement. In the end, we got a further 2x to 9x speedup over the patched version.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## Installation and configuration\n\nThe easiest way is to download prebuilt binaries: proceed to the [releases page](https://github.com/VKCOM/nocc/releases)\nand download the latest `.tar.gz` for your system. You'll have 3 binaries after extracting.\n\nYou can also compile `nocc` from sources: see the [installation page](./docs/installation.md).\n\nFor a test launch (to make sure that everything works), proceed to [this section](./docs/installation.md#run-a-simple-example-locally).\n\nFor a list of command-line arguments and environment variables, visit the [configuration page](./docs/configuration.md).\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## How does nocc work\n\nConsider the following file named `1.cpp`:\n\n```cpp\n#include \"1.h\"\n\nint square(int a) { \n  return a * a; \n}\n```\n\nwhere `1.h` is just\n\n```cpp\nint square(int a);\n```\n\nWhen you run `nocc g++ 1.cpp -o 1.o -c`, the compilation is done remotely:\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"docs/img/nocc-one-file.drawio.png\" alt=\"one file\" height=\"201\"\u003e\n\u003c/p\u003e\n\nWhat's actually happening here:\n\n* `nocc` parses the command-line invocation: input files, include dirs, cxx flags, etc.\n* for an input file (`1.cpp`), `nocc` finds all dependencies: it traverses all `#include` recursively (which results in just one file `1.h` here)\n* `nocc` uploads files to a server and waits\n* `nocc-server` executes the same command-line (same cxx flags, but modified paths)\n* 
`nocc-server` pushes a compiled object file back\n* `nocc` saves `1.o`, the same as if it had been compiled locally\n\nBesides the object file, `nocc-server` pushes the *exitCode/stdout/stderr* of the C++ compiler: the `nocc` process uses them as its own output.\n\n### In production, you have multiple compilation servers\n\nConceptually, you can think of the working scheme like this:\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"docs/img/nocc-many-files.drawio.png\" alt=\"many files\" height=\"356\"\u003e\n\u003c/p\u003e\n\nLots of `nocc` processes are launched simultaneously, many more than you could launch if you used g++ locally.\n\nEvery `nocc` invocation handles exactly one `.cpp -\u003e .o` compilation: this is by design. \nIt does remote compilation and dies: `nocc` is just a front-end layer between any build system and a real C++ compiler.\n\nFor every invocation, a remote server is chosen, all dependencies are detected, missing dependencies are uploaded,\nand the server streams back a ready obj file. 
\nThis happens in parallel for all command lines.\n\nActually, to be more efficient, all connections are served via one background **nocc-daemon**:\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"docs/img/nocc-daemon.drawio.png\" alt=\"daemon\" height=\"356\"\u003e\n\u003c/p\u003e\n\n`nocc-daemon` is written in Go, whereas `nocc` is a very lightweight C++ wrapper \nwhose only aim is to pipe the command-line to the daemon, wait for the response, and die.\n\nSo, the final working scheme is the following:\n\n1) The very first `nocc` invocation starts `nocc-daemon`:\n   the daemon serves grpc connections and does all the actual work for remote compilation.\n2) Every `nocc` invocation pipes a command-line (`g++ ...`) to the daemon via a Unix socket; the daemon compiles it remotely and\n   writes the resulting .o file, then the `nocc` process dies.\n3) `nocc` jobs start and die: a build system executes and balances them.\n4) `nocc-daemon` dies 15 seconds after `nocc` stops connecting (after the compilation process finishes).\n\nFor more info, see the [nocc architecture page](./docs/architecture.md).\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## nocc is also a remote src/obj cache\n\nThe main idea behind `nocc` is that **the 2nd, the 3rd, the Nth runs are faster than the first**. \nThis holds even if you clean the build directory, even on another machine, even in a renamed folder.\n\nThat's because of remote caches.   \n`nocc` does not upload files if they have already been uploaded: that's the **src cache**.   \n`nocc` does not compile files if they have already been compiled: that's the **obj cache**. \n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"docs/img/nocc-second-run.drawio.png\" alt=\"second run\" height=\"255\"\u003e\n\u003c/p\u003e\n\nSuch an approach dramatically decreases compilation times if your CI has different build machines or your builds start from a fresh copy. 
\nMoreover, git branch switching and merging are also great targets for remote caching.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## nocc and CMake\n\nWhen CMake generates a buildfile for your C++ project, you typically launch the build process with `make` or `ninja`.\nThese build systems launch and balance processes and keep doing it until all C++ files are compiled.\n\nOur goal is to tell CMake to launch `nocc g++` instead of `g++` (or any other C++ compiler). This can be done\nwith `-DCMAKE_CXX_COMPILER_LAUNCHER`:\n\n```bash\ncmake -DCMAKE_CXX_COMPILER_LAUNCHER=/path/to/nocc ..\n```\n\nA build with `make` would then look like this:\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"docs/img/nocc-make%20-j.drawio.png\" alt=\"make\" height=\"301\"\u003e\n\u003c/p\u003e\n\nCMake sometimes invokes the C++ compiler with `-MD/-MT` flags to generate a dependency list. \n`nocc` supports them out of the box: depfiles are generated on the client side.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## nocc and ninja\n\n[Ninja](https://ninja-build.org/) is a build system that integrates easily with CMake as a replacement for `make`.\n\n`nocc` works with `ninja`, but there are 2 points to take care of:\n1. Explicitly set `-j {jobs}` (typically, you don't do this with `ninja`, and it automatically spreads jobs across the machine's CPUs, but we need *{jobs}* to be a huge number).\n2. There is an annoying defect: for some reason, `ninja` incrementally waits for a daemon to die. A workaround is to launch a daemon manually in advance. [Read more](./docs/ninja-problem.md) about this problem.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## nocc and KPHP\n\nOriginally, `nocc` was created to speed up compiling large KPHP projects, with lots of autogenerated C++ files.\nKPHP does not call `make`: it has a build system right inside itself.\n\nTo use `nocc` with KPHP, just set the `KPHP_CXX=nocc g++` environment variable. 
\nThen `nocc` will be used for both C++ compilation and precompiled headers generation.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## Precompiled headers support\n\n`nocc` treats precompiled headers in a special way. When a client command to generate pch is executed,\n```bash\nnocc g++ -x c++-header -o all-headers.h.gch all-headers.h\n```\n\nthen `nocc` emits `all-headers.h.nocc-pch`, whereas `all-headers.h.gch` is **not produced** at all. \nThis is a text file containing all dependencies — compiled on a server-side into a real `.gch/.pch`.\n\nGenerating a `.nocc-pch` file is much faster than generating a real precompiled header, so it's acceptable to call it for every build — anyway, it will be compiled remotely only once.\n\nHere you can [read more](./docs/architecture.md#own-precompiled-headers) about own precompiled headers.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## nocc vs ccache\n\nIt's quite incorrect to compare `nocc` with `ccache`, as `ccache` is not intended to parallelize compilation on remotes.\n`ccache` can speed up compilation performed locally (especially useful when you switch git branches), \nbut when it comes to compiling a huge number of C++ files from scratch, everything is still done locally.\n\n`nocc` also greatly speeds up re-compilation when switching branches. 
But `nocc` does it in a conceptually\ndifferent way: using remote caches.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## nocc vs distcc\n\nBecause `nocc` was designed as a distcc replacement, a detailed analysis of their differences is given on\nthe [compare with distcc page](./docs/compare-with-distcc.md).\n\nThat page includes an architecture overview, some info about patching distcc with pch support, \nand real build times from VK.com production.\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## What makes nocc so fast\n\nThe `nocc` architecture is specifically tuned to be as fast as possible for typical usage scenarios.\n\n* `nocc-daemon` keeps all connections alive, while `nocc` processes start and die during a build\n* to resolve all recursive `#include`, `nocc` does not invoke the preprocessor: it uses its own parser instead\n* `nocc-server` has the src cache: once `1.h` is uploaded by any client, no other clients need to upload this file again (unless it has changed)\n* `nocc-server` has the obj cache: once `1.cpp` is compiled by any client, all other clients receive `1.o` without compilation (if all dependencies and flags match)\n* for a `file.cpp`, one and the same server is chosen every time to make remote caches reusable\n* shared precompiled headers: once `1.gch` is compiled, no other build agents have to do it locally\n\n[**Dig deeper into nocc architecture**](./docs/architecture.md)\n\n\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\n## FAQ\n\n**What are the conditions to make sure that a remote .o file would equal a local .o?**\n\n`nocc` assumes that all remotes have a C++ compiler of exactly the same version as the local one. \nThis ensures that, given identical source files, the result does not depend on where the compilation was launched. \nSince linking is done locally, remotes are not required to have all libs needed for linking. \n\n**What if I #include \u003cre2.h\u003e but it doesn't exist on remote?**\n\nEverything would still work. 
\nWhen `nocc` traverses dependencies, it also finds all system headers recursively; their hash sums are sent to the remote along with the cpp file info. \nIf some system includes are missing (or if they differ from local ones), they are also sent like regular files, \nsaved to a `/tmp` folder that mirrors the client file structure, and discovered via special `-isystem` arguments added to the command-line.\n\n**How does nocc handle linking commands?**\n\nLinking is done locally. All commands that are unsupported or malformed are also executed locally.\n\n**What happens if some servers are unavailable?**\n\nWhen `nocc` tries to compile `1.cpp` remotely, but the server is unavailable, `nocc` falls back to local compilation. \nIt does not try another server; this is [intentional](./docs/architecture.md#local-fallback-queue). \n\n**Does nocc support clang?**\n\nTheoretically, it should make no difference which compiler is being used: `g++`, or `clang++`, or `/usr/bin/c++`, etc.\nEven `.pch` files are supposed to work, as pch compilation is done remotely. \nSmall tests for clang work well, but it hasn't been tested well in production, as we use only `g++` in KPHP and VK.com for now.\n\n**What is the optimal job/server count?**\n\nThe final number that we settled on at VK.com is *\"launch ~20 jobs for one server\"*. \nFor example, we have 32 compilation servers, and we launch ~600 jobs for C++ compilation. \nThis works well both when files are compiled and when they are just taken from the obj cache. 
\nNote that if you use a large number of parallel jobs, you'd probably have to increase `ulimit -n`, \nas `nocc-daemon` reads lots of files and keeps all connections to `nocc` C++ wrappers open simultaneously.\n\n**I get an error \"compiling locally: rpc error: code = Unknown desc = file xxx.cpp was already uploaded, but now got another sha256 from client\"**\n\nThis error occurs in the following scenario: you compile a file, then quickly modify it and launch compilation again, while the previous `nocc-daemon` is still running and the previous file structure is still mapped to servers. In that case, the compilation of that file is done locally. In practice, this error almost never occurs, as big projects take some time for linking/finalization after compilation (a daemon dies in 15 seconds).\n\n**Why did you name this tool \"nocc\"?**\n\nWe already have a PHP linter named [noverify](https://github.com/VKCOM/noverify) \nand an architecture validation tool [nocolor](https://github.com/VKCOM/nocolor). \nThat's why \"nocc\": just because I like such naming :)\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvkcom%2Fnocc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvkcom%2Fnocc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvkcom%2Fnocc/lists"}