{"id":16812159,"url":"https://github.com/romange/gaia","last_synced_at":"2025-08-31T10:40:43.073Z","repository":{"id":27557952,"uuid":"114404005","full_name":"romange/gaia","owner":"romange","description":"C++ framework for rapid server development","archived":false,"fork":false,"pushed_at":"2024-02-07T06:59:00.000Z","size":5253,"stargazers_count":77,"open_issues_count":7,"forks_count":14,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-04-02T03:12:58.874Z","etag":null,"topics":["abseil","asio","async","backend","cpp14","fibers","server"],"latest_commit_sha":null,"homepage":"https://romange.github.io/gaia/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/romange.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-12-15T19:20:34.000Z","updated_at":"2024-12-11T17:56:42.000Z","dependencies_parsed_at":"2024-10-27T11:58:01.914Z","dependency_job_id":"09a5b8f8-93f1-4f82-995d-f9051c49127b","html_url":"https://github.com/romange/gaia","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/romange/gaia","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romange%2Fgaia","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romange%2Fgaia/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romange%2Fgaia/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romange%2Fgaia/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/owners/romange","download_url":"https://codeload.github.com/romange/gaia/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/romange%2Fgaia/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272971420,"owners_count":25024093,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["abseil","asio","async","backend","cpp14","fibers","server"],"created_at":"2024-10-13T10:20:54.256Z","updated_at":"2025-08-31T10:40:43.048Z","avatar_url":"https://github.com/romange.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gaia - rapid backend development framework in C++\n\n[![Build Status](https://travis-ci.org/romange/gaia.svg?branch=master)](https://travis-ci.org/romange/gaia)\n=====\n\nGaia is a set of libraries and a C++ environment for efficient and rapid development\nin C++14 on Linux systems. The focus is mostly on backend development, data processing, etc.\n\n\n1. Dependency on [abseil-cpp](https://github.com/abseil/abseil-cpp/)\n2. Dependency on [Boost 1.69](https://www.boost.org/doc/libs/1_69_0/doc/html/)\n3. Uses ninja-build on top of cmake\n4. Build artifacts are docker-friendly.\n5. Generic RPC implementation.\n6. HTTP server implementation.\n7. 
Many other features.\n\n\nI will gradually add explanations for the most crucial blocks in this library.\n\n\n## Setting Up \u0026 Building\n1. abseil is integrated as a submodule. To fetch it, run:\n\n       git submodule update --init --recursive\n\n2. Building using docker.\n   Short version - running the asio_fibers example.\n   There is no need to install dependencies on the host machine.\n\n    ```bash\n       \u003e docker build -t asio_fibers -f docker/bin_build.Dockerfile --build-arg TARGET=asio_fibers .\n       \u003e docker run --network host asio_fibers --logtostderr  # server process, from shell A\n       \u003e docker run --network host asio_fibers --connect=localhost --count 10000 --num_connections=4  # client process from shell B\n    ```\n\n   For a longer version please see [this document](doc/docker_build.md).\n\n3. Building on the host machine directly. Currently requires Ubuntu 18.04.\n\n   ```bash\n   \u003e sudo ./install-dependencies.sh\n   \u003e ./blaze.sh -ninja -release\n   \u003e cd build-opt \u0026\u0026 ninja -j4 asio_fibers\n\n   ```\n   The *third_party* folder is checked out under the build directories.\n\n   Then, from 2 tabs run:\n\n   ```bash\n     server\u003e ./asio_fibers --logtostderr\n     client\u003e ./asio_fibers --connect=localhost --count 100000 --num_connections=4\n   ```\n\n\n\n## Single node Mapreduce\nThe GAIA library provides a very efficient multi-threaded mapreduce framework for batch processing.\nOut of the box, it supports JSON parsing, compressed formats (gzip, zstd),\nlocal disk I/O and GCS (Google Cloud Storage). Using GAIA MR it's possible to map,\nre-shard (partition), join and group multiple sources of data very efficiently.\nFibers in GAIA allow maximizing pipeline execution and balancing I/O\nwith CPU workloads in parallel. The example below shows how to process text files and re-shard them based on an imaginary \"year\" column for each CSV row. 
Please check out [this tutorial](doc/mr3.md) to learn more about GAIA MR.\n\n~~~~~~~~~~cpp\n#include \u003cstring\u003e\n#include \u003cvector\u003e\n\n#include \"absl/strings/str_cat.h\"\n#include \"mr/local_runner.h\"\n#include \"mr/mr_main.h\"\n#include \"strings/split.h\"  // For SplitCSVLineWithDelimiter.\n\nusing namespace std;\n\nDEFINE_string(dest_dir, \"~/mr_output\", \"Working dir where the pipeline writes its by-products\");\n\nint main(int argc, char** argv) {\n  // Sets up IO threads and an optional http console interface (port 8080 by default).\n  PipelineMain pm(\u0026argc, \u0026argv);\n  vector\u003cstring\u003e inputs;\n  for (int i = 1; i \u003c argc; ++i) {\n    inputs.push_back(argv[i]);  // could be a local file or \"gs://....\" url.\n  }\n  CHECK(!inputs.empty()) \u003c\u003c \"Must provide some inputs to run!\";\n\n  Pipeline* pipeline = pm.pipeline();\n\n  // Assuming that the first line of each file is csv header.\n  StringTable ss = pipeline-\u003eReadText(\"read\", inputs).set_skip_header(1);\n  auto reshard = [](string str) {\n    vector\u003cchar*\u003e cols;\n    SplitCSVLineWithDelimiter(\u0026str.front(), ',', \u0026cols);\n    return absl::StrCat(\"year-\", cols[0]);\n  };\n\n  // Simplest example: read and repartition by year.\n  ss.Write(\"write_input\", pb::WireFormat::TXT)\n      .WithCustomSharding(reshard).AndCompress(pb::Output::ZSTD);\n\n  // Environment is abstracted away through mr3::Runner class. LocalRunner is an implementation\n  // that comes out of the box.\n  LocalRunner* runner = pm.StartLocalRunner(FLAGS_dest_dir);\n  pipeline-\u003eRun(runner);\n\n  LOG(INFO) \u003c\u003c \"Pipeline finished\";\n\n  return 0;\n}\n~~~~~~~~~~\n\n## RPC\nIn addition to great performance, GAIA RPC supports a server-streaming API, fully asynchronous\nprocessing and low-latency service. The GAIA RPC framework employs Boost.ASIO and Boost.Fiber\nas its core libraries for asynchronous processing.\n\n1. 
[IoContextPool](https://github.com/romange/gaia/blob/master/util/asio/io_context_pool.h)\nis used for managing a thread-per-core asynchronous engine based on ASIO.\nFor periodic tasks, look at `asio/period_task.h`.\n\n2. The listening server (AcceptServer) is protocol-agnostic and serves both HTTP and RPC.\n\n3. RPC-service methods run inside a fiber. That fiber belongs to a thread that probably serves\nmany other fiber-based connections in the server. Using regular locking mechanisms\n(`std::mutex`, `pthread_mutex`) or calling blocking 3rd party libraries (libmysqlcpp) will block the whole thread, stalling all of its connections. We need to be mindful of this and, as a policy, prohibit thread blocking in fiber-based server code.\n\n4. Nevertheless, RPC service methods might need to issue RPC calls themselves or block for some other reason.\nTo do it correctly, we must use fiber-friendly synchronization routines. But even in this case,\nwe will still block the calling fiber (not the thread). All other connections will continue processing but this one will stall. By default, there is one dedicated fiber per RPC connection that reads RPC requests and delegates them to the RPC application code. We need to remember that if higher-level server code stalls its fiber during request processing, it effectively limits the total QPS of that socket connection. For spinlock use-cases (e.g. RAM access locking with rw-spinlocks with low contention), having a single fiber per RPC connection is usually good enough to sustain high throughput. For more complicated cases, it's advised to implement a fiber pool (currently not exposed in GAIA).\n\n5. Server-side streaming is needed for responses that can be very large. Such responses can easily be represented by\na stream of smaller responses with an identical schema. Think of an SQL response, for example.\nIt may consist of many rows returned by `SELECT`. 
Instead of returning all of them as one blob, server-side streaming can send back multiple responses in the context of a single request on the wire. Each small response is propagated to the RPC client via a callback-based interface.\nAs a result, both systems (client and server) are not required to hold the whole response in RAM at the same time.\n\nWhile GAIA provides a very efficient RPC core library, it does not provide higher-level RPC bindings.\nIt is possible, though, to build a layer with a protobuf-based declaration language on top of this RPC library.\nFor a raw RPC demo, see asio_fibers above.\n\n## HTTP\nThe HTTP handler is implemented using the [Boost.Beast](https://www.boost.org/doc/libs/1_68_0/libs/beast/doc/html/index.html) library.\nIt's integrated with the IoContextPool similarly to the RPC service.\nPlease see [http_main.cc](https://github.com/romange/gaia/blob/master/util/http/http_main.cc) for an example. HTTP also provides support for backend monitoring (Varz status page) and for an extensible debugging interface. For monitoring, the C++ backend returns a JSON object that is formatted inside the status page in the browser. To check how it looks, please go to [localhost:8080](http://localhost:8080) while `asio_fibers` is running.\n\n\n### Self-profiling\nEvery http-powered backend has integrated CPU profiling capabilities using [gperf-tools](https://github.com/gperftools/gperftools) and [pprof](https://github.com/google/pprof).\nProfiling can be triggered in prod using magic-url commands. Enabled profiling usually has minimal impact\non the CPU performance of the running backend.\n\n### Logging\nLogging is based on Google's [glog library](https://github.com/google/glog). The library is very reliable, performant and solid. It has many features that allow resilient backend development.\nUnfortunately, Google's version has some bugs, which I fixed (waiting for review...), so I use my own fork. 
The glog library gives me the ability to control the logging levels of a backend at run-time without restarting it.\n\n## Tests\nGAIA uses the googletest+gmock unit-test environment.\n\n## Conventions\nTo use abseil code, use `#include \"absl/...\"`.\nThird-party packages have the `TRDP::` prefix in `CMakeLists.txt`. absl libraries have the prefix\n`absl_...`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fromange%2Fgaia","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fromange%2Fgaia","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fromange%2Fgaia/lists"}