{"id":16382725,"url":"https://github.com/baderouaich/protobuf_fs_db","last_synced_at":"2026-05-16T17:03:45.530Z","repository":{"id":245011475,"uuid":"815862830","full_name":"baderouaich/protobuf_fs_db","owner":"baderouaich","description":"Filesystem based database for basic objects using protobuf","archived":false,"fork":false,"pushed_at":"2024-09-03T11:32:42.000Z","size":46,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-21T19:27:08.554Z","etag":null,"topics":["cplusplus","cplusplus-20","cpp","cpp20","database","database-design","databases","efficiency","filesystem","filesystem-database","fs","protobuf"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/baderouaich.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-16T11:19:33.000Z","updated_at":"2024-09-03T11:32:46.000Z","dependencies_parsed_at":"2024-06-19T02:20:27.746Z","dependency_job_id":"2de4552b-647c-4bb9-ae22-ec2c4a46454f","html_url":"https://github.com/baderouaich/protobuf_fs_db","commit_stats":null,"previous_names":["baderouaich/protobuf_fs_db"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/baderouaich/protobuf_fs_db","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baderouaich%2Fprotobuf_fs_db","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baderouaich%2Fprotobuf_fs_db/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/badero
uaich%2Fprotobuf_fs_db/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baderouaich%2Fprotobuf_fs_db/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/baderouaich","download_url":"https://codeload.github.com/baderouaich/protobuf_fs_db/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baderouaich%2Fprotobuf_fs_db/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33111497,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-16T04:41:52.686Z","status":"ssl_error","status_checked_at":"2026-05-16T04:41:52.009Z","response_time":115,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cplusplus","cplusplus-20","cpp","cpp20","database","database-design","databases","efficiency","filesystem","filesystem-database","fs","protobuf"],"created_at":"2024-10-11T04:06:10.908Z","updated_at":"2026-05-16T17:03:45.501Z","avatar_url":"https://github.com/baderouaich.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Protobuf Filesystem Database\nAn attempt at a solution similar to the one described in this [article](https://hivekit.io/blog/how-weve-saved-98-percent-in-cloud-costs-by-writing-our-own-database/).\n\nThis is a small implementation of a filesystem-based 
database that uses Google's [protobuf](https://github.com/protocolbuffers/protobuf)\nto store objects directly on disk through a simple C++ Database API class that provides methods\nsuch as add, update, remove, clear, exists, count, findIf, countIf, and more.\n\nFor example, if your database models are a User, a Download, and a Setting, \nthe database will be structured in the hierarchy below:\n\n```txt\nDatabase/\n|_______Users/\n        |____ 1  --\u003e binary file containing the User object with id = 1\n        |____ 2\n        |____ 3\n|_______Downloads/\n        |____ 1\n        |____ 2\n        |____ 3\n|_______Settings/\n        |____ 1\n        |____ 2\n        |____ 3\n```\n\nEach object MUST have an `id`, so the binary file where the object is stored can be located directly\nby id. For example, for a User object with id = 1 (User{id = 1, ...}), we already know that the file is located at \n`Database/Users/1`, which makes it fast to access (the disk type also matters, e.g. SSD or NVMe).\n\nTo use this filesystem-based database, each object must be defined in the `./proto/` folder. Build your project at least once so the \nprotobuf compiler generates the header and source files, which you can then include.\n\nThe database uses a [mutex file](./MutexFile.hpp) to protect itself from multiple processes trying to write to the database at the same time. If the database is already in use, other users block until it is free and are then allowed to proceed.\n\nThe database also uses a [std::shared_mutex](https://en.cppreference.com/w/cpp/thread/shared_mutex) to protect itself from multiple threads trying to read/write to the database at the same time.\n\nThe [Database api](Database.hpp) lets you manipulate your database like this:\n\u003cdetails\u003e\n  \u003csummary\u003eExample\u003c/summary\u003e\n\n```cpp\n// Initialize the db\nDatabase db(\"/path/to/your/Database/\");\n\n// Optionally set directory name where user 
objects will be saved\n// The default name is the one set on the protobuf object, for example: \"types.User\"\n// So in the case below, User objects will be saved to Database/Users\ndb.typeDirName\u003ctypes::User\u003e(\"Users\");\ndb.typeDirName\u003ctypes::Download\u003e(\"Downloads\");\ndb.typeDirName\u003ctypes::Setting\u003e(\"Settings\");\n\n// Add a new User to the database\ntypes::User user1;\nuser1.set_id(1000);\nuser1.set_name(\"James\");\nuser1.set_weight(83.15);\ndb.add(user1);\n\n// Update user1's weight to 85.0\nuser1.set_weight(85.0);\ndb.update\u003ctypes::User\u003e(user1);\n\n// Add a new Download to the database\ntypes::Download download;\ndownload.set_id(3000);\ndownload.set_userid(user1.id()); // owner\ndownload.set_timestamp(std::time(nullptr));\ndownload.set_url(\"https://youtube.com/some/video\");\ndownload.set_size(1024*1024*500);\ndownload.set_success(true);\ndb.add(download);\n\n\n// Print the count of Users and Downloads saved in the db\nstd::cout \u003c\u003c db.count\u003ctypes::User\u003e() \u003c\u003c \" users\\n\";\nstd::cout \u003c\u003c db.count\u003ctypes::Download\u003e() \u003c\u003c \" downloads\\n\";\n\n// Count users with weight \u003e 80.0\nstd::size_t count = db.countIf\u003ctypes::User\u003e([](const types::User\u0026 user) {\n  return user.weight() \u003e 80.0;\n});\nstd::cout \u003c\u003c \"There are \" \u003c\u003c count \u003c\u003c \" Users with weight over 80.0.\\n\";\n\n\n// Remove users with even id\nbool ok = db.removeIf\u003ctypes::User\u003e([](const types::User\u0026 user) {\n  return user.id() % 2 == 0;\n});\nassert(ok);\n\n\n// Remove all objects from the database\ndb.clear\u003ctypes::User\u003e();\ndb.clear\u003ctypes::Download\u003e();\ndb.clear\u003ctypes::Setting\u003e();\n```\n\n\u003c/details\u003e\n\n\n## Inspiration\n[This article](https://hivekit.io/blog/how-weve-saved-98-percent-in-cloud-costs-by-writing-our-own-database/)\n\nA company saved 98% in cloud costs by writing their own 
database.\nThey created a purpose-built, in-process storage engine that’s part of the same executable as their core server.\nIt writes a minimal, delta-based binary format. A single entry looks like this:\n\n![image](https://hivekit.io/blog/how-weve-saved-98-percent-in-cloud-costs-by-writing-our-own-database/byte-diagram.png)\n\nThe result: a 98% reduction in cloud cost and faster everything.\n\nOf course, this does not generalize: some applications are sophisticated enough to require a real database engine such as Microsoft SQL Server, Oracle Database, MySQL, or Postgres. This company, however, stores simple data structures, and \nthey realized they could do better than paying thousands of dollars for AWS Aurora with the PostGIS extension for geospatial data storage.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaderouaich%2Fprotobuf_fs_db","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbaderouaich%2Fprotobuf_fs_db","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaderouaich%2Fprotobuf_fs_db/lists"}