{"id":16352316,"url":"https://github.com/bugthesystem/distrox","last_synced_at":"2025-08-28T05:37:33.136Z","repository":{"id":41371201,"uuid":"299576162","full_name":"bugthesystem/distrox","owner":"bugthesystem","description":"A fast thread-safe in-memory cache (storage) library and server that supports a big number of entries in Go","archived":false,"fork":false,"pushed_at":"2021-09-02T10:40:15.000Z","size":111,"stargazers_count":37,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-11T02:08:40.326Z","etag":null,"topics":["cache","golang","in-memory","performance"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bugthesystem.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-09-29T10:01:06.000Z","updated_at":"2023-09-08T18:12:59.000Z","dependencies_parsed_at":"2022-07-30T07:17:53.007Z","dependency_job_id":null,"html_url":"https://github.com/bugthesystem/distrox","commit_stats":null,"previous_names":["bugthesystem/distrox"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bugthesystem%2Fdistrox","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bugthesystem%2Fdistrox/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bugthesystem%2Fdistrox/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bugthesystem%2Fdistrox/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bugthesystem","download_url":"https://codeload.git
hub.com/bugthesystem/distrox/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244094274,"owners_count":20397020,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cache","golang","in-memory","performance"],"created_at":"2024-10-11T01:25:41.936Z","updated_at":"2025-03-21T00:31:06.375Z","avatar_url":"https://github.com/bugthesystem.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"Distrox\n==========\nA fast thread-safe in-memory cache library and server that supports a big number of entries in Go\n\n\u003e It can be used as a standalone server or imported as a separate package.\n\n## Example as package\n```go\nimport (\n  //... omitted for brevity\n\t\"github.com/ziyasal/distroxy/pkg/distrox\"\n)\n\n//... 
omitted for brevity\n\nlogger := common.NewZeroLogger(config.app.mode)\ncache, err := distrox.NewCache(\n\tdistrox.WithMaxBytes(config.cache.maxBytes),\n\tdistrox.WithShards(config.cache.shards),\n\tdistrox.WithMaxKeySize(config.cache.maxKeySizeInBytes),\n\tdistrox.WithMaxValueSize(config.cache.maxValueSizeInBytes),\n\tdistrox.WithTTL(config.cache.ttlInSeconds),\n\tdistrox.WithLogger(logger),\n\tdistrox.WithStatsEnabled(),\n)\n  \n // use cache here\n```\n\n## Running\n```sh\nmake run\n```\n\n## Tests\n```\nmake test\n```\n\n## Benchmark\nBenchmarks run against \n- raw `sync.Map` (only raw get, set) - \n`Get` performance is baseline here for other implementations\n\n- Cache implementation using `sync.Map`\n- Custom sharded cache implementation using ring buffer\n\nThere are more optimizations to be made to make read faster for example `3X.XXMB/s`.\n\n**Specs**  \nProcessor: 2.4 GHz Quad-Core Intel Core i5  \nMemory   : 16 GB 2133 MHz LPDDR3  \nmacOS Catalina 10.15.6\ngo1.15.2 darwin/amd64\n\n```\n❯ make bench\nGOMAXPROCS=4 go test ./pkg/distrox/ -bench='Set|Get' -benchtime=10s\ngoos: darwin\ngoarch: amd64\npkg: github.com/ziyasal/distroxy/pkg/distrox\nBenchmarkSyncMapSet-4              \t     303\t  37976140 ns/op\t   3.45 MB/s\t 6846491 B/op\t  524728 allocs/op\nBenchmarkSyncMapGet-4              \t    4905\t   2537302 ns/op\t  51.66 MB/s\t    3286 B/op\t     134 allocs/op\nBenchmarkSyncMapSetGet-4           \t    1078\t   9832155 ns/op\t  20.34 MB/s\t 5208678 B/op\t  400123 allocs/op\n\nBenchmarkSyncMapCacheSet-4         \t     270\t  44550028 ns/op\t   2.94 MB/s\t 8947434 B/op\t  524782 allocs/op\nBenchmarkSyncMapCacheGet-4         \t    3793\t   3180366 ns/op\t  20.61 MB/s\t    3632 B/op\t     104 allocs/op\n\nBenchmarkDistroxCacheSetBin-4      \t    1485\t   7531564 ns/op\t  17.40 MB/s\t  257319 B/op\t      27 allocs/op\nBenchmarkDistroxCacheGetBin-4      \t    1909\t   5641515 ns/op\t  23.23 MB/s\t   28658 B/op\t      11 
allocs/op\nBenchmarkDistroxCacheSetGetBin-4   \t     787\t  15250764 ns/op\t  17.19 MB/s\t  485540 B/op\t      50 allocs/op\nPASS\nok  \tgithub.com/ziyasal/distroxy/pkg/distrox\t122.740s\n```\n\n## Load test\nRun `sudo launchctl limit maxfiles 65535 65535` to raise the default open-file limits if needed.\n\n```sh\npip3 install locust\nlocust -f scripts/distrox_locust.py --users 1000 --spawn-rate 100\n\n#headless\nlocust -f scripts/distrox_locust.py  --headless --users 1000 --spawn-rate 100 --run-time 5m\n```\n\n## Design Notes\nThe cache is sharded and each shard has its own lock, which reduces the time spent\nwaiting for locks. Each shard has a map of [1]`hash(key) → packed(position((ts, key, value)), fragmented-flag)`\nin the ring buffer, and the ring buffer consists of 64 KB byte slices (sized to keep fragmentation low) filled\nwith encoded (ts, key, value) entries.\n\n- [1] - uint64 =\u003e  63 bits for the position and the last 1 bit for the fragmented flag\n\nTwo cases are considered in terms of entry size: \n### Entries fit into default mem-block (64KB)\n```sh\n|---------------------|-------------------|---------------------|-----------|-------------|\n| timestamp bytes — 8 | key len bytes — 2 | value len bytes — 2 | key bytes | value bytes |\n|---------------------|-------------------|---------------------|-----------|-------------|\n```\n\n### Entries don't fit into default mem-block\nFor the big entries (k + v + headers \u003e 64 KB), the approach below is implemented:\n* Split the entry into smaller fragments that each fit into the default memory-block (64KB in our case)\n* Calculate the key for each fragment by using the fragment index and the value hash, and\nstore the fragment in the cache with the calculated key\n* Store the value-hash and the value-length as a new value (meta-value) with the actual key\n(when the entry is requested, the stored value (meta-value) will be processed to find out \nthe fragments of the actual value)\n* Fragmented entry flag for the \"meta 
entry\" is set to true (it's \"false\" for non-fragmented entries). \nThe flag is then checked to determine whether the entry value must be processed \nto collect the parts of the actual entry value.\n\nOne caveat: storing big entries might require a bigger cache size to prevent overwriting existing entries.\n\n- The time API (`time.Now`) result is cached in the clock component and updated every second, which eliminates per-operation calls to the time API.\n\n### Eviction options\n- Cleanup job\n- Evict on get\n\nCurrently, the `evict on get` approach is implemented (also, entries\nare evicted from the cache on cache size overflow).\n\nEach entry has its creation timestamp encoded; when the entry is\naccessed, the timestamp is checked to see whether its life window has\nbeen exceeded.  It is deleted from the index map if its lifetime is\nexceeded, but not from memory.\n\n\n### Cache Persistence (planned)\nPersistence is not implemented yet, but I'm going to discuss below how it could be implemented.\n\n**A few persistence options could be considered:**  \n- Cache DB persistence performs point-in-time snapshots of the data-set at specified intervals.\n- The AOF persistence logs every write operation received by the server; these are replayed at server startup,\nreconstructing the original data-set\n- Combine both AOF and Cache DB in the same instance. In this case, when the server restarts,\nthe AOF file can be used to reconstruct the original data-set since it is guaranteed to be the most complete.\n\nThe following binary format could be used if the first option were implemented.\n**Cache DB Binary Format**  \n\n```sh\n----------------------------# CDB is a binary format, without new lines or spaces in the file.\n44 49 53 54 52 4f 58        # Magic String \"DISTROX\"\n30 30 30 31                 # 4 digit ASCII CDB Version Number. 
In this case, version = \"0001\" = 1\n4 bytes                     # Integer DB entry count, high byte first\n----------------------------\nrepeating {\n  $ts-bytes\n  $key-bytes-length\n  $value-bytes-length\n  $key-bytes\n  $value-bytes\n}\n----------------------------\n8 byte checksum             # CRC 64 checksum of the entire file.\n```\n\n## Limitations\n- Max cache size should be set when the cache is initialized\n- Since it uses a fixed-size ring buffer on each shard, data will be overwritten when the ring is full\n- Each mem-block in the ring buffer is 64 KB in size to keep fragmentation low\n\n## Features out of scope\n - Clustering\n - Versioning (`VectorClock` could be used here)\n\n## Improvements - planned\n- Memory blocks could be allocated off-heap\n (`mmap syscall` could be used to access the mapped memory as an array of bytes `[1]`) to keep the\n cache size from being taken into account by GOGC; the blocks can be pooled too.\n\n`[1]` - This should be done carefully - if the array is referenced even after\n the memory region is unmapped, this can lead to a segmentation fault\n- Export server metrics as Prometheus metrics\n- Export cache stats as part of Prometheus metrics (currently they are served from the `/stats` endpoint)\n- Add more tests \n   * cover cache edge cases, \n   * cover more server cases   \n   * load test scenarios\n- Add deployment configurations (Dockerfile, Helm chart etc.)\n- Compression support could be added to the server (e.g. gzip, brotli)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbugthesystem%2Fdistrox","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbugthesystem%2Fdistrox","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbugthesystem%2Fdistrox/lists"}