{"id":13411789,"url":"https://github.com/elastic/go-freelru","last_synced_at":"2025-05-16T06:05:42.984Z","repository":{"id":147190976,"uuid":"583307733","full_name":"elastic/go-freelru","owner":"elastic","description":null,"archived":false,"fork":false,"pushed_at":"2025-05-08T08:36:23.000Z","size":138,"stargazers_count":236,"open_issues_count":14,"forks_count":16,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-05-12T00:17:38.260Z","etag":null,"topics":["cache","data-structures","gc-less","go","golang","library","lru"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/elastic.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-12-29T11:42:46.000Z","updated_at":"2025-05-10T09:21:29.000Z","dependencies_parsed_at":"2024-01-28T17:41:05.387Z","dependency_job_id":"804ef531-202d-48bf-ab21-d8fd0caab2bc","html_url":"https://github.com/elastic/go-freelru","commit_stats":{"total_commits":97,"total_committers":6,"mean_commits":"16.166666666666668","dds":0.2680412371134021,"last_synced_commit":"a34f16522651bbb14dd22ae55eb724de56f93e3d"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elastic%2Fgo-freelru","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elastic%2Fgo-freelru/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elastic%2Fgo-freelru/releases","manifests_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/elastic%2Fgo-freelru/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/elastic","download_url":"https://codeload.github.com/elastic/go-freelru/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254478188,"owners_count":22077676,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cache","data-structures","gc-less","go","golang","library","lru"],"created_at":"2024-07-30T20:01:16.941Z","updated_at":"2025-05-16T06:05:42.920Z","avatar_url":"https://github.com/elastic.png","language":"Go","funding_links":[],"categories":["Data Integration Frameworks","Go","Database","数据库"],"sub_categories":["Caches","缓存"],"readme":"[![Go Reference](https://pkg.go.dev/badge/github.com/elastic/go-freelru.svg)](https://pkg.go.dev/github.com/elastic/go-freelru)\n[![Go Report Card](https://goreportcard.com/badge/github.com/elastic/go-freelru)](https://goreportcard.com/report/github.com/elastic/go-freelru)\n[![Coverage Status](https://coveralls.io/repos/github/elastic/go-freelru/badge.svg?branch=main)](https://coveralls.io/github/elastic/go-freelru?branch=main)\n[![Mentioned in Awesome Go](https://awesome.re/mentioned-badge.svg)](https://github.com/avelino/awesome-go)\n\n\n# FreeLRU - A GC-less, fast and generic LRU hashmap library for Go\n\nFreeLRU allows you to cache objects without introducing GC overhead.\nIt uses Go generics for simplicity, type-safety and performance over interface types.\nIt performs better than other LRU implementations in the Go benchmarks provided.\nThe 
API is simple in order to ease migrations from other LRU implementations.\nThe function to calculate hashes from the keys needs to be provided by the caller.\n\n## `LRU`: Single-threaded LRU hashmap\n\n`LRU` is a single-threaded LRU hashmap implementation.\nIt uses a fast exact LRU algorithm and has no locking overhead.\nIt has been developed for low-GC overhead and type-safety.\nFor thread-safety, pick one of `SyncedLRU` or `ShardedLRU` or do locking by yourself.\n\n### Comparison with other single-threaded LRU implementations\nGet (key and value are both of type `int`)\n```\nBenchmarkFreeLRUGet              73456962                15.17 ns/op           0 B/op          0 allocs/op\nBenchmarkSimpleLRUGet            91878808                12.09 ns/op           0 B/op          0 allocs/op\nBenchmarkMapGet                 173823274                6.884 ns/op           0 B/op          0 allocs/op\n```\nAdd\n```\nBenchmarkFreeLRUAdd_int_int             39446706                30.04 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_int_int128          39622722                29.71 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_uint32_uint64       43750496                26.97 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_string_uint64       25839464                39.31 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_int_string          37269870                30.55 ns/op            0 B/op          0 allocs/op\n\nBenchmarkSimpleLRUAdd_int_int           12471030                86.33 ns/op           48 B/op          1 allocs/op\nBenchmarkSimpleLRUAdd_int_int128        11981545                85.70 ns/op           48 B/op          1 allocs/op\nBenchmarkSimpleLRUAdd_uint32_uint64     11506755                87.52 ns/op           48 B/op          1 allocs/op\nBenchmarkSimpleLRUAdd_string_uint64      8674652               142.8 ns/op            49 B/op          1 
allocs/op\nBenchmarkSimpleLRUAdd_int_string        12267968                87.77 ns/op           48 B/op          1 allocs/op\n\nBenchmarkMapAdd_int_int                 34951609                48.08 ns/op            0 B/op          0 allocs/op\nBenchmarkMapAdd_int_int128              31082216                47.05 ns/op            0 B/op          0 allocs/op\nBenchmarkMapAdd_uint32_uint64           36277005                48.08 ns/op            0 B/op          0 allocs/op\nBenchmarkMapAdd_string_uint64           29380040                49.37 ns/op            0 B/op          0 allocs/op\nBenchmarkMapAdd_int_string              30325861                47.35 ns/op            0 B/op          0 allocs/op\n```\n\nThe comparison with Map is just for reference - Go maps don't implement LRU functionality and thus should\nbe significantly faster than LRU implementations.\n\n## `SyncedLRU`: Concurrent LRU hashmap for low concurrency.\n\n`SyncedLRU` is a concurrency-safe LRU hashmap implementation wrapped around `LRU`.\nIt is best used in low-concurrency environments where lock contention isn't a thing to worry about.\nIt uses an exact LRU algorithm.\n\n## `ShardedLRU`: Concurrent LRU hashmap for high concurrency\n\n`ShardedLRU` is a sharded, concurrency-safe LRU hashmap implementation.\nIt is best used in high-concurrency environments where lock contention is a thing.\nDue to the sharded nature, it uses an approximate LRU algorithm.\n\nFreeLRU is for single-threaded use only.\nFor thread-safety, the locking of operations needs to be controlled by the caller.\n\n### Comparison with other multithreaded LRU implementations\nAdd with `GOMAXPROCS=1`\n```\nBenchmarkParallelSyncedFreeLRUAdd_int_int128    42022706                28.27 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelShardedFreeLRUAdd_int_int128   35353412                33.33 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelFreeCacheAdd_int_int128        14825518                79.58 ns/op  
          0 B/op          0 allocs/op\nBenchmarkParallelRistrettoAdd_int_int128         5565997               206.1 ns/op           121 B/op          3 allocs/op\nBenchmarkParallelPhusluAdd_int_int128           28041186                41.26 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelCloudflareAdd_int_int128        6300747               185.0 ns/op            48 B/op          2 allocs/op\n```\nAdd with `GOMAXPROCS=1000`\n```\nBenchmarkParallelSyncedFreeLRUAdd_int_int128-1000               12251070               138.9 ns/op             0 B/op          0 allocs/op\nBenchmarkParallelShardedFreeLRUAdd_int_int128-1000              112706306               10.59 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelFreeCacheAdd_int_int128-1000                   47873679                24.14 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelRistrettoAdd_int_int128-1000                   69838436                16.93 ns/op          104 B/op          3 allocs/op\nBenchmarkParallelOracamanMapAdd_int_int128-1000                 25694386                40.48 ns/op           37 B/op          0 allocs/op\nBenchmarkParallelPhusluAdd_int_int128-1000                      89379122                14.19 ns/op            0 B/op          0 allocs/op\n```\n`Ristretto` offloads the LRU functionality of `Add()` to a separate goroutine, which is why it is relatively fast. 
But the\nseparate goroutine doesn't show up in the benchmarks, so the numbers are not directly comparable.\n\n`Oracaman` is not an LRU implementation, just a thread-safety wrapper around `map`.\n\nGet with `GOMAXPROCS=1`\n```\nBenchmarkParallelSyncedGet      43031780                27.35 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelShardedGet     51807500                22.86 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelFreeCacheGet   21948183                53.52 ns/op           16 B/op          1 allocs/op\nBenchmarkParallelRistrettoGet   30343872                33.82 ns/op            7 B/op          0 allocs/op\nBenchmarkParallelBigCacheGet    21073627                51.08 ns/op           16 B/op          2 allocs/op\nBenchmarkParallelPhusluGet      59487482                20.02 ns/op            0 B/op          0 allocs/op\nBenchmarkParallelCloudflareGet  17011405                67.11 ns/op            8 B/op          1 allocs/op\n```\nGet with `GOMAXPROCS=1000`\n```\nBenchmarkParallelSyncedGet-1000                 10867552               151.0 ns/op             0 B/op          0 allocs/op\nBenchmarkParallelShardedGet-1000                287238988                4.061 ns/op           0 B/op          0 allocs/op\nBenchmarkParallelFreeCacheGet-1000              78045916                15.33 ns/op           16 B/op          1 allocs/op\nBenchmarkParallelRistrettoGet-1000              214839645                6.060 ns/op           7 B/op          0 allocs/op\nBenchmarkParallelBigCacheGet-1000               163672804                7.282 ns/op          16 B/op          2 allocs/op\nBenchmarkParallelPhusluGet-1000                 200133655                6.039 ns/op           0 B/op          0 allocs/op\nBenchmarkParallelCloudflareGet-1000             100000000               11.26 ns/op            8 B/op          1 allocs/op\n```\n`Cloudflare` and `BigCache` only accept `string` as the key type.\nSo the ser/deser of `int` to `string` 
is part of the benchmarks for a fair comparison.\n\nHere you can see that `SyncedLRU` suffers badly from lock contention.\n`ShardedLRU` is ~37x faster than `SyncedLRU` in a high-concurrency situation and the second-fastest\nLRU implementations (`Ristretto` and `Phuslu`) are ~50% slower.\n\n### Merging hashmap and ringbuffer\n\nMost LRU implementations combine Go's `map` for the key/value lookup and their own implementation of\na circular doubly-linked list for keeping track of the recent-ness of objects.\nThis requires one additional heap allocation for the list element. A second downside is that the list\nelements are not contiguous in memory, which causes more (expensive) CPU cache misses for accesses.\n\nFreeLRU addresses both issues by merging hashmap and ringbuffer into a contiguous array of elements.\nEach element contains key, value and two indices to keep the cached objects ordered by recent-ness.\n\n### Avoiding GC overhead\n\nThe contiguous array of elements is allocated at cache creation time.\nSo there is only a single memory object instead of possibly millions that the GC needs to iterate over during\na garbage collection phase.\nThe GC overhead can be quite large in comparison with the overall CPU usage of an application.\nEspecially long-running and low-CPU applications with lots of cached objects suffer from the GC overhead.\n\n### Type safety by using generics\n\nUsing generics allows type-checking at compile time, so type conversions are not needed at runtime.\nThe interface type or `any` requires type conversions at runtime, which may fail.\n\n### Reducing memory allocations by using generics\n\nInterface types (aka `any`) are pointer types and thus require a heap allocation when being stored.\nThis is true even if you just need an integer-to-integer lookup or translation.\n\nWith generics, the two allocations for key and value can be avoided: as long as the key and value types do not contain\npointer types, no allocations will take place when adding 
such objects to the cache.\n\n### Overcommitting of hashtable memory\n\nEach hashtable implementation tries to avoid hash collisions because collisions are expensive.\nFreeLRU allows allocating more elements than the maximum number of elements stored.\nThis value is configurable and can be increased to reduce the likelihood of collisions.\nLRU operations generally become faster by doing so.\nSetting the size of the LRU to a value of 2^N is recognized and replaces slow divisions with fast bitwise AND operations.\n\n## Benchmarks\n\nBelow we compare FreeLRU with SimpleLRU, FreeCache and Go map.\nThe comparison with FreeCache is just for reference - it is thread-safe and comes with a mutex/locking overhead.\nThe comparison with Go map is also just for reference - Go maps don't implement LRU functionality and thus should\nbe significantly faster than FreeLRU. It turns out that, for adding objects, the opposite is the case.\n\nThe numbers are from my laptop (Intel(R) Core(TM) i7-12800H @ 2800 MHz).\n\nThe key and value types are part of the benchmark name, e.g. 
`int_int` means key and value are of type `int`.\n`int128` is a struct type made of two `uint64` fields.\n\nTo run the benchmarks\n```\nmake benchmarks\n```\n\n### Adding objects\n\nFreeLRU is ~3.5x faster than SimpleLRU, no surprise.\nBut it is also significantly faster than Go maps, which is a bit of a surprise.\n\nThis is with 0% memory overcommitment (default) and a capacity of 8192.\n ```\nBenchmarkFreeLRUAdd_int_int-20                  43097347                27.41 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_int_int128-20               42129165                28.38 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_uint32_uint64-20            98322132                11.74 ns/op            0 B/op          0 allocs/op (*)\nBenchmarkFreeLRUAdd_string_uint64-20            39122446                31.12 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_int_string-20               81920673                14.00 ns/op            0 B/op          0 allocs/op (*)\nBenchmarkSimpleLRUAdd_int_int-20                12253708                93.85 ns/op           48 B/op          1 allocs/op\nBenchmarkSimpleLRUAdd_int_int128-20             12095150                94.26 ns/op           48 B/op          1 allocs/op\nBenchmarkSimpleLRUAdd_uint32_uint64-20          12367568                92.60 ns/op           48 B/op          1 allocs/op\nBenchmarkSimpleLRUAdd_string_uint64-20          10395525               119.0 ns/op            49 B/op          1 allocs/op\nBenchmarkSimpleLRUAdd_int_string-20             12373900                94.40 ns/op           48 B/op          1 allocs/op\nBenchmarkFreeCacheAdd_int_int-20                 9691870               122.9 ns/op             1 B/op          0 allocs/op\nBenchmarkFreeCacheAdd_int_int128-20              9240273               125.6 ns/op             1 B/op          0 allocs/op\nBenchmarkFreeCacheAdd_uint32_uint64-20           8140896               132.1 ns/op             1 B/op     
     0 allocs/op\nBenchmarkFreeCacheAdd_string_uint64-20           8248917               137.9 ns/op             1 B/op          0 allocs/op\nBenchmarkFreeCacheAdd_int_string-20              8079253               145.0 ns/op            64 B/op          1 allocs/op\nBenchmarkRistrettoAdd_int_int-20                11102623               100.1 ns/op           109 B/op          2 allocs/op\nBenchmarkRistrettoAdd_int128_int-20             10317686               113.5 ns/op           129 B/op          4 allocs/op\nBenchmarkRistrettoAdd_uint32_uint64-20          12892147                94.28 ns/op          104 B/op          2 allocs/op\nBenchmarkRistrettoAdd_string_uint64-20          11266416               105.8 ns/op           122 B/op          3 allocs/op\nBenchmarkRistrettoAdd_int_string-20             10360814               107.4 ns/op           129 B/op          4 allocs/op\nBenchmarkMapAdd_int_int-20                      35306983                46.29 ns/op            0 B/op          0 allocs/op\nBenchmarkMapAdd_int_int128-20                   30986126                45.16 ns/op            0 B/op          0 allocs/op\nBenchmarkMapAdd_string_uint64-20                28406497                49.35 ns/op            0 B/op          0 allocs/op\n```\n(*)\nThere is an interesting effect when using increasing numbers (0..N) as keys in combination with FNV1a().\nThe number of collisions is strongly reduced here, thus the high performance.\nReplacing the sequential numbers with random numbers results in roughly the same performance as the other results.\n\nJust to give you an idea of the effect of 100% memory overcommitment:\nPerformance increased by ~20%.\n```\nBenchmarkFreeLRUAdd_int_int-20                  53473030                21.52 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_int_int128-20               52852280                22.10 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_uint32_uint64-20            100000000               10.15 ns/op   
         0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_string_uint64-20            49477594                24.55 ns/op            0 B/op          0 allocs/op\nBenchmarkFreeLRUAdd_int_string-20               85288306                12.10 ns/op            0 B/op          0 allocs/op\n```\n\n### Getting objects\n\nThis is with 0% memory overcommitment (default) and a capacity of 8192.\n```\nBenchmarkFreeLRUGet-20                          83158561                13.80 ns/op            0 B/op          0 allocs/op\nBenchmarkSimpleLRUGet-20                        146248706                8.199 ns/op           0 B/op          0 allocs/op\nBenchmarkFreeCacheGet-20                        58229779                19.56 ns/op            0 B/op          0 allocs/op\nBenchmarkRistrettoGet-20                        31157457                35.37 ns/op           10 B/op          1 allocs/op\nBenchmarkPhusluGet-20                           55071919                20.63 ns/op            0 B/op          0 allocs/op\nBenchmarkMapGet-20                              195464706                6.031 ns/op           0 B/op          0 allocs/op\n```\n\n## Example usage\n\n```go\npackage main\n\nimport (\n\t\"fmt\"\n\n\t\"github.com/cespare/xxhash/v2\"\n\n\t\"github.com/elastic/go-freelru\"\n)\n\n// More hash functions: https://github.com/elastic/go-freelru/blob/main/bench/hash.go\nfunc hashStringXXHASH(s string) uint32 {\n\treturn uint32(xxhash.Sum64String(s))\n}\n\nfunc main() {\n\tlru, err := freelru.New[string, uint64](8192, hashStringXXHASH)\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\tkey := \"go-freelru\"\n\tval := uint64(999)\n\tlru.Add(key, val)\n\n\tif v, ok := lru.Get(key); ok {\n\t\tfmt.Printf(\"found %v=%v\\n\", key, v)\n\t}\n\n\t// Output:\n\t// found go-freelru=999\n}\n```\n\nThe function `hashStringXXHASH(string) uint32` will be called to calculate a hash value of the key.\nPlease take a look at the `bench/` directory to find examples of hash functions.\nHere you will also find an amd64 
version of the Go internal hash function, which uses AESENC features\nof the CPU.\n\nIn case you already have a hash that you want to use as the key, you have to provide an \"identity\" function.\n\n## Comparison of hash functions\nHashing `int`\n```\nBenchmarkHashInt_MapHash-20                             181521530                6.806 ns/op           0 B/op          0 allocs/op\nBenchmarkHashInt_MapHasher-20                           727805824                1.595 ns/op           0 B/op          0 allocs/op\nBenchmarkHashInt_FNV1A-20                               621439513                1.919 ns/op           0 B/op          0 allocs/op\nBenchmarkHashInt_FNV1AUnsafe-20                         706583145                1.699 ns/op           0 B/op          0 allocs/op\nBenchmarkHashInt_AESENC-20                              1000000000               0.9659 ns/op          0 B/op          0 allocs/op\nBenchmarkHashInt_XXHASH-20                              516779404                2.341 ns/op           0 B/op          0 allocs/op\nBenchmarkHashInt_XXH3HASH-20                            562645186                2.127 ns/op           0 B/op          0 allocs/op\n```\nHashing `string`\n```\nBenchmarkHashString_MapHash-20                          72106830                15.80 ns/op            0 B/op          0 allocs/op\nBenchmarkHashString_MapHasher-20                        385338830                2.868 ns/op           0 B/op          0 allocs/op\nBenchmarkHashString_FNV1A-20                            60162328                19.33 ns/op            0 B/op          0 allocs/op\nBenchmarkHashString_AESENC-20                           475896514                2.472 ns/op           0 B/op          0 allocs/op\nBenchmarkHashString_XXHASH-20                           185842404                6.476 ns/op           0 B/op          0 allocs/op\nBenchmarkHashString_XXH3HASH-20                         375255375                3.182 ns/op           0 B/op          0 
allocs/op\n```\nAs you can see, the speed depends on the object type to hash. I think it mostly boils down to the size of the object.\n`MapHasher` is dangerous to use because it is not guaranteed to be stable across Go versions.\n`AESENC` uses the AES CPU extensions on X86-64. In theory, it should work on ARM64 as well (not tested by me).\n\nFor a small number of bytes, `FNV1A` is the fastest of the portable and stable hash functions benchmarked above.\nOtherwise, `XXH3` looks like a good choice.\n\n## License\nThe code is licensed under the Apache 2.0 license. See the `LICENSE` file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felastic%2Fgo-freelru","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felastic%2Fgo-freelru","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felastic%2Fgo-freelru/lists"}