{"id":13617470,"url":"https://github.com/snazy/ohc","last_synced_at":"2025-05-14T01:08:53.759Z","repository":{"id":23522054,"uuid":"26888367","full_name":"snazy/ohc","owner":"snazy","description":"Java large off heap cache","archived":false,"fork":false,"pushed_at":"2024-09-12T16:42:16.000Z","size":1192,"stargazers_count":1073,"open_issues_count":24,"forks_count":182,"subscribers_count":60,"default_branch":"develop","last_synced_at":"2025-04-01T21:16:40.828Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/snazy.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGES.txt","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-11-20T00:28:09.000Z","updated_at":"2025-03-25T11:57:16.000Z","dependencies_parsed_at":"2022-08-09T09:15:24.244Z","dependency_job_id":"d484947e-ccbd-48cb-b499-aece2aeda8ba","html_url":"https://github.com/snazy/ohc","commit_stats":{"total_commits":396,"total_committers":9,"mean_commits":44.0,"dds":"0.045454545454545414","last_synced_commit":"7f59c264fe8ae4c859b9662c8cea15620f0f55f8"},"previous_names":[],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snazy%2Fohc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snazy%2Fohc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snazy%2Fohc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snazy%2Fohc/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/snazy","download_url":"https://codeload.github.com/snazy/ohc/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247934841,"owners_count":21020729,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T20:01:42.212Z","updated_at":"2025-04-08T22:19:13.172Z","avatar_url":"https://github.com/snazy.png","language":"Java","funding_links":[],"categories":["Java","缓存库","Memory and concurrency"],"sub_categories":[],"readme":"OHC - An off-heap-cache\n=======================\n\n\nTHIS PROJECT IS NO LONGER MAINTAINED!!!\n\n\n\n\n\nFeatures\n========\n\n- asynchronous cache loader support\n- optional per entry or default TTL/expireAt\n- entry eviction and expiration without a separate thread\n- capable of maintaining huge amounts of cache memory\n- suitable for tiny/small entries with low overhead using the chunked implementation\n- runs with Java 8 and Java 11 - support for Java 7 and earlier has been dropped with version 0.7.0\n- to build OHC from source, Java 11 or newer (tested with Java 11 + 15) is required\n\nPerformance\n===========\n\nOHC shall provide a good performance on both commodity hardware and big systems using non-uniform-memory-architectures.\n\nNo performance test results available yet - you may try the ohc-benchmark tool. 
See instructions below.\nA very basic impression of the speed is in the _Benchmarking_ section.\n\nRequirements\n============\n\nA Java 8 VM that supports 64-bit and has ``sun.misc.Unsafe`` (Oracle JVMs on x64 Intel CPUs).\n\nOHC is targeted for Linux and OSX. It *should* work on Windows and other Unix OSs.\n\nArchitecture\n============\n\nOHC provides two implementations for different cache entry characteristics:\n\n- The _linked_ implementation allocates off-heap memory for each entry individually and works best for medium and big entries.\n- The _chunked_ implementation allocates off-heap memory for each hash segment as a whole and is intended for small entries.\n\nLinked implementation\n---------------------\n\nThe number of segments is configured via ``org.caffinitas.ohc.OHCacheBuilder``, defaults to ``# of cpus * 2`` and must\nbe a power of 2. Entries are distributed over the segments using the most significant bits of the 64 bit hash code.\nAccesses on each segment are synchronized.\n\nEach hash-map entry is allocated individually. Entries are freed (deallocated) when they are no longer referenced by\nthe off-heap map itself or any external reference like ``org.caffinitas.ohc.DirectValueAccess`` or a\n``org.caffinitas.ohc.CacheSerializer``.\n\nThe design of this implementation keeps the time a segment is locked very short. Put/replace operations\nallocate memory first, call the ``org.caffinitas.ohc.CacheSerializer`` to serialize the key and value and then put the\nfully prepared entry into the segment.\n\nEviction is performed using an LRU algorithm. A linked list through all cached elements per segment is used to keep\ntrack of the eldest entries.\n\nChunked implementation\n----------------------\n\nChunked memory allocation off-heap implementation.\n\nThe purpose of this implementation is to reduce the overhead for relatively small cache entries compared to the linked\nimplementation since the memory for the whole segment is pre-allocated. 
This implementation is suitable for small\nentries with fast (de)serialization implementations of ``org.caffinitas.ohc.CacheSerializer``.\n\nSegmentation is the same as in the linked implementation. The number of segments is configured via\n``org.caffinitas.ohc.OHCacheBuilder``, defaults to ``# of cpus * 2`` and must be a power of 2. Entries are distributed\nover the segments using the most significant bits of the 64 bit hash code. Accesses on each segment are synchronized.\n\nEach segment is divided into multiple chunks. Each segment is responsible for a portion of the total capacity\n``(capacity / segmentCount)``. This amount of memory is allocated once up-front during initialization and logically\ndivided into a configurable number of chunks. The size of each chunk is configured using the ``chunkSize`` option in\n``org.caffinitas.ohc.OHCacheBuilder``.\n\nAs in the linked implementation, hash entries are serialized into a temporary buffer first, before the actual put\ninto a segment occurs (segment operations are synchronized).\n\nNew entries are placed into the current write chunk. When that chunk is full, the next empty chunk will become the new\nwrite chunk. 
When all chunks are full, the least recently used chunk, including all the entries it contains, is evicted.\n\nSpecifying the ``fixedKeyLength`` and ``fixedValueLength`` builder properties reduces the memory footprint by\n8 bytes per entry.\n\nSerialization, direct access and get-with-loader functions are not supported in this implementation.\n\nTo enable the chunked implementation, specify the ``chunkSize`` in ``org.caffinitas.ohc.OHCacheBuilder``.\n\nNote: the chunked implementation should still be considered experimental.\n\nEviction algorithms\n===================\n\nOHC supports three eviction algorithms:\n\n- *LRU*: The oldest (least recently used) entries are evicted to make room for new entries.\n- *Window Tiny-LFU*:\n  Entries with lower usage frequency are evicted to make room for new entries.\n  The goal of this eviction algorithm is to prevent heavily used entries from being evicted.\n  Note that the maximum size of entries is limited to the size of the eden generation, which is currently\n  fixed at 20% of the segment size (i.e. overall capacity / number of segments).\n  Each OHC cache segment is divided into an eden and a main \"generation\". New entries start in the eden generation\n  to give these time to build up their usage frequencies. When the eden generation becomes full, entries in the\n  eden generation have to pass the admission filter, which checks the frequencies of the entries in the eden\n  generation against the frequencies of the oldest (least recently used) entries in the main generation.\n  See `this article \u003chttp://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html\u003e`_ for a more thorough\n  description.\n  (Only supported in the _linked_ implementation, not supported by the chunked implementation)\n- *None*: OHC performs no eviction on its own. 
It is up to the caller to check the return values and monitor\n  free capacity.\n  (Only supported in the _linked_ implementation, not supported by the chunked implementation)\n\nConfiguration\n=============\n\nUse the class ``OHCacheBuilder`` to configure all necessary parameters, such as\n\n- number of segments (must be a power of 2), defaults to number-of-cores * 2\n- hash table size (must be a power of 2), defaults to 8192\n- load factor, defaults to .75\n- capacity for data over the whole cache\n- key and value serializers\n- default TTL\n- optional unlocked mode\n\nGenerally you should work with a large hash table. The larger the hash table, the shorter the linked-list in each\nhash partition - that means fewer linked-list walks and increased performance.\n\nThe total amount of required off heap memory is the *total capacity* plus *hash table*. Each hash bucket (currently)\nrequires 8 bytes - so the formula is ``capacity + segment_count * hash_table_size * 8``.\n\nOHC allocates off-heap memory directly bypassing Java's off-heap memory limitation. This means that all\nmemory allocated by OHC is not counted towards ``-XX:MaxDirectMemorySize``.\n\nMemory \u0026 jemalloc\n=================\n\nSince especially the linked implementation performs alloc/free operations for each individual entry, consider that\nmemory fragmentation can happen.\n\nAlso leave some head room since some allocations might still be in flight and also \"the other stuff\"\n(operating system, JVM, etc) needs memory. It depends on the usage pattern how much head room is necessary.\nNote that the linked implementation allocates memory during write operations _before_ it is counted towards the\nsegments, which will evict older entries. This means: do not dedicate all available memory to OHC.\n\nWe recommend using jemalloc to keep fragmentation low. On Unix operating systems, preload jemalloc.\n\nOSX usually does not require jemalloc for performance reasons. 
Also make sure that you are using a recent version of\njemalloc - some Linux distributions still provide quite old versions.\n\nTo preload jemalloc on Linux, use\n``export LD_PRELOAD=\u003cpath-to-libjemalloc.so\u003e``, to preload jemalloc on OSX, use\n``export DYLD_INSERT_LIBRARIES=\u003cpath-to-libjemalloc.so\u003e``. A script template for preloading can be found at the\n`Apache Cassandra project \u003chttps://github.com/apache/cassandra/blob/bf3255fc93db65b816b016958967003df38a6004/bin/cassandra#L135-L182\u003e`_.\n\nUsage\n=====\n\nQuickstart::\n\n OHCache ohCache = OHCacheBuilder.newBuilder()\n                                 .keySerializer(yourKeySerializer)\n                                 .valueSerializer(yourValueSerializer)\n                                 .build();\n\nThis quickstart uses the minimal default configuration:\n\n- total cache capacity of 64MB or 16 * number-of-cpus, whichever is smaller\n- number of segments is 2 * number of cores\n- 8192 buckets per segment\n- load factor of .75\n- your custom key serializer\n- your custom value serializer\n- no maximum serialized cache entry size\n\nSee the javadoc of ``OHCacheBuilder`` for a complete list of options.\n\nKey and value serializers need to implement the ``CacheSerializer`` interface. This interface has three methods:\n\n- ``int serializedSize(T t)`` to return the serialized size of the given object\n- ``void serialize(Object obj, DataOutput out)`` to serialize the given object to the data output\n- ``T deserialize(DataInput in)`` to deserialize an object from the data input\n\nBuilding from source\n====================\n\nClone the git repo to your local machine. Either use the stable master branch or a release tag.\n\n``git clone https://github.com/snazy/ohc.git``\n\nYou need OpenJDK 11 or newer to build from source. 
Just execute\n\n``mvn clean install``\n\nBenchmarking\n============\n\nYou need to build OHC from source because the big benchmark artifacts are not uploaded to Maven Central.\n\nExecute ``java -jar ohc-benchmark/target/ohc-benchmark-0.7.1-SNAPSHOT.jar -h`` (when building from source)\nto get some help information.\n\nGenerally the benchmark tool starts a bunch of threads and performs _get_ and _put_ operations concurrently\nusing configurable key distributions for _get_ and _put_ operations. Value size distribution also needs to be configured.\n\nAvailable command line options::\n\n -cap \u003carg\u003e    size of the cache\n -d \u003carg\u003e      benchmark duration in seconds\n -h            help, print this command\n -lf \u003carg\u003e     hash table load factor\n -r \u003carg\u003e      read-write ratio (as a double 0..1 representing the chance for a read)\n -rkd \u003carg\u003e    hot key use distribution - default: uniform(1..10000)\n -sc \u003carg\u003e     number of segments (number of individual off-heap-maps)\n -t \u003carg\u003e      threads for execution\n -vs \u003carg\u003e     value sizes - default: fixed(512)\n -wkd \u003carg\u003e    hot key use distribution - default: uniform(1..10000)\n -wu \u003carg\u003e     warm up - \u003cwork-secs\u003e,\u003csleep-secs\u003e\n -z \u003carg\u003e      hash table size\n -cs \u003carg\u003e     chunk size - if specified it will use the \"chunked\" implementation\n -fks \u003carg\u003e    fixed key size in bytes\n -fvs \u003carg\u003e    fixed value size in bytes\n -mes \u003carg\u003e    max entry size in bytes\n -unl          do not use locking - only appropriate for single-threaded mode\n -hm \u003carg\u003e     hash algorithm to use - MURMUR3, XX, CRC32\n -bh           show bucket histogram in stats\n -kl \u003carg\u003e     enable bucket histogram. 
Default: false\n\nDistributions for read keys, write keys and value sizes can be configured using the following functions::\n\n EXP(min..max)                        An exponential distribution over the range [min..max]\n EXTREME(min..max,shape)              An extreme value (Weibull) distribution over the range [min..max]\n QEXTREME(min..max,shape,quantas)     An extreme value, split into quantas, within which the chance of selection is uniform\n GAUSSIAN(min..max,stdvrng)           A gaussian/normal distribution, where mean=(min+max)/2, and stdev is (mean-min)/stdvrng\n GAUSSIAN(min..max,mean,stdev)        A gaussian/normal distribution, with explicitly defined mean and stdev\n UNIFORM(min..max)                    A uniform distribution over the range [min, max]\n FIXED(val)                           A fixed distribution, always returning the same value\n Preceding the name with ~ will invert the distribution, e.g. ~exp(1..10) will yield 10 most, instead of least, often\n Aliases: extr, qextr, gauss, normal, norm, weibull\n\n(Note: these are similar to the Apache Cassandra stress tool - if you know one, you know both ;)\n\nQuick example with a read/write ratio of ``.9``, approx 1.5GB max capacity, 16 threads that runs for 30 seconds::\n\n java -jar ohc-benchmark/target/ohc-benchmark-0.5.1-SNAPSHOT.jar\n\n\n(Note that the version in the jar file name might differ.)\n\nOn a 2.6GHz Core i7 system (OSX) the following numbers are typical running the above benchmark (.9 read/write ratio):\n\n- # of gets per second: 2500000\n- # of puts per second:  270000\n\nWhy off-heap memory\n===================\n\nWhen using a very large number of objects in a very large heap, virtual machines suffer from increased GC\npressure since the collector basically has to inspect each and every object to determine whether it can be collected and has to access all\nmemory pages. A cache shall keep a hot set of objects accessible for fast access (e.g. omit disk or network\nroundtrips). 
The only solution is to use native memory - and there you will end up with the choice either\nto use some native code (C/C++) via JNI or use direct memory access.\n\nNative code using C/C++ via JNI has the drawback that you naturally have to write C/C++ code for each and\nevery platform. Although most Unix OS (Linux, OSX, BSD, Solaris) are quite similar when dealing with things\nlike compare-and-swap or Posix libraries, you usually also want to support the other platform (Windows).\n\nBoth native code and direct memory access have the drawback that they have to \"leave\" the JVM \"context\" -\nthat is, access to off heap memory is slower than access to data in the Java heap and each JNI call\nhas some \"escape from JVM context\" cost.\n\nBut off heap memory is great when you have to deal with many gigabytes of cache memory since\nthat does not put any pressure on the Java garbage collector. Let the Java GC do its job for the application while\nthis library does its job for the cached data.\n\nWhy *not* use ByteBuffer.allocateDirect()?\n==========================================\n\nTL;DR allocating off-heap memory directly and bypassing ``ByteBuffer.allocateDirect`` is very gentle to the\nGC and we have explicit control over memory allocation and, more importantly, free. The stock implementation\nin Java frees off-heap memory during a garbage collection - also: if no more off-heap memory is available, it\nlikely triggers a Full-GC, which is problematic if multiple threads run into that situation concurrently since\nit means lots of Full-GCs sequentially. 
Further, the stock implementation uses a global, synchronized linked\nlist to track off-heap memory allocations.\n\nThis is why OHC allocates off-heap memory directly and recommends preloading jemalloc on Linux systems to\nimprove memory management performance.\n\nHistory\n=======\n\nOHC was developed in 2014/15 for `Apache Cassandra \u003chttp://cassandra.apache.org/\u003e`_ 2.2 and 3.0 to be used as the\n`new row-cache backend \u003chttps://issues.apache.org/jira/browse/CASSANDRA-7438\u003e`_.\n\nSince there were no suitable fully off-heap cache implementations available, it was decided to\nbuild a completely new one - and that's OHC. But it turned out that OHC alone might also be usable for\nother projects - that's why OHC is a separate library.\n\nContributors\n============\n\nA big 'thank you' has to go to `Benedict Elliott Smith \u003chttps://twitter.com/_belliottsmith\u003e`_ and\n`Ariel Weisberg \u003chttps://twitter.com/ArielWeisberg\u003e`_ from DataStax for their very useful input to OHC!\n\nThanks also go to `Ben Manes \u003chttps://twitter.com/benmanes\u003e`_, the author of `Caffeine \u003chttps://github.com/ben-manes/caffeine/\u003e`_,\nthe highly configurable on-heap cache using W-Tiny LFU.\n\nDeveloper: `Robert Stupp \u003chttps://twitter.com/snazy\u003e`_\n\nLicense\n=======\n\nCopyright (C) 2014 Robert Stupp, Koeln, Germany, robert-stupp.de\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\nhttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the 
License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsnazy%2Fohc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsnazy%2Fohc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsnazy%2Fohc/lists"}