{"id":21965860,"url":"https://github.com/dasebe/adaptsize","last_synced_at":"2025-04-24T03:47:11.851Z","repository":{"id":92429999,"uuid":"81961714","full_name":"dasebe/AdaptSize","owner":"dasebe","description":"A caching system that maximizes hit ratios under highly variable traffic.","archived":false,"fork":false,"pushed_at":"2018-01-30T18:25:08.000Z","size":119,"stargazers_count":41,"open_issues_count":1,"forks_count":5,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-04-24T03:47:06.306Z","etag":null,"topics":["c","cache","caching-strategies","reverse-proxy","webcache"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dasebe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-02-14T15:50:45.000Z","updated_at":"2025-03-11T16:16:38.000Z","dependencies_parsed_at":"2023-03-08T20:30:35.767Z","dependency_job_id":null,"html_url":"https://github.com/dasebe/AdaptSize","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dasebe%2FAdaptSize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dasebe%2FAdaptSize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dasebe%2FAdaptSize/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dasebe%2FAdaptSize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dasebe","download_url":"http
s://codeload.github.com/dasebe/AdaptSize/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250560011,"owners_count":21450168,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","cache","caching-strategies","reverse-proxy","webcache"],"created_at":"2024-11-29T12:52:55.054Z","updated_at":"2025-04-24T03:47:11.841Z","avatar_url":"https://github.com/dasebe.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# The AdaptSize Caching System\n\nAdaptSize is a caching system for the first-level memory cache in a CDN or in a reverse proxy of a large website.\n\nCDN Memory caches serve high traffic volumes and are rarely sharded (sharding is used for second-level SSD caches). 
Typically, this means that hit ratios of first-level memory caches are low and highly variable.\n\nAdaptSize's mission is\n\n - to maximize memory cache hit ratios for CDN workloads,\n - to make the cache robust against traffic variability,\n - while not imposing any throughput overhead.\n\nAdaptSize is built on top of [Varnish Cache](https://github.com/varnishcache/varnish-cache/), the \"high-performance HTTP accelerator\".\n\nA detailed description of AdaptSize is available in our [Paper (PDF)](https://www.usenix.org/system/files/conference/nsdi17/nsdi17-berger.pdf) and our [talk slides (PDF)](https://www.usenix.org/sites/default/files/conference/protected-files/nsdi17_slides_berger.pdf)  or as [audio/video recording](https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/berger).\n\n## Example: comparison to Varnish cache on production traffic\n\nWe replay a production trace from an edge cache that serves highly multiplexed traffic with a variety of different traffic patterns. An unmodified Varnish version achieves an average hit ratio of 0.42. We find that many web objects are evicted before being requested again. Varnish performance is also highly variable: the hit ratio's [coefficient of variation](https://en.wikipedia.org/wiki/Coefficient_of_variation) is 23%.\n\nAdaptSize achieves a hit ratio of 0.66, which is a 1.57x improvement over unmodified Varnish. Additionally, AdaptSize stabilizes performance: the hit ratio's coefficient of variation is 5%, which is a 4.6x improvement.\n\n![Hit ratio of AdaptSize and Varnish on a production trace](https://cloud.githubusercontent.com/assets/9959772/22971000/796f6354-f374-11e6-8993-d454c6fb8f4b.png)\n\n**Figure 1: AdaptSize consistently improves the hit ratio when compared to unmodified Varnish.**\n\nWhile AdaptSize significantly improves the hit ratio, it does not impose any throughput overhead. 
Specifically, AdaptSize does not add any synchronization locks and thus scales exactly like an unmodified Varnish does.\n\n![Throughput of AdaptSize and Varnish](https://cloud.githubusercontent.com/assets/9959772/22971202/40cf0576-f375-11e6-933f-d5c4722b0ab0.png)\n\n**Figure 2: AdaptSize achieves the same throughput as Varnish, for any hit ratio. Left plot shows high hit ratio scenario, right plot shows low hit ratio scenario.**\n\n## How AdaptSize works\n\nAdaptSize is a new caching policy. Caching policies make two types of decisions: which objects to admit into the cache and which ones to evict.\n\nAlmost all prior work on caching policies focuses on the eviction policy (see [this Wikipedia article](https://en.wikipedia.org/wiki/Cache_replacement_policies) or the [webcachesim code base](https://github.com/dasebe/webcachesim)). Popular eviction policies are often LRU or FIFO variants. Varnish uses a \"concurrent\" LRU variant and admits every object by default.\n\nTo see why eviction policies by themselves are not enough, consider this scenario.\n\n\u003e Imagine that there are only two types of objects: 9999 small objects of size 100 KB (say, small web pages) and 1 large object of size 500 MB (say, a software download). Further, assume that all objects are equally popular and requested forever in round-robin order. Suppose that our hot object cache (HOC) has a capacity of 1 GB.\n\u003e A HOC that does not use admission control cannot achieve an object hit ratio (OHR) above 0.5. Every time the large object is requested, it pushes out ~5000 small objects. It does not matter which objects are evicted: when the evicted objects are requested, they cannot contribute to the OHR.\n\nWhile this scenario is heavily simplified, variants of it occur frequently under real production traffic. In fact, production traces regularly contain requests to objects ranging in size from 1 B to several GB.\n\nNote that a simple admission policy can boost the hit ratio in the toy scenario above. 
For example, if the cache admits no object above 100 KB, the overall hit ratio will almost double to 0.99. Unfortunately, simple size thresholds like this are not very robust against changes in the request traffic.\n\nAdaptSize uses a new admission policy that incorporates both object size and popularity. The idea is to use a probability that depends on the object size:\n\n- small objects are admitted with high probability\n- medium-sized objects are admitted with a small probability, so if they are requested frequently, they will be admitted eventually\n- very large objects have such a small admission probability that they are rarely admitted (unless very popular).\n\nAdaptSize continuously optimizes this mapping of size to admission probability using a new mathematical model. This model works based on observations of the most recent traffic and is used to derive the admission policy that maximizes the cache hit ratio.\n\n\n## Installing AdaptSize\n\nAdaptSize is a proof of concept and is not ready for production use. This repository contains the source code of the AdaptSize library (the math model) and the glue code to incorporate this model into the Varnish caching system.\n\nHere are the steps to recreate our test setup.\n\n### Step 0: Install dependencies\n\nWe need to compile Varnish from scratch and thus need [the same dependencies](https://varnish-cache.org/docs/trunk/installation/install.html).\n\nSomething like this might work:\n\n    sudo apt-get install -y autotools-dev make automake libtool pkg-config\n\n\n### Step 1: Check out AdaptSize and download Varnish source code\n\nObtain a copy of [AdaptSize](https://github.com/dasebe/AdaptSize/archive/master.zip) and [Varnish 4.1.2](https://varnish-cache.org/releases/rel4.1.2.html).\n\nUnpack AdaptSize and navigate into the AdaptSize folder. 
In that folder, unpack the Varnish source into a subdirectory named varnish-4.1.2.\n\n    wget https://repo.varnish-cache.org/source/varnish-4.1.2.tar.gz\n    tar xfvz varnish-4.1.2.tar.gz\n\n\n### Step 2: Patch, compile, and install Varnish\n\nWe need to apply three small patches to the Varnish code base.\n\n    patch varnish-4.1.2/bin/varnishd/cache/cache_req_fsm.c \u003c VarnishPatches/cache_req_fsm.patch\n    patch varnish-4.1.2/include/tbl/params.h \u003c VarnishPatches/params.patch\n    patch varnish-4.1.2/lib/libvarnishapi/vsl_dispatch.c \u003c VarnishPatches/vsl_dispatch.patch\n\n\n\nThen we can compile and install as usual:\n\n    cd varnish-4.1.2\n    ./configure --prefix=/usr/local/varnish/\n    make\n    make install\n    \n\n### Step 3: Compile and install AdaptSize Vmod\n\nThe AdaptSize Vmod performs the actual admission control and relies on the second patch from above.\nYou may need to adjust the config path to your actual install path.\n\n    cd AdaptSizeVmod\n    export PKG_CONFIG_PATH=/usr/local/varnish/lib/pkgconfig\n    ./autogen.sh --prefix=/usr/local/varnish\n    ./configure --prefix=/usr/local/varnish/\n    make\n    make install\n\n\n### Step 4: Compile AdaptSize tuning module\n\nThis program runs in parallel with Varnish and automatically tunes the size threshold parameter based on live statistics from the cache.\n\n     cd AdaptSizeTuner\n     make\n\n### Step 5: Run an experiment\n\nCreate an experimental setup with a client and backend service, e.g., the one [we used ourselves](https://github.com/dasebe/webtracereplay).\n\nConfigure Varnish with a [VCL file](http://varnish-cache.org/docs/4.0/reference/vcl.html) that enforces the admission decisions made by the AdaptSize Vmod. Specifically, autoparam is called on a cache miss (in vcl_backend_response) with the object's size (beresp.http.Content-Length). 
It returns a bool that indicates whether the object should be admitted (true) or should bypass the cache (false).\n\nAn example VCL could look like this:\n\n      vcl 4.0;\n      \n      # use the AdaptSizeVmod\taka autoparam\n      import autoparam;\n      \n      backend default {\n          .host = \"127.0.0.1\";\n          .port = \"8000\";\n      }\n      \n      sub vcl_backend_response {\n        if (!autoparam.explru(beresp.http.Content-Length)) {\n           set beresp.uncacheable = true;\n           set beresp.ttl = 0s;\n           return (deliver);\n        }\n      }\n\nAfter starting Varnish (and ideally before sending any requests to it), start the AdaptSizeTuner:\n\n     AdaptSizeTuner/adaptsizetuner $varnishfolder $varnishadm $cachesize ExpLRU 1\n\nYou will need to replace $varnishfolder with the path where your Varnish instance's vsm file lives (the basename of the -N parameter of varnishstat etc); $varnishadm should be the path to the varnishadm executable.\nThe $cachesize parameter should be your cache's capacity in GB.\n\n## Installing additional tools\n\nThese programs are not part of AdaptSize but were used to create plots and statistics.\n\nDetailed hit ratio statistics\n\n      cd VarnishHitStats\n      make\n\nOther types of statistics\n\n      cd VarnishOtherStats\n      make\n\n## References\n\nWe ask that academic works that build on this code reference the AdaptSize paper:\n\n    AdaptSize: Orchestrating the Hot Object Memory Cache in a CDN\n    Daniel S. Berger, Ramesh K. Sitaraman, Mor Harchol-Balter\n    To appear in USENIX NSDI in March 2017.\n    \nYou can find more information on [USENIX NSDI 2017 here.](https://www.usenix.org/conference/nsdi17/technical-sessions)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdasebe%2Fadaptsize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdasebe%2Fadaptsize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdasebe%2Fadaptsize/lists"}