{"id":13416507,"url":"https://github.com/uber/kraken","last_synced_at":"2025-05-12T13:20:31.366Z","repository":{"id":37601975,"uuid":"160626997","full_name":"uber/kraken","owner":"uber","description":"P2P Docker registry capable of distributing TBs of data in seconds","archived":false,"fork":false,"pushed_at":"2025-04-22T10:46:52.000Z","size":6928,"stargazers_count":6325,"open_issues_count":87,"forks_count":431,"subscribers_count":83,"default_branch":"master","last_synced_at":"2025-04-23T16:02:11.178Z","etag":null,"topics":["bittorrent","container","containerd","docker","docker-image","docker-registry","p2p"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/uber.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"docs/ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-12-06T06:04:35.000Z","updated_at":"2025-04-23T14:15:59.000Z","dependencies_parsed_at":"2023-01-31T14:15:34.261Z","dependency_job_id":"49c7acb7-f74c-4ab5-b293-4cd40b3af776","html_url":"https://github.com/uber/kraken","commit_stats":{"total_commits":873,"total_committers":42,"mean_commits":"20.785714285714285","dds":0.5372279495990836,"last_synced_commit":"8c3089321243c930ee3c2f1c9388c31e296726c9"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uber%2Fkraken","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uber%2Fkraken/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/uber%2Fkraken/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uber%2Fkraken/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/uber","download_url":"https://codeload.github.com/uber/kraken/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253745197,"owners_count":21957320,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bittorrent","container","containerd","docker","docker-image","docker-registry","p2p"],"created_at":"2024-07-30T21:00:59.914Z","updated_at":"2025-05-12T13:20:31.346Z","avatar_url":"https://github.com/uber.png","language":"Go","readme":"\u003cp align=\"center\"\u003e\u003cimg src=\"assets/kraken-logo-color.svg\" width=\"175\" title=\"Kraken Logo\"\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003c/a\u003e\n\u003ca href=\"https://travis-ci.com/uber/kraken\"\u003e\u003cimg src=\"https://travis-ci.com/uber/kraken.svg?branch=master\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/uber/kraken/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/release/uber/kraken.svg\" /\u003e\u003c/a\u003e\n\u003ca href=\"https://godoc.org/github.com/uber/kraken\"\u003e\u003cimg src=\"https://godoc.org/github.com/uber/kraken?status.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://goreportcard.com/badge/github.com/uber/kraken\"\u003e\u003cimg src=\"https://goreportcard.com/badge/github.com/uber/kraken\"\u003e\u003c/a\u003e\n\u003ca href=\"https://codecov.io/gh/uber/kraken\"\u003e\u003cimg 
src=\"https://codecov.io/gh/uber/kraken/branch/master/graph/badge.svg\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\nKraken is a P2P-powered Docker registry that focuses on scalability and availability. It is\ndesigned for Docker image management, replication, and distribution in a hybrid cloud environment.\nWith pluggable backend support, Kraken can easily integrate into existing Docker registry setups\nas the distribution layer. \n\nKraken has been in production at Uber since early 2018. In our busiest cluster, Kraken distributes\nmore than 1 million blobs per day, including 100k 1G+ blobs. At its peak production load, Kraken\ndistributes 20K 100MB-1G blobs in under 30 sec.\n\nBelow is the visualization of a small Kraken cluster at work:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/visualization.gif\" title=\"Visualization\"\u003e\n\u003c/p\u003e\n\n# Table of Contents\n\n- [Features](#features)\n- [Design](#design)\n- [Architecture](#architecture)\n- [Benchmark](#benchmark)\n- [Usage](#usage)\n- [Comparison With Other Projects](#comparison-with-other-projects)\n- [Limitations](#limitations)\n- [Contributing](#contributing)\n- [Contact](#contact)\n\n# Features\n\nFollowing are some highlights of Kraken:\n- **Highly scalable**. Kraken is capable of distributing Docker images at \u003e 50% of the max download\n  speed limit on every host. Cluster size and image size do not have a significant impact on\n  download speed.\n  - Supports at least 15k hosts per cluster.\n  - Supports arbitrarily large blobs/layers. We normally limit max size to 20G for the best performance.\n- **Highly available**. No component is a single point of failure.\n- **Secure**. Supports uploader authentication and data integrity protection through TLS.\n- **Pluggable storage options**. Instead of managing data, Kraken plugs into reliable blob storage\n  options, like S3, GCS, ECR, HDFS or another registry. 
The storage interface is simple and new\n  options are easy to add.\n- **Lossless cross-cluster replication**. Kraken supports rule-based async replication between\n  clusters.\n- **Minimal dependencies**. Other than pluggable storage, Kraken only has an optional dependency on\n  DNS.\n\n# Design\n\nThe high-level idea of Kraken is to have a small number of dedicated hosts seeding content to a\nnetwork of agents running on each host in the cluster.\n\nA central component, the tracker, will orchestrate all participants in the network to form a\npseudo-random regular graph.\n\nSuch a graph has high connectivity and a small diameter. As a result, even with only one seeder and\nhaving thousands of peers joining in the same second, all participants can reach a minimum of 80%\nmax upload/download speed in theory (60% with current implementation), and performance doesn't\ndegrade much as the blob size and cluster size increase. For more details, see the team's [tech\ntalk](https://www.youtube.com/watch?v=waVtYYSXkXU) at KubeCon + CloudNativeCon.\n\n# Architecture\n\n![](assets/architecture.svg)\n\n- Agent\n  - Deployed on every host\n  - Implements Docker registry interface\n  - Announces available content to tracker\n  - Connects to peers returned by the tracker to download content\n- Origin\n  - Dedicated seeders\n  - Stores blobs as files on disk backed by pluggable storage (e.g. 
S3, GCS, ECR)\n  - Forms a self-healing hash ring to distribute the load\n- Tracker\n  - Tracks which peers have what content (both in-progress and completed)\n  - Provides ordered lists of peers to connect to for any given blob\n- Proxy\n  - Implements Docker registry interface\n  - Uploads each image layer to the responsible origin (remember, origins form a hash ring)\n  - Uploads tags to build-index\n- Build-Index\n  - Mapping of the human-readable tag to blob digest\n  - No consistency guarantees: the client should use unique tags\n  - Powers image replication between clusters (simple duplicated queues with retry)\n  - Stores tags as files on disk backed by pluggable storage (e.g. S3, GCS, ECR)\n\n# Benchmark\n\nThe following data is from a test where a 3G Docker image with 2 layers is downloaded by 2600 hosts\nconcurrently (5200 blob downloads), with 300MB/s speed limit on all agents (using 5 trackers and\n5 origins):\n\n![](assets/benchmark.svg)\n\n- p50 = 10s (at speed limit)\n- p99 = 18s\n- p99.9 = 22s\n\n# Usage\n\nAll Kraken components can be deployed as Docker containers. 
To build the Docker images:\n```\n$ make images\n```\nFor information about how to configure and use Kraken, please refer to the [documentation](docs/CONFIGURATION.md).\n\n## Kraken on Kubernetes\n\nYou can use our example Helm chart to deploy Kraken (with an example HTTP fileserver backend) on\nyour k8s cluster:\n```\n$ helm install --name=kraken-demo ./helm\n```\nOnce deployed, every node will have a Docker registry API exposed on `localhost:30081`.\nFor an example pod spec that pulls images from a Kraken agent, see [example](examples/k8s/demo.json).\n\nFor more information on k8s setup, see [README](examples/k8s/README.md).\n\n## Devcluster\n\nTo start a herd container (which contains origin, tracker, build-index and proxy) and two agent\ncontainers with development configuration:\n```\n$ make devcluster\n```\n\nDocker-for-Mac is required for making devcluster work on your laptop.\nFor more information on devcluster, please check out devcluster [README](examples/devcluster/README.md).\n\n# Comparison With Other Projects\n\n## Dragonfly from Alibaba\n\nA Dragonfly cluster has one or a few \"supernodes\" that coordinate the transfer of every 4MB chunk of data\nin the cluster.\n\nWhile the supernode would be able to make optimal decisions, the throughput of the whole cluster is\nlimited by the processing power of one or a few hosts, and the performance would degrade linearly as\neither blob size or cluster size increases.\n\nKraken's tracker only helps orchestrate the connection graph and leaves the negotiation of actual data\ntransfer to individual peers, so Kraken scales better with large blobs.\nOn top of that, Kraken is HA and supports cross-cluster replication, both of which are required for a\nreliable hybrid cloud setup.\n\n## BitTorrent\n\nKraken was initially built with a BitTorrent driver. However, we ended up implementing our own P2P\ndriver based on the BitTorrent protocol to allow for tighter integration with storage solutions and more\ncontrol over performance 
optimizations.\n\nKraken's problem space is slightly different from what BitTorrent was designed for. Kraken's goal is\nto reduce global max download time and communication overhead in a stable environment, while\nBitTorrent was designed for an unpredictable and adversarial environment, so it needs to preserve more\ncopies of scarce data and defend against malicious or badly behaving peers.\n\nDespite the differences, we re-examine Kraken's protocol from time to time, and if it's feasible, we\nhope to make it compatible with BitTorrent again.\n\n# Limitations\n\n- If Docker registry throughput is not the bottleneck in your deployment workflow, switching to\nKraken will not magically speed up your `docker pull`. To speed up `docker pull`, consider\nswitching to [Makisu](https://github.com/uber/makisu) to improve layer reusability at build time, or\ntweak compression ratios, as `docker pull` spends most of the time on data decompression.\n- Mutating tags (e.g. updating a `latest` tag) is allowed. However, a few things will not work: tag\nlookups immediately afterwards will still return the old value due to Nginx caching, and replication\nprobably won't trigger. We are working on supporting this functionality better. If you need tag\nmutation support right now, please reduce the cache interval of the build-index component. If you also need\nreplication in a multi-cluster setup, please consider setting up another Docker registry as Kraken's\nbackend.\n- Theoretically, Kraken should distribute blobs of any size without significant performance\ndegradation, but at Uber, we enforce a 20G limit and cannot endorse the production use of\nultra-large blobs (e.g. 100G+). Peers enforce connection limits on a per-blob basis, and new peers\nmight be starved for connections if no peers become seeders relatively soon. 
If you have ultra-large\nblobs you'd like to distribute, we recommend breaking them into \u003c10G chunks first.\n\n# Contributing\n\nPlease check out our [guide](docs/CONTRIBUTING.md).\n\n# Contact\n\nTo contact us, please join our [Slack channel](https://join.slack.com/t/uber-container-tools/shared_invite/enQtNTIxODAwMDEzNjM1LWIwYzIxNmUwOGY3MmVmM2MxYTczOTQ4ZDU0YjAxMTA0NDgyNzdlZTA4ZWVkZGNlMDUzZDA1ZTJiZTQ4ZDY0YTM).\n","funding_links":[],"categories":["Docker Images","Go","Misc","By Industry","docker","p2p","By Language","Projects","Kubernetes"],"sub_categories":["Registry","DevOps","Go","Data Storage and Sharing","Kubernetes // Image Registries and Image Distribution"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuber%2Fkraken","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fuber%2Fkraken","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuber%2Fkraken/lists"}