{"id":22363741,"url":"https://github.com/teran/archived","last_synced_at":"2025-07-30T15:31:23.145Z","repository":{"id":247278995,"uuid":"824858706","full_name":"teran/archived","owner":"teran","description":"Cloud native service to store versioned data in space-efficient manner","archived":false,"fork":false,"pushed_at":"2024-12-02T05:54:31.000Z","size":1734,"stargazers_count":4,"open_issues_count":24,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-12-02T06:29:50.684Z","etag":null,"topics":["apt","aptitude","cas","dnf","package-mirror","package-repository","rpm","s3","yum"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/teran.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-06T06:27:24.000Z","updated_at":"2024-12-02T05:54:34.000Z","dependencies_parsed_at":"2024-08-27T22:17:59.304Z","dependency_job_id":"799f3d5a-716b-4032-af1e-386cc48a48d5","html_url":"https://github.com/teran/archived","commit_stats":null,"previous_names":["teran/archived"],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teran%2Farchived","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teran%2Farchived/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teran%2Farchived/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/teran%2Farchived/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/teran","download_url":"https://codeload.github.com/teran/archived/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228154009,"owners_count":17877706,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apt","aptitude","cas","dnf","package-mirror","package-repository","rpm","s3","yum"],"created_at":"2024-12-04T17:17:00.098Z","updated_at":"2025-07-30T15:31:23.117Z","avatar_url":"https://github.com/teran.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# archived\n\n[![Verify](https://github.com/teran/archived/actions/workflows/verify.yml/badge.svg?branch=master)](https://github.com/teran/archived/actions/workflows/verify.yml)\n[![Go Report Card](https://goreportcard.com/badge/github.com/teran/archived)](https://goreportcard.com/report/github.com/teran/archived)\n[![Go Reference](https://pkg.go.dev/badge/github.com/teran/archived.svg)](https://pkg.go.dev/github.com/teran/archived)\n\nCloud native service to store versioned data in space-efficient manner\n\narchived is a good fit when you have a body of low-cardinality data to share\nwith a number of users or systems. A good example is an APT/RPM repository.\n\n## Project status \u0026 roadmap\n\narchived is under active development and almost everything is subject\nto change. 
An MVP was already implemented as of v0.0.1 to prove the concepts\nused in archived.\n\nThe complete feature list is available in the [repository issues](https://github.com/teran/archived/issues)\n\n## How it works\n\narchived is inspired by `rsync --link-dest`, which has made it possible for decades to store package\nmirrors without duplicating data. archived frees this\napproach from local file systems by using modern storage services\nsuch as S3 under the hood.\n\nTo do so, archived relies on two storages: metadata and CAS.\n\nMetadata is a database that stores the following entities:\n\n* namespaces - groups of containers\n* containers - roughly analogous to directories\n* versions - immutable versions of the data in a container\n* objects - named data BLOBs with some additional metadata\n\nA good example of metadata storage is a PostgreSQL database.\n\nCAS storage is BLOB storage that holds the data behind objects.\nCAS stands for Content Addressed Storage, which describes\nhow it operates: it stores BLOBs under a unique key derived from their content (SHA256\nis used by default).\n\nA good example of CAS storage is S3.\n\nThis approach reduces raw storage usage by linking duplicates instead\nof storing copies.\n\n## archived components\n\narchived is built with a microservice architecture consisting of the following\ncomponents:\n\n* archived-publisher - HTTP server to allow data listing and fetching\n* archived-manager - gRPC API to manage namespaces, containers, versions and\n    objects\n* archived-exporter - Prometheus metrics exporter for metadata entities\n* CLI - CLI application to interact with the manager component\n* migrator - metadata migration tool\n* archived-gc - garbage collector\n\n## Deploy\n\narchived is distributed as a number of prebuilt binaries, so it can be deployed\nin any way you choose, from systemd services to Kubernetes.\n\nThe main things to know before deployment:\n\n* archived-publisher can use an RO replica 
of PostgreSQL for operation\n    and can scale\n* archived-manager requires an RW PostgreSQL instance since it performs\n    writes; it can also scale\n* archived-exporter only needs to run as a single copy since it just\n    provides metrics for the metadata entities; RO replica access is enough\n* archived-migrator must be run on every archived upgrade, right before\n    the other components\n* archived-cli can run anywhere but requires network access to\n    archived-manager\n* archived-gc requires RW PostgreSQL and runs periodically as a job\n* there's no authentication at any stage at the moment (yes, even for\n    cli/manager)\n\n![diagram](docs/_assets/components.png)\n\nExample Kubernetes deployment specs are available in the\n[docs/examples/deploy/k8s](docs/examples/deploy/k8s) directory.\n\nThe full configuration reference is available at [docs/configuration.md](docs/configuration.md).\n\n## CLI\n\narchived-cli provides a CLI to operate archived, including creating\nnamespaces, containers, versions and objects. 
It works with archived-manager\nto handle requests.\n\n```shell\nusage: archived-cli --endpoint=ENDPOINT [\u003cflags\u003e] \u003ccommand\u003e [\u003cargs\u003e ...]\n\nCLI interface for archived\n\n\nFlags:\n      --[no-]help            Show context-sensitive help (also try --help-long and --help-man).\n  -d, --[no-]debug           Enable debug mode ($ARCHIVED_CLI_DEBUG)\n  -t, --[no-]trace           Enable trace mode (debug mode on steroids) ($ARCHIVED_CLI_TRACE)\n  -s, --endpoint=ENDPOINT    Manager API endpoint address ($ARCHIVED_CLI_ENDPOINT)\n      --[no-]insecure        Do not use TLS for gRPC connection\n      --[no-]insecure-skip-verify\n                             Do not perform TLS certificate verification for gRPC connection\n      --cache-dir=\"~/.cache/archived/cli/objects\"\n                             Stat-cache directory for objects ($ARCHIVED_CLI_STAT_CACHE_DIR)\n  -n, --namespace=\"default\"  namespace for containers to operate on\n\nCommands:\nhelp [\u003ccommand\u003e...]\n    Show help.\n\nnamespace create \u003cname\u003e\n    create new namespace\n\nnamespace rename \u003cold-name\u003e \u003cnew-name\u003e\n    rename the given namespace\n\nnamespace delete \u003cname\u003e\n    delete the given namespace\n\nnamespace list\n    list namespaces\n\ncontainer create [\u003cflags\u003e] \u003cname\u003e\n    create new container\n\ncontainer move \u003cname\u003e \u003cnamespace\u003e\n    move container to another namespace\n\ncontainer rename \u003cold-name\u003e \u003cnew-name\u003e\n    rename the given container\n\ncontainer delete \u003cname\u003e\n    delete the given container\n\ncontainer set [\u003cflags\u003e] \u003cname\u003e\n    set parameters for container\n\ncontainer list\n    list containers\n\nversion create [\u003cflags\u003e] \u003ccontainer\u003e\n    create new version for given container\n\nversion delete \u003ccontainer\u003e \u003cversion\u003e\n    delete the given version\n\nversion list \u003ccontainer\u003e\n   
 list versions for the given container\n\nversion publish \u003ccontainer\u003e \u003cversion\u003e\n    publish the given version\n\nobject list \u003ccontainer\u003e \u003cversion\u003e\n    list objects in the given container and version\n\nobject url \u003ccontainer\u003e \u003cversion\u003e \u003ckey\u003e\n    get URL for the object\n\nobject delete \u003ccontainer\u003e \u003cversion\u003e \u003ckey\u003e\n    delete object\n\nstat-cache show-path\n    print actual cache path\n```\n\n## How to build the project manually\n\narchived requires the following dependencies to build:\n\n* Go v1.22+ (prior versions not tested)\n* goreleaser v2.0+ (prior versions not tested)\n* protoc-gen-go v1.34+ (prior versions not tested)\n* protoc-gen-go-grpc v1.4 (prior versions not tested)\n* docker (to build container images, run some tests)\n\nTo build the project, run:\n\n```shell\ngo generate ./...\ngoreleaser build --snapshot --clean\n```\n\nTo build container images:\n\n```shell\ndocker-compose build\n```\n\nor build them manually by running:\n\n```shell\ndocker build -f dockerfiles/Dockerfile.component .\n```\n\nwhere `component` is one of publisher, manager, migrator, etc.\n\n## Local development\n\nIn some cases it's convenient to run the whole stack locally.\narchived provides a `docker-compose` setup to do that from prebuilt images:\n\n```shell\ndocker-compose up\n```\n\nor from a custom build:\n\n```shell\ngo generate -v ./... \u0026\u0026 \\\ngoreleaser build --snapshot --clean \u0026\u0026 \\\ndocker-compose build \u0026\u0026 \\\ndocker-compose up || docker-compose down\n```\n\nPlease note that the `docker-compose down` at the end will automatically remove\ncontainers on stop. 
Remove it if you don't need that behavior.\n\n## Run tests locally\n\nSimply run:\n\n```shell\ngo test ./...\n```\n\nPlease note that running the tests requires Docker, since the tests\nuse [go-docker-testsuite](https://github.com/teran/go-docker-testsuite)\nto run component dependencies such as PostgreSQL or memcached.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fteran%2Farchived","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fteran%2Farchived","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fteran%2Farchived/lists"}