https://github.com/dennwc/cas
Content Addressable Storage
- Host: GitHub
- URL: https://github.com/dennwc/cas
- Owner: dennwc
- License: apache-2.0
- Created: 2018-09-01T11:47:57.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-03-18T15:33:57.000Z (8 months ago)
- Last Synced: 2024-10-14T12:11:43.403Z (about 1 month ago)
- Topics: content-addressable-storage, golang
- Language: Go
- Size: 149 KB
- Stars: 41
- Watchers: 5
- Forks: 3
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
# Content Addressable Storage
[![Go Reference](https://pkg.go.dev/badge/github.com/dennwc/cas.svg)](https://pkg.go.dev/github.com/dennwc/cas)
[![Join the chat at https://app.gitter.im/#/room/#dennwc_cas:gitter.im](https://badges.gitter.im/dennwc/cas.svg)](https://app.gitter.im/#/room/#dennwc_cas:gitter.im)

This project implements a simple and pragmatic approach to Content Addressable Storage (CAS).
It was heavily influenced by [Perkeep](https://perkeep.org/) (aka Camlistore) and Git.

For more details, see [concepts](./docs/concepts.md) and [comparison](./docs/comparison.md) with other systems.
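
As a rough illustration of the content-addressing idea itself, here is a minimal sketch in which a blob's key is derived from its bytes, so identical content always maps to the same reference. The `Store` type, its methods, and the `sha256:` prefix are hypothetical names chosen for this example; they are not this project's API or on-disk format.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Store is a hypothetical in-memory content-addressable store.
// It exists only for illustration and is not part of this project.
type Store struct {
	blobs map[string][]byte
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{blobs: make(map[string][]byte)}
}

// Put saves data under a key derived from its SHA-256 digest and
// returns that key. Storing the same bytes twice is a no-op.
func (s *Store) Put(data []byte) string {
	sum := sha256.Sum256(data)
	ref := "sha256:" + hex.EncodeToString(sum[:]) // assumed key format
	if _, ok := s.blobs[ref]; !ok {
		// Copy the slice so later changes by the caller cannot
		// alter the stored blob.
		s.blobs[ref] = append([]byte(nil), data...)
	}
	return ref
}

// Get returns the blob stored under ref, if any.
func (s *Store) Get(ref string) ([]byte, bool) {
	b, ok := s.blobs[ref]
	return b, ok
}

func main() {
	s := NewStore()
	a := s.Put([]byte("hello"))
	b := s.Put([]byte("hello"))
	fmt.Println(a == b, len(s.blobs)) // prints "true 1"
}
```

Because the key depends only on the content, the second `Put` neither creates a new entry nor changes the reference, which is what makes deduplication and immutability cheap in this kind of design.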
## Status
The project is stable, and work is ongoing on the design of CAS2, a more flexible and performant version.
This project will receive bug fixes and maintenance work. New features will likely end up in CAS2.

Check the [Quick start guide](./docs/quickstart.md) for a list of basic commands.
## Goals
- **Simplicity:** the core specification should be trivial to implement.
- **Interop:** CAS should play nicely with existing tools and technologies,
  either content-addressable or not.
- **Easy to use:** CAS should be a single command away, similar to `git init`.
## Use cases
- Immutable and versioned archives: CAS supports files with multiple
  TBs of data and folders with millions of files, and can index and use remote
  data without storing it locally.
- Data processing pipelines: CAS's caching capabilities allow it to be used for
  incremental data pipelines.
- Git for large files: CAS stores files on the assumption that they can
  be multiple TBs and is optimized for this use case, while still supporting
  tags and branches, like Git.

## Features and the roadmap
**Implemented:**

- Fast file hashing
  - SHA-256, others can be used
  - Stores results in file attributes (cache)
- Support for large archives
  - Large contiguous files (> TB)
  - Large multipart files (> TB)
  - Large directories (> millions of files)
  - Zero-copy file fetch (BTRFS)
- Integrations
  - Can index and sync web content
  - HTTP(S) caching (as a Go library)
- Remote storage
  - Self-hosted HTTP CAS server (read-only)
  - Google Cloud Storage
- Usability
  - Mutable objects (pins)
  - Local storage in Git fashion
- Data pipelines
  - Extendable
  - Caches results
  - Incremental

**Planned (for CAS2):**

- Support for large multipart files (> TB)
  - Support multilevel parts
  - Support blob splitters (rolling checksum, new line, etc.)
- Remote storage
  - AWS, etc.
  - Self-hosted HTTP CAS server (read-write)
- Integration with Git
  - Zero-copy fetch from Git (either remote or local)
  - LFS integration
- Integration with Docker
  - Zero-copy fetch of an image from Docker
  - Unpack FS images to CAS
  - Use containers in pipelines
- Integration with BitTorrent:
  - Store torrent files
  - Download torrent data directly to CAS
  - To consider: expose CAS as a peer
- Integration with other CAS systems:
  - Perkeep
  - Upspin
  - IPFS
- Windows and OSX support
- Better support for pipelines