https://github.com/dennwc/cas

Content Addressable Storage

Host: GitHub
URL: https://github.com/dennwc/cas
Owner: dennwc
License: apache-2.0
Created: 2018-09-01T11:47:57.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2024-03-18T15:33:57.000Z (over 1 year ago)
Last Synced: 2025-03-18T08:11:19.006Z (4 months ago)
Topics: content-addressable-storage, golang
Language: Go
Size: 149 KB
Stars: 47
Watchers: 4
Forks: 3
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE

# Content Addressable Storage

[![Go Reference](https://pkg.go.dev/badge/github.com/dennwc/cas.svg)](https://pkg.go.dev/github.com/dennwc/cas)
[![Join the chat at https://app.gitter.im/#/room/#dennwc_cas:gitter.im](https://badges.gitter.im/dennwc/cas.svg)](https://app.gitter.im/#/room/#dennwc_cas:gitter.im)

This project implements a simple and pragmatic approach to Content Addressable Storage (CAS).
It was heavily influenced by [Perkeep](https://perkeep.org/) (aka Camlistore) and Git.

For more details, see [concepts](./docs/concepts.md) and [comparison](./docs/comparison.md) with other systems.
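
The core idea behind content addressing can be sketched in a few lines of Go. This is only an illustration built on the standard library, not this project's API: the `Ref` type, the `MemStore` type, and the `sha256:` prefix are assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// Ref is a hypothetical content reference: an algorithm prefix followed by
// the hex-encoded digest of the blob.
type Ref string

// refOf derives the reference from the blob's bytes alone.
func refOf(data []byte) Ref {
	sum := sha256.Sum256(data)
	return Ref("sha256:" + hex.EncodeToString(sum[:]))
}

// ErrNotFound is returned when no blob is stored under a reference.
var ErrNotFound = errors.New("blob not found")

// MemStore is a toy in-memory blob store keyed by content hash.
type MemStore struct {
	blobs map[Ref][]byte
}

func NewMemStore() *MemStore {
	return &MemStore{blobs: make(map[Ref][]byte)}
}

// Put stores a blob and returns its reference. Storing identical bytes
// twice yields the same reference and keeps a single copy.
func (s *MemStore) Put(data []byte) Ref {
	ref := refOf(data)
	if _, ok := s.blobs[ref]; !ok {
		s.blobs[ref] = append([]byte(nil), data...)
	}
	return ref
}

// Get returns the blob for a reference and re-checks it against its hash.
func (s *MemStore) Get(ref Ref) ([]byte, error) {
	data, ok := s.blobs[ref]
	if !ok {
		return nil, ErrNotFound
	}
	if refOf(data) != ref {
		return nil, errors.New("blob does not match its reference")
	}
	return data, nil
}

func main() {
	s := NewMemStore()
	a := s.Put([]byte("hello"))
	b := s.Put([]byte("hello")) // deduplicated: same bytes, same reference
	fmt.Println(a == b, len(s.blobs))
	fmt.Println(a)
}
```

Because the reference is derived from the bytes, a reader can verify any blob it receives, and identical content is stored once no matter how many times it is added.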

## Status

The project is stable. Work is ongoing on the design of CAS2, a more flexible and performant version.
This project will receive bug fixes and maintenance work. New features will likely end up in CAS2.

Check the [Quick start guide](./docs/quickstart.md) for a list of basic commands.

## Goals

- **Simplicity:** the core specification should be trivial to implement.

- **Interop:** CAS should play nicely with existing tools and technologies,
either content-addressable or not.

- **Easy to use:** CAS should be a single command away, similar to `git init`.

## Use cases

- Immutable and versioned archives: CAS supports files with multiple
TBs of data and folders with millions of files, and can index and use
remote data without storing it locally.

- Data processing pipelines: CAS's caching capabilities make it suitable for
incremental data pipelines.

- Git for large files: CAS stores files with an assumption that they can
be multiple TBs and is optimized for this use case, while still supporting
tags and branches, like Git.
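
To make the incremental pipeline use case concrete, here is a minimal sketch of how content hashes can drive result caching. It is not this project's pipeline API: the `Step` and `Cache` types and the key layout are hypothetical, and the sketch assumes each step is deterministic.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Step is a hypothetical pipeline stage: a named, deterministic transformation.
type Step struct {
	Name string
	Run  func(in []byte) []byte
}

// Cache memoizes step outputs by the hash of (step name, input bytes).
type Cache struct {
	results map[string][]byte
	Runs    int // number of times a step actually executed
}

func NewCache() *Cache {
	return &Cache{results: make(map[string][]byte)}
}

// cacheKey hashes the step name and the input together.
func cacheKey(step string, in []byte) string {
	h := sha256.New()
	h.Write([]byte(step))
	h.Write([]byte{0}) // separator so name and input cannot run together
	h.Write(in)
	return hex.EncodeToString(h.Sum(nil))
}

// Apply returns the cached output when the same step has already seen the
// same input, and runs the step otherwise.
func (c *Cache) Apply(s Step, in []byte) []byte {
	key := cacheKey(s.Name, in)
	if out, ok := c.results[key]; ok {
		return out
	}
	out := s.Run(in)
	c.Runs++
	c.results[key] = out
	return out
}

func main() {
	upper := Step{Name: "upper", Run: func(in []byte) []byte {
		return []byte(strings.ToUpper(string(in)))
	}}
	c := NewCache()
	inputs := [][]byte{[]byte("a.txt contents"), []byte("b.txt contents")}
	for _, in := range inputs {
		c.Apply(upper, in)
	}
	// Second pass: only the edited input is recomputed.
	inputs[1] = []byte("b.txt contents, edited")
	for _, in := range inputs {
		c.Apply(upper, in)
	}
	fmt.Println("step executions:", c.Runs)
}
```

The second pass re-runs the step only for the input whose bytes changed, which is the property an incremental pipeline relies on.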

## Features and the roadmap

**Implemented:**

- Fast file hashing
  - SHA-256, others can be used
  - Stores results in file attributes (cache)
- Support for large archives
  - Large contiguous files (> TB)
  - Large multipart files (> TB)
  - Large directories (> millions of files)
  - Zero-copy file fetch (BTRFS)
- Integrations
  - Can index and sync web content
  - HTTP(S) caching (as a Go library)
- Remote storage
  - Self-hosted HTTP CAS server (read-only)
  - Google Cloud Storage
- Usability
  - Mutable objects (pins)
  - Local storage in Git fashion
- Data pipelines
  - Extendable
  - Caches results
  - Incremental
**Planned (for CAS2):**

- Support for large multipart files (> TB)
  - Support multilevel parts
  - Support blob splitters (rolling checksum, new line, etc)
- Remote storage
  - AWS, etc
  - Self-hosted HTTP CAS server (read-write)
- Integration with Git
  - Zero-copy fetch from Git (either remote or local)
  - LFS integration
- Integration with Docker
  - Zero-copy fetch of an image from Docker
  - Unpack FS images to CAS
  - Use containers in pipelines
- Integration with BitTorrent:
  - Store torrent files
  - Download torrent data directly to CAS
  - To consider: expose CAS as a peer
- Integration with other CAS systems:
  - Perkeep
  - Upspin
  - IPFS
- Windows and OSX support
- Better support for pipelines
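
The planned rolling-checksum blob splitter can be illustrated with a small sketch. Since that feature is not implemented yet, everything here is an assumption: the `Split` function, the 16-byte window, and the plain byte-sum checksum are chosen for brevity, where a real splitter would likely use a stronger rolling hash and enforce minimum and maximum chunk sizes.

```go
package main

import "fmt"

const (
	window = 16 // number of trailing bytes covered by the rolling sum
	mask   = 63 // cut when sum&mask == mask, roughly one cut per 64 bytes
)

// Split cuts data at content-defined boundaries: a chunk ends after byte i
// when the sum of the last `window` bytes has all mask bits set. A cut
// depends only on nearby bytes, so an edit early in the data does not move
// the cuts that follow it.
func Split(data []byte) [][]byte {
	var chunks [][]byte
	var sum uint32
	start := 0
	for i, b := range data {
		sum += uint32(b)
		if i >= window {
			sum -= uint32(data[i-window]) // slide the window forward
		}
		if i >= window-1 && sum&mask == mask {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing partial chunk
	}
	return chunks
}

func main() {
	// Deterministic pseudo-random sample data from a simple LCG.
	data := make([]byte, 4096)
	x := uint32(1)
	for i := range data {
		x = x*1103515245 + 12345
		data[i] = byte(x >> 16)
	}
	edited := append([]byte("inserted prefix "), data...)

	a, b := Split(data), Split(edited)
	fmt.Println("chunks:", len(a), len(b))
	fmt.Println("last chunk shared:", string(a[len(a)-1]) == string(b[len(b)-1]))
}
```

When each chunk is stored under its own hash, two versions of a large file that differ by a small edit share most of their chunks, which is what makes splitters useful for multipart files.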
