{"id":13393532,"url":"https://github.com/s3git/s3git","last_synced_at":"2025-04-12T14:57:27.575Z","repository":{"id":37821652,"uuid":"52785086","full_name":"s3git/s3git","owner":"s3git","description":"s3git: git for Cloud Storage. Distributed Version Control for Data. Create decentralized and versioned repos that scale infinitely to 100s of millions of files. Clone huge PB-scale repos on your local SSD to make changes, commit and push back. Oh yeah, it dedupes too and offers directory versioning.","archived":false,"fork":false,"pushed_at":"2016-08-02T00:27:55.000Z","size":101,"stargazers_count":1457,"open_issues_count":21,"forks_count":65,"subscribers_count":51,"default_branch":"master","last_synced_at":"2025-04-10T14:12:10.485Z","etag":null,"topics":["cloud-storage","decentralized","distributed","git","version-control"],"latest_commit_sha":null,"homepage":"http://s3git.org","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/s3git.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-02-29T11:06:44.000Z","updated_at":"2025-03-09T21:35:00.000Z","dependencies_parsed_at":"2022-08-19T15:41:09.818Z","dependency_job_id":null,"html_url":"https://github.com/s3git/s3git","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s3git%2Fs3git","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s3git%2Fs3git/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s3git%2Fs3git/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/s3git%2Fs3git/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/s3git","download_url":"https://codeload.github.com/s3git/s3git/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248586250,"owners_count":21128997,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloud-storage","decentralized","distributed","git","version-control"],"created_at":"2024-07-30T17:00:55.361Z","updated_at":"2025-04-12T14:57:27.546Z","avatar_url":"https://github.com/s3git.png","language":"Go","funding_links":[],"categories":["Go","Storage Server","git","Open Source Repos"],"sub_categories":["S3"],"readme":"s3git: git for Cloud Storage\u003cbr/\u003e(or Version Control for Data)\n==============================================================\n\n[![Join the chat at https://gitter.im/s3git/s3git](https://badges.gitter.im/s3git/s3git.svg)](https://gitter.im/s3git/s3git?utm_source=badge\u0026utm_medium=badge\u0026utm_campaign=pr-badge\u0026utm_content=badge)\n\ns3git applies the git philosophy to Cloud Storage. If you know git, you will know how to use s3git!\n\ns3git is a simple CLI tool that allows you to create a *distributed*, *decentralized* and *versioned* repository. It scales limitlessly to 100s of millions of files and PBs of storage and stores your data safely in S3. Yet huge repos can be cloned on the SSD of your laptop for making local changes, committing and pushing back.\n\nExactly like git, s3git does not require any server-side components, just download and run the executable. 
It imports the golang package [s3git-go](https://github.com/s3git/s3git-go) that can be used from other applications as well. Or see the [Python module](https://github.com/s3git/s3git-py) or [Ruby gem](https://github.com/s3git/s3git-rb).\n\nUse cases for s3git\n-------------------\n\n- Build and Release Management (see [example](https://github.com/s3git/s3git/blob/master/BINARY-RELEASE-MANAGEMENT.md) with all Kubernetes releases).\n- DevOps Scenarios\n- Data Consolidation\n- Analytics\n- Photo and Video storage\n\nSee [use cases](https://github.com/s3git/s3git/blob/master/USECASES.md) for a detailed description of these use cases.\n\nDownload binaries\n-----------------\n\n**DISCLAIMER: These are PRE-RELEASE binaries -- use at your own peril for now**\n\n### OSX\n\nDownload `s3git` from [https://github.com/s3git/s3git/releases/download/v0.9.2/s3git-darwin-amd64](https://github.com/s3git/s3git/releases/download/v0.9.2/s3git-darwin-amd64)\n\n```sh\n$ mkdir s3git \u0026\u0026 cd s3git\n$ wget -q -O s3git https://github.com/s3git/s3git/releases/download/v0.9.2/s3git-darwin-amd64\n$ chmod +x s3git\n$ export PATH=$PATH:${PWD}   # Add current dir where s3git has been downloaded to\n$ s3git\n```\n\n### Linux\n\nDownload `s3git` from [https://github.com/s3git/s3git/releases/download/v0.9.2/s3git-linux-amd64](https://github.com/s3git/s3git/releases/download/v0.9.2/s3git-linux-amd64)\n\n```sh\n$ mkdir s3git \u0026\u0026 cd s3git\n$ wget -q -O s3git https://github.com/s3git/s3git/releases/download/v0.9.2/s3git-linux-amd64\n$ chmod +x s3git\n$ export PATH=$PATH:${PWD}   # Add current dir where s3git has been downloaded to\n$ s3git\n```\n\n### Windows\n\nDownload `s3git.exe` from [https://github.com/s3git/s3git/releases/download/v0.9.1/s3git.exe](https://github.com/s3git/s3git/releases/download/v0.9.1/s3git.exe)\n\n```\nC:\\Users\\Username\\Downloads\u003e s3git.exe\n```\n\nBuilding from source\n--------------------\n\nBuild instructions are as follows (see [install 
golang](https://docs.minio.io/docs/how-to-install-golang) for setting up a working golang environment):\n\n```sh\n$ go get -d github.com/s3git/s3git\n$ cd $GOPATH/src/github.com/s3git/s3git \n$ go install\n$ s3git\n```\n\nBLAKE2 Tree Hashing and Storage Format\n--------------------------------------\n\nRead [here](https://github.com/s3git/s3git/blob/master/BLAKE2.md) how s3git uses the BLAKE2 Tree hashing mode for both [deduplicated](https://github.com/s3git/s3git/blob/master/BLAKE2.md#deduplicated) and [hydrated](https://github.com/s3git/s3git/blob/master/BLAKE2.md#hydrated) storage (and [here](https://github.com/s3git/s3git/blob/master/BLAKE2-and-Scalability.md) for info for BLAKE2 at scale).\n\nExample workflow\n----------------\n\nHere is a simple workflow to create a new repository and populate it with some data:\n```sh\n$ mkdir s3git-repo \u0026\u0026 cd s3git-repo\n$ s3git init\nInitialized empty s3git repository in ...\n$ # Just stream in some text\n$ echo \"hello s3git\" | s3git add\nAdded: 18e622875a89cede0d7019b2c8afecf8928c21eac18ec51e38a8e6b829b82c3ef306dec34227929fa77b1c7c329b3d4e50ed9e72dc4dc885be0932d3f28d7053\n$ # Add some more files\n$ s3git add \"*.mp4\"\n$ # Commit and log\n$ s3git commit -m \"My first commit\"\n$ s3git log --pretty\n```\n\nPush to cloud storage\n---------------------\n\n```sh\n$ # Add remote back end and push to it\n$ s3git remote add \"primary\" -r s3://s3git-playground -a \"AKIAJYNT4FCBFWDQPERQ\" -s \"OVcWH7ZREUGhZJJAqMq4GVaKDKGW6XyKl80qYvkW\"\n$ s3git push\n$ # Read back content\n$ s3git cat 18e6\nhello s3git\n```\n\n_Note: Do not store any important info in the s3git-playground bucket. It will be auto-deleted within 24-hours._\n \nDirectory versioning\n--------------------\n\nYou can also use s3git for directory versioning. This allows you to 'capture' changes coherently all the way down from a directory and subsequently go back to previous versions of the *full state of the directory* (and not just any file). 
Think of it as a Time Machine for directories instead of individual files.\n\nSo instead of 'saving' a directory by making a full copy into 'MyFolder-v2' (and 'MyFolder-v3', etc.) you capture the state of a directory and label that version with a meaningful message (\"Changed color to red\"), so it is always easy to go back to the version you are looking for.\n\nIn addition you can discard any uncommitted changes that you made and go back to the last version that you have captured, which basically means you can (after committing) mess around in a directory and rest assured that you can always go back to its original state.\n\nIf you push your repository into the cloud then you will have an automatic backup and additionally you can easily collaborate with other people.\n\nLastly, it of course works with huge binary data too, not just with text files as in the following 'demo' example:\n\n```sh\n$ mkdir dir-versioning \u0026\u0026 cd dir-versioning\n$ s3git init .\n$ # Just create a single file\n$ echo \"First line\" \u003e text.txt \u0026\u0026 ls -l\n-rw-rw-r-- 1 ec2-user ec2-user 11 May 25 09:06 text.txt\n$ #\n$ # Create initial snapshot\n$ s3git snapshot create -m \"Initial snapshot\" .\n$ # Add new line to initial file and create another file\n$ echo \"Second line\" \u003e\u003e text.txt \u0026\u0026 echo \"Another file\" \u003e text2.txt \u0026\u0026 ls -l\n-rw-rw-r-- 1 ec2-user ec2-user 23 May 25 09:08 text.txt\n-rw-rw-r-- 1 ec2-user ec2-user 13 May 25 09:08 text2.txt\n$ s3git snapshot status .\n     New: /home/ec2-user/dir-versioning/text2.txt\nModified: /home/ec2-user/dir-versioning/text.txt\n$ #\n$ # Create second snapshot\n$ s3git snapshot create -m \"Second snapshot\" .\n$ s3git log --pretty\n3a4c3466264904fed3d52a1744fb1865b21beae1a79e374660aa231e889de41191009afb4795b61fdba9c156 Second snapshot\n77a8e169853a7480c9a738c293478c9923532f56fcd02e3276142a1a29ac7f0006b5dff65d5ca245255f09fa Initial snapshot\n$ more text.txt\nFirst line\nSecond line\n$ more 
text2.txt\nAnother file\n$ #\n$ # Go back one version in time\n$ s3git snapshot checkout . HEAD^\n$ more text.txt\nFirst line\n$ more text2.txt\ntext2.txt: No such file or directory\n$ #\n$ # Switch back to latest revision\n$ s3git snapshot checkout .\n$ more text2.txt\nAnother file\n```\n\nNote that snapshotting works for all files in the directory including any subdirectories. Click the following link for a more elaborate repository that includes all releases of the [Kubernetes](https://github.com/s3git/s3git/blob/master/BINARY-RELEASE-MANAGEMENT.md) project.\n\nClone the YFCC100M dataset\n--------------------------\n\nClone a large repo with 100 million files totaling 11.5 TB in size ([Multimedia Commons](http://aws.amazon.com/public-data-sets/multimedia-commons/)), yet requiring only 7 GB local disk space.\n\n_(Note that this takes about **7 minutes** on an SSD-equipped MacBook Pro with 500 Mbit/s download connection so for less powerful hardware you may want to skip to the next section (or if you lack 7 GB local disk space, try a `df -h .` first). Then again it is quite a few files...)_\n\n```sh\n$ s3git clone s3://s3git-100m -a \"AKIAI26TSIF6JIMMDSPQ\" -s \"5NvshAhI0KMz5Gbqkp7WNqXYlnjBjkf9IaJD75x7\"\nCloning into ...\nDone. 
Totaling 97,974,749 objects.\n$ cd s3git-100m\n$ # List all files starting with '123456'\n$ s3git ls 123456\n12345649755b9f489df2470838a76c9df1d4ee85e864b15cf328441bd12fdfc23d5b95f8abffb9406f4cdf05306b082d3773f0f05090766272e2e8c8b8df5997\n123456629a711c83c28dc63f0bc77ca597c695a19e498334a68e4236db18df84a2cdd964180ab2fcf04cbacd0f26eb345e09e6f9c6957a8fb069d558cadf287e\n123456675eaecb4a2984f2849d3b8c53e55dd76102a2093cbca3e61668a3dd4e8f148a32c41235ab01e70003d4262ead484d9158803a1f8d74e6acad37a7a296\n123456e6c21c054744742d482960353f586e16d33384f7c42373b908f7a7bd08b18768d429e01a0070fadc2c037ef83eef27453fc96d1625e704dd62931be2d1\n$ s3git cat cafebad \u003e olympic.jpg\n$ # List and count total nr of files\n$ s3git ls | wc -l\n97974749\n```\n\nFork that repo\n--------------\n\nBelow is an example for `alice` and `bob` working together on a repository.\n\n```sh\n$ mkdir alice \u0026\u0026 cd alice\nalice $ s3git clone s3://s3git-spoon-knife -a \"AKIAJYNT4FCBFWDQPERQ\" -s \"OVcWH7ZREUGhZJJAqMq4GVaKDKGW6XyKl80qYvkW\"\nCloning into .../alice/s3git-spoon-knife\nDone. Totaling 0 objects.\nalice $ cd s3git-spoon-knife\nalice $ # add a file filled with zeros\nalice $ dd if=/dev/zero count=1 | s3git add\nAdded: 3ad6df690177a56092cb1ac7e9690dcabcac23cf10fee594030c7075ccd9c5e38adbaf58103cf573b156d114452b94aa79b980d9413331e22a8c95aa6fb60f4e\nalice $ # add 9 more files (with random content)\nalice $ for n in {1..9}; do dd if=/dev/urandom count=1 | s3git add; done\nalice $ # commit\nalice $ s3git commit -m \"Commit from alice\"\nalice $ # and push\nalice $ s3git push\n```\n\nClone it again as `bob` on a different computer/different directory/different universe:\n \n```sh\n$ mkdir bob \u0026\u0026 cd bob\nbob $ s3git clone s3://s3git-spoon-knife -a \"AKIAJYNT4FCBFWDQPERQ\" -s \"OVcWH7ZREUGhZJJAqMq4GVaKDKGW6XyKl80qYvkW\"\nCloning into .../bob/s3git-spoon-knife\nDone. 
Totaling 10 objects.\nbob $ cd s3git-spoon-knife\nbob $ # Check if we can access the zero-filled file\nbob $ s3git cat 3ad6 | hexdump\n00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00\n*\n00000200\nbob $ # add another 10 files\nbob $ for n in {1..10}; do dd if=/dev/urandom count=1 | s3git add; done\nbob $ # commit\nbob $ s3git commit -m \"Commit from bob\"\nbob $ # and push back\nbob $ s3git push\n```\n\nSwitch back to `alice` again to pull the new content:\n\n```sh\nalice $ s3git pull\nDone. Totaling 20 objects.\nalice $ s3git log --pretty\n3f67a4789e2a820546745c6fa40307aa490b7167f7de770f118900a28e6afe8d3c3ec8d170a19977cf415d6b6c5acb78d7595c825b39f7c8b20b471a84cfbee0 Commit from bob\na48cf36af2211e350ec2b05c98e9e3e63439acd1e9e01a8cb2b46e0e0d65f1625239bd1f89ab33771c485f3e6f1d67f119566523a1034e06adc89408a74c4bb3 Commit from alice\n```\n\n_Note: Do not store any important info in the s3git-spoon-knife bucket. It will be auto-deleted within 24 hours._\n\nHere is a nice screen recording:  \n\n[![asciicast](https://asciinema.org/a/40210.png)](https://asciinema.org/a/40210)\n\nHappy forking!\n\nYou may be wondering about concurrent behaviour from \n\nIntegration with Minio\n----------------------\n\nInstead of S3 you can happily use the [Minio](https://github.com/minio/minio) server, for example the public server at https://play.minio.io:9000. 
Just make sure you have a bucket created using [mc](https://github.com/minio/mc) (example below uses `s3git-test`):\n\n```sh\n$ mkdir minio-test \u0026\u0026 cd minio-test\n$ s3git init \n$ s3git remote add \"primary\" -r s3://s3git-test -a \"Q3AM3UQ867SPQQA43P2F\" -s \"zuf+tfteSlswRu7BJ86wekitnifILbZam1KYY3TG\" -e \"https://play.minio.io:9000\"\n$ echo \"hello minio\" | s3git add\nAdded: c7bb516db796df8dcc824aec05db911031ab3ac1e5ff847838065eeeb52d4410b4d57f8df2e55d14af0b7b1d28362de1176cd51892d7cbcaaefb2cd3f616342f\n$ s3git commit -m \"Commit for minio test\"\n$ s3git push\nPushing 1 / 1 [==============================================================================================================================] 100.00 % 0\n```\n\nand clone it \n\n```sh\n$ s3git clone s3://s3git-test -a \"Q3AM3UQ867SPQQA43P2F\" -s \"zuf+tfteSlswRu7BJ86wekitnifILbZam1KYY3TG\" -e \"https://play.minio.io:9000\"\nCloning into .../s3git-test\nDone. Totaling 1 object.\n$ cd s3git-test/\n$ s3git ls\nc7bb516db796df8dcc824aec05db911031ab3ac1e5ff847838065eeeb52d4410b4d57f8df2e55d14af0b7b1d28362de1176cd51892d7cbcaaefb2cd3f616342f\n$ s3git cat c7bb\nhello minio\n$ s3git log --pretty\n6eb708ec7dfd75d9d6a063e2febf16bab3c7a163e203fc677c8a9178889bac012d6b3fcda56b1eb160b1be7fa56eb08985422ed879f220d42a0e6ec80c5735ea Commit for minio test\n```\n\nContributions\n-------------\n\nContributions are welcome! 
Please see [`CONTRIBUTING.md`](CONTRIBUTING.md).\n\nKey features\n------------\n\n * **Easy:** Use a workflow and syntax that you already know and love\n\n * **Fast:** Lightning fast operation, especially on large files and huge repositories\n\n * **Infinite scalability:** Stop worrying about maximum repository sizes and have the ability to grow indefinitely\n\n * **Work from local SSD:** Make a huge cloud disk appear like a local drive\n\n * **Instant sync:** Push local changes and pull down instantly on other clones\n\n * **Versioning:** Keep previous versions safe and have the ability to undo or go back in time\n\n * **Forking:** Ability to make many variants by forking\n\n * **Verifiable:** Be sure that you have everything and that nothing has been tampered with (“data has not been messed with”)\n\n * **Deduplication:** Do not store the same data twice\n\n * **Simplicity:** Simple by design, providing one way to accomplish tasks\n\nCommand Line Help\n-----------------\n\n```\n$ s3git help\ns3git applies the git philosophy to Cloud Storage. 
If you know git, you will know how to use s3git.\n\ns3git is a simple CLI tool that allows you to create a distributed, decentralized and versioned repository.\nIt scales limitlessly to 100s of millions of files and PBs of storage and stores your data safely in S3.\nYet huge repos can be cloned on the SSD of your laptop for making local changes, committing and pushing back.\n\nUsage:\n  s3git [command]\n\nAvailable Commands:\n  add         Add stream or file(s) to the repository\n  cat         Read a file from the repository\n  clone       Clone a repository into a new directory\n  commit      Commit the changes in the repository\n  init        Create an empty repository\n  log         Show commit log\n  ls          List files in the repository\n  pull        Update local repository\n  push        Update remote repositories\n  remote      Manage remote repositories\n  snapshot    Manage snapshots\n  status      Show changes in repository\n\nFlags:\n  -h, --help[=false]: help for s3git\n\nUse \"s3git [command] --help\" for more information about a command.\n```\n\nLicense\n-------\n\ns3git is released under the Apache License v2.0. You can find the complete text in the file LICENSE.\n\nFAQ\n---\n\n**Q** Is s3git compatible with git at the binary level?  \n**A** No. git is optimized for text content, with powerful diffing and compressed storage, whereas s3git is more focused on large repos with primarily non-text blobs backed up by cloud storage like S3.  \n**Q** Do you support encryption?  \n**A** No. However, it is trivial to encrypt data before streaming it into `s3git add`, e.g. pipe it through `openssl enc` or similar.  \n**Q** Do you support zipping?  \n**A** No. Again, it is trivial to zip data before streaming it into `s3git add`, e.g. pipe it through `zip -r - .` or similar.  \n**Q** Why don't you provide a FUSE interface?  \n**A** Supporting FUSE would mean introducing a lot of complexity related to POSIX which we would rather avoid.  
\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs3git%2Fs3git","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fs3git%2Fs3git","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs3git%2Fs3git/lists"}