{"id":17152591,"url":"https://github.com/mkmik/imsy","last_synced_at":"2025-06-17T18:37:31.140Z","repository":{"id":45948114,"uuid":"187934150","full_name":"mkmik/imsy","owner":"mkmik","description":"simple incremental pull of immutable large files","archived":false,"fork":false,"pushed_at":"2022-06-21T08:04:32.000Z","size":18,"stargazers_count":10,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-05-27T13:54:59.053Z","etag":null,"topics":["cdc","incremental","large-files","rabin-fingerprint","replication"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mkmik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-22T00:36:42.000Z","updated_at":"2024-11-02T20:01:05.000Z","dependencies_parsed_at":"2022-09-20T12:44:30.188Z","dependency_job_id":null,"html_url":"https://github.com/mkmik/imsy","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mkmik/imsy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkmik%2Fimsy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkmik%2Fimsy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkmik%2Fimsy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkmik%2Fimsy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mkmik","download_url":"https://codeload.github.com/mkmik/imsy/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/mkmik%2Fimsy/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260419690,"owners_count":23006331,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cdc","incremental","large-files","rabin-fingerprint","replication"],"created_at":"2024-10-14T21:44:02.217Z","updated_at":"2025-06-17T18:37:26.109Z","avatar_url":"https://github.com/mkmik.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# imsy\n\n## overview\n\n`imsy` shows the underlying principle of a file replication\nmechanism suitable for large immutable files, such as VM images.\n\nThe core idea is to split a file into chunks using a Content Defined Chunking (CDC) mechanism,\nand save the chunks in a Content Addressed Store (CAS), where each chunk is identified by its hash (e.g. SHA256).\n\nThe file can then be fully recovered from the ordered list of hashes of its constituent chunks.\n\n## usage\n\nFirst, check out the repository and run `go build`.\n\nThen get hold of a couple of big files that are different but related, e.g. two VM images. Squashed uncompressed Docker images would work too. 
Let's call them `vm1.img` and `vm2.img`.\n\nThen run:\n\n```\n$ imsy -dir server1data prepare \u003cvm1.img\n72d21dabc5a57782eaad5745f968a58cd3f029c00897a4da5688e795256dac50\n```\n\nThis will fill `server1data` with chunks of `vm1.img`.\n\nNow we want to show how to pull this VM from another machine.\n\nOn the first machine, run:\n\n```\n$ imsy -dir server1data serve\n```\n\nOn the second machine (this also works on the same machine via localhost):\n\n```\n$ imsy -dir server2data -o vm1.img pull 72d21dabc5a57782eaad5745f968a58cd3f029c00897a4da5688e795256dac50\n```\n\nSince `server2data` on the second machine is empty, this will basically just pull the whole image.\nIt would have been cheaper to just serve the whole file via HTTP.\n\nBut now, on the first machine, you can run:\n\n```\n$ imsy -dir server1data prepare \u003cvm2.img\nff62efe2d8f6c4a3b488fafe9ec9046dee5d2fab5b0a5488506bb3af766eacff\n```\n\nWhen you pull that image on the second machine, you'll notice that only a small number of chunks actually get downloaded (look at the `imsy serve` log):\n\n```\n$ imsy -dir server2data pull ff62efe2d8f6c4a3b488fafe9ec9046dee5d2fab5b0a5488506bb3af766eacff\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmkmik%2Fimsy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmkmik%2Fimsy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmkmik%2Fimsy/lists"}